Skip to main content
  • Loading metrics

The Mutational Robustness of Influenza A Virus


A virus’ mutational robustness is described in terms of the strength and distribution of the mutational fitness effects, or MFE. The distribution of MFE is central to many questions in evolutionary theory and is a key parameter in models of molecular evolution. Here we define the mutational fitness effects in influenza A virus by generating 128 viruses, each with a single nucleotide mutation. In contrast to mutational scanning approaches, this strategy allowed us to unambiguously assign fitness values to individual mutations. The presence of each desired mutation and the absence of additional mutations were verified by next generation sequencing of each stock. A mutation was considered lethal only after we failed to rescue virus in three independent transfections. We measured the fitness of each viable mutant relative to the wild type by quantitative RT-PCR following direct competition on A549 cells. We found that 31.6% of the mutations in the genome-wide dataset were lethal and that the lethal fraction did not differ appreciably between the HA- and NA-encoding segments and the rest of the genome. Of the viable mutants, the fitness mean and standard deviation were 0.80 and 0.22 in the genome-wide dataset and best modeled as a beta distribution. The fitness impact of mutation was marginally lower in the segments coding for HA and NA (0.88 ± 0.16) than in the other 6 segments (0.78 ± 0.24), and their respective beta distributions had slightly different shape parameters. The results for influenza A virus are remarkably similar to our own analysis of CirSeq-derived fitness values from poliovirus and previously published data from other small, single stranded DNA and RNA viruses. These data suggest that genome size, and not nucleic acid type or mode of replication, is the main determinant of viral mutational fitness effects.

Author Summary

Like other RNA viruses, influenza virus has a very high mutation rate. While high mutation rates may increase the rate at which influenza virus will adapt to a new host, acquire a new route of transmission, or escape from host immune surveillance, data from model systems suggest that most new viral mutations are either lethal or highly detrimental. Mutational robustness refers to the ability of a virus to tolerate, or buffer, these mutations. The mutational robustness of a virus will determine which mutations are maintained in a population and may have a greater impact on viral evolution than mutation rate. We defined the mutational robustness of influenza A virus by measuring the fitness of a large number of viruses, each with a single point mutation. We found that the overall robustness of influenza was similar to that of poliovirus and other viruses of similar size. Interestingly, mutations appeared to be more easily accommodated in hemagglutinin and neuraminidase than elsewhere in the genome. This work will inform models of influenza evolution at the global and molecular scale.


The predictable burden of seasonal influenza and the unpredictability of the next pandemic are attributable in large part to the rapid evolution of influenza virus [14]. Like other RNA viruses, influenza viruses replicate with extremely low fidelity, with a mutation rate of roughly 2 x 10−5 substitutions per nucleotide copied per cellular infection [57]. Influenza viruses also undergo reassortment of their genomic segments, a combinatorial exchange of genetic material analogous to recombination in other RNA viruses [8,9]. Together, low replicative fidelity and frequent reassortment allow influenza virus populations to generate significant diversity. This capacity may allow influenza viruses to maintain, or to quickly generate, the requisite mutations that mediate cross species transmission, escape from neutralizing antibody, or drug resistance [10].

The focus on mutation as a driving force in viral evolution has tended to downplay the tremendous fitness cost of mutation [11,12]. Here, we define viral fitness as the capacity of an individual, or population, to generate infectious progeny. Most mutations have deleterious effects on fitness, which suggests that mutational tolerance may play a significant role in determining the genetic diversity that can be maintained within a population [13]. Mutational robustness refers to phenotypic stability in the face of mutation [1416]. High mutation rates select for increased mutational robustness [17], and a more robust population can increase its genetic diversity without a dramatic alteration in mean fitness. A virus’ intrinsic robustness may influence its fitness in vitro and virulence in vivo [18].

The impact of individual mutations on viral fitness is the mutational fitness effect (MFE). A virus’ mutational robustness is described in terms of the strength and distribution of the MFE [19]. Together with mutation rate, the MFE governs many aspects of evolution including: the relative importance of selection vs. genetic drift, the efficiency of selection and adaptation, the impact of recombination (or reassortment), and the role of epistasis in fixing new and beneficial mutations. It is therefore essential for accurate models of molecular evolution [20,21]. A virus’ sensitivity to mutation may also determine the effectiveness of lethal mutagenesis [2224].

Early studies of mutational fitness effects relied on mutation accumulation (MA) experiments, where the imposition of extreme bottlenecks propagates and fixes newly generated mutations by drift as opposed to selection [2529]. An alternative approach commonly used in RNA viruses is to accelerate mutation accumulation by passaging virus in the presence of mutagenic drugs [3032]. While both methods provide valuable information about the average fitness impact of random mutation, uncertainty about the number of mutations per clone and the fitness effect of each individual mutation makes it difficult to accurately model a distribution. These issues also complicate more recent, high throughput methods based on next generation sequencing [3335]. Furthermore, none of these approaches provide accurate estimates of the fraction of mutations that are lethal to the virus.

A less exhaustive, but more controlled assay for MFE is to measure the fitness of a set of viral clones, each with a randomly selected single nucleotide substitution. This approach unambiguously assigns fitness values, or selection coefficients, to individual mutations. It is also highly quantitative and allows for estimation of the lethal fraction. The first such study, of vesicular stomatitis virus (VSV), found that over 90% of random single nucleotide mutations reduce replicative fitness and 40% are lethal in this negative sense RNA virus [13]. Subsequent work by Sanjuan, Elena, and colleagues found somewhat similar distributions of MFE in f1 (ssDNA phage), phiX 174 (ssDNA phage), QB (+ssRNA phage), and tobacco etch virus (+ssRNA virus) [19,3640]. Together these data suggest that despite their differences in genome organization and replication strategy, ssRNA and ssDNA viruses are equally sensitive to mutation.

Despite their importance to pathogen evolution, all available genome-wide studies of viral MFE have been performed in evolutionary model systems. Here we characterize the distribution of MFE in influenza A virus, a segmented negative sense RNA virus whose evolutionary dynamics are important to global health. We used site directed mutagenesis to generate a library of plasmids encoding 128 influenza A viruses, each with a single point mutation in the A/WSN33/H1N1 genetic background. The large number of mutants allowed us to define the MFE across the genome and to compare the lethal fraction and MFE between segments coding for the surface proteins to those coding for the internal proteins. We find that the MFE of influenza A are remarkably similar to those of other viruses with varying genome structure. While similar proportions of mutations in the surface proteins and internal proteins were lethal, the average impact of mutation appeared to be less deleterious in HA and NA. Our results suggest that the size and compactness of a virus’ genome, and not necessarily the genomic nucleic acid or mode of replication, are major determinants of its intrinsic mutational robustness.


Our primary goal was to determine the distribution of mutational fitness effects across the influenza A genome. We generated all single nucleotide mutations in the commonly used laboratory strain, A/WSN/33/H1N1, hereafter referred to as WSN33 or wild type (WT) [41]. In WSN33, the 8 genomic segments range in size from 0.9 to 2.3 kb. We planned to make 149 mutants and grouped them into two libraries–“genome-wide” and “comparison”–of pre-specified size and composition. Of the 149 total mutations we attempted (S1 Table), we successfully generated 128 (86%). For our “genome-wide” library, we reasoned that in order to achieve an unbiased distribution of mutations throughout the genome, our library should contain a number of mutations on each segment that is proportional to the size of each segment—PB2 17.2%, PB1 17.2%, PA 16.4%, HA 13.1%, NP 11.5%, NA 10.4%, M 7.5%, NS 6.5%, of the total genome respectively. We used a custom R script to choose randomly the nucleotide position and substitution type in accordance with this distribution. For our unbiased genome-wide analysis (n = 95), we generated 14 (14.7%) PB2, 16 (16.8%) PB1, 16 (16.8%) PA, 14 (14.7%) HA, 12 (12.6%) NP, 10 (10.5%) NA, 8 (8.4%) M and 5 (5.3%) NS mutations. Though we failed to generate 14% of attempted mutations, this did not significantly alter the distribution of the mutations across the eight segments (S1A Fig).

Viral surface proteins that are targeted by the immune system often exhibit greater sequence diversity than internal structural and enzymatic proteins. In many cases, the relationship of this diversity to the intrinsic mutational robustness of the genes encoding these surface proteins is unknown. Therefore, a secondary goal of our study was to compare the distribution of mutational fitness effects for the HA- and NA-encoding segments to the other 6 segments, which code for internal proteins. We improved our power to detect a difference by generating an additional 18 HA and 15 NA mutations. In this aggregate “comparison” library (n = 128), 45% (n = 57) of our library is contained on segments coding for HA and NA and 55% (n = 71) on the other 6 segments. (S1B Fig). The entire data set includes 38 transition mutations and 90 transversion mutations, which is in the range of what one would expect by chance (Fisher test, p = 0.75).

Identification of lethal mutations

Previous work indicates that a substantial, but varying proportion, of mutations in RNA viruses are lethal. Accurate assessment of this lethal fraction is an essential, and non-trivial, task. In a reverse genetic approach, the efficiency of transfection and viral recovery are key parameters, since a failed transfection and a lethal mutation will both give supernatants with undetectable titers. We used two approaches to address this problem. In the first, we estimated the transfection failure rate and calculated the probability of miscalling a viable mutant as lethal. Over the course of this study, we transfected the WT virus 19 individual times, and never failed to recover virus in our P0 supernatants. As in [13], we quantified the expected probability of a transfection failure as if we had observed a single WT transfection failure. This conservative approach gave a per transfection failure rate of < 5.26%. Because we attempted three independent transfections for each candidate lethal, the likelihood of repeated transfection failure is 0.05263, or 1.46 x10-4. In a set of 128 viruses, we would expect to falsely identify fewer than one (128 * 1.46 x 10−4 = 0.019) virus as lethal. This probabilistic model assumes equal transfection efficiency for WT and mutant viruses. We also tested one of our mutants, PB2_14 (PB2 C532A), which had moderately reduced fitness (0.81, see below). Here too, we successfully recovered virus in 19 individual transfections.

We also observed several cases of mutants with undetectable titers at P0 and a moderate P1 titer after blind passage of the supernatant from transfected cells. We characterized these P1 stocks by next generation sequencing, and in all cases, found either reversion of the introduced mutation or contamination of the culture by other viruses that were transfected or passaged on the same day. We therefore considered a mutation to be lethal if we had undetectable virus in three independent transfections or if we were unable to recover the mutation in a P1 stock after blind passage of the P0 stock. Using these criteria, there were 30 lethal mutations (31.6%) in the genome-wide dataset and 38 lethal mutations in the combined dataset (29.7%, Table 1). The lethal fraction did not differ appreciably between either the HA/NA encoding segments and the rest of the genome (16/57 vs. 22/71; Fisher exact test, p = 0.85) or the HA encoding segment and the rest of the genome (7/32 vs. 31/96; Fisher exact test, p = 0.37). We had 62% to detect a two fold difference in the lethal fraction for HA/NA and 52% power to detect a two fold difference in the lethal fraction for HA. No synonymous mutations were lethal, but 2 non-coding mutations and a stop codon loss were lethal.

Clonality of viable mutants

As in other studies of viral mutational fitness effects, we measured the fitness of P1 viral stocks rather than the initial transfection supernatant. Given the error prone replication of RNA viruses, we considered it possible that second-site mutations might accumulate in the short time between transfection and completion of the first passage. This might be exaggerated in the less fit viruses, since there would be strong positive selection of a compensatory mutation. If a compensatory mutation swept through the population, it would confound assignment of a fitness value to the initial mutation. Therefore, we sequenced the entire genome of the P1 stocks for 78 of our mutants on the Illumina platform to identify any second site mutations and their frequency. In all but 3 cases, the desired mutation was present at >94%, and was present at >99% in nearly all. Very few additional mutations were identified at >2% (Table 2, S2 Table). We relied on Sanger sequencing alone to confirm the desired mutation in replicate stocks of the remaining 13 viable viruses. We conclude that these stocks are essentially clonal and that the fitness of these populations will reflect the impact of each individual mutation.

Measurement of viral fitness

In prior studies of MFE, fitness has been measured as the difference in exponential growth rates for WT and mutant strains, measured either in parallel or in direct competition[19,36,37,39,42]. Given the relative imprecision of one step growth curves for quantification of growth parameters in influenza and many other viruses, we measured the relative fitness of each mutant in direct competition with the WT over serial passage [18]. In this assay, the mutant is competed with a tagged WT reference, which has a cluster of synonymous mutations in the PB1 open reading frame. These mutations allowed us to distinguish the barcoded WT from a non-barcoded mutant in a mixed infection using quantitative reverse transcription PCR with primers specific for their respective sequences. Importantly, this tagged WT virus competed equally well with the untagged WT virus over 6 passages, demonstrating the selective neutrality of the marker (Fig 1A). In our serial passage competition assay, the change in relative frequency of the WT and mutant over time is the difference in growth rate, or the selection rate constant (Fig 1B). The exponent of this value is the relative fitness. We have shown previously that this assay can provide precise measurements of relative fitness with as few as 3 technical replicates [18], although it is less sensitive for weakly deleterious or beneficial mutations with a relative fitness close to 1.

Fig 1. Direct competition assay for relative fitness.

(A) Equal infectious units of a barcoded version of the WT were competed against WT at an moi of 0.01, and the amount of each virus at each passage was compared to the input by RT-qPCR as described in the methods. The slope of the regression of the difference in the log10 change in ratio for each virus over time is the fitness. The assay was performed in triplicate and the slopes of the three lines are 0.007, 0.129, and 0.007, which corresponds to a fitness of 1.02 ± 0.008 for the barcoded virus relative to WT. (B) Sample data for two single nucleotide mutants. Each was competed against the barcoded WT as in (A) and relative fitness measured as calculated in the methods. One replicate each of NA-18 (circles, fitness = 0.64) and HA-40 (squares, fitness = 1.02) is shown. Note that we fit our regressions through passages 1–4 and excluded P0 as slight deviations from a 1:1 ratio of the two viruses in the inoculum can skew the slope when fit through this data point.

We performed our competition assays in A549, a cancerous human lung epithelial cell line, which supported efficient replication of WSN33 and may be more physiologically relevant to influenza virus replication than other commonly used lines. Passages were carried out at a multiplicity of 0.01 infectious units (TCID50) per cell. Given that the burst size of influenza A in A549 cells is <100 per cell, most passages will represent 2 cellular infection cycles with minimal co-infection in the second cycle. We note that this moi corresponded to just 10,000 infectious units at each passage, and we cannot exclude that genetic drift could lead to some variability in the fitness measurements with a transfer population of this size [43].

Distribution of mutational fitness effects

We were able to measure the fitness of 89 out of the 90 viable mutants. Despite repeated attempts with multiple stocks, we were unable to obtain data for HA-8. In competitions with this mutant, we consistently saw large reductions in titer of both HA-8 and the wild type after a single passage, perhaps due to dominant negative effects or the impact of defective interfering particles. With the exception of PB1-6, which was measured twice, all fitness values are based on 3 replicate competition assays, and the mean fitness and standard deviation are reported in Table 3 (see also Figs 2 and 3). Seventy-one of the mutations in our library were nonsynonymous, and these viruses had relative fitness values ranging from 0.26 to 1.12 (mean 0.80, standard deviation 0.20). Nearly all of the 18 viruses with synonymous mutations exhibited a fitness close to 1 (mean 0.93, standard deviation 0.21). The sole exception was PA-6/G240A, which is synonymous in the canonical PA open reading frame (Leu72). Neither this mutation, nor any of the other PA mutations were within the alternate PA-X reading frame [44]. Three mutations affected proteins in two different reading frames: PB1 A251C (fitness 0.45), NS G648U (fitness 0.92), and NS G650C (fitness 0.78). Three out of the 4 nonsense mutations were lethal (see Table 1), the exception was NS-5/G648T (NS1 E208stop, NEP M50I), which had a fitness of 0.92. Consistent with the important role of 5’ and 3’ translated regions in RNA synthesis and packaging, 2 out of the 3 non-coding mutations were lethal [45]. The 3rd was HA-8 (see above).

Fig 2. Location and fitness for all mutations.

Each mutation in Tables 1 and 3 is shown in its reading frame(s) with substitution type (nonsynonymous, synonymous, or noncoding) and fitness (see legend).

Fig 3. Histograms and cumulative distribution functions of influenza A virus mutational fitness effects.

Data are shown for all single nucleotide mutants (A, n = 128), the randomly selected genome-wide dataset (B, n = 95), the HA and NA dataset (C, n = 57), and the “internal” dataset (D, n = 71). Relative fitness values, bin width 0.1, are shown on the x-axis, and number of mutations in each histogram bar (left) and percent in cumulative distribution (right) are shown on the y-axes. The cumulative distribution functions show only the viable mutations (fitness > 0).

HA mutational fitness effects in head and stem regions

We mapped our HA mutants to the protein structure in order to observe structural patterns in HA fitness effects (Fig 4). Similar to previous studies on mutational fitness effects in influenza A hemagglutinin [35,46], we observed that the HA head seemed to tolerate mutations better than the stem domain. The mutants on the HA1 head domain had an average fitness of 0.77 and a lethal fraction of 2/14. Mutants on the HA2 stem region had an average fitness of 0.56 and a lethal fraction of 4/11. While suggestive, these differences in mean fitness did not achieve statistical significance (p = 0.258, Mann Whitney U test). Two mutations were located in one of the four H1N1 antigenic sites [47], K170N (Sa, fitness 1.01 ± 0.12) and S206R (Sb, fitness 0.72 ± 0.14).

Fig 4. Fitness impact of mutations on HA.

Mutations were placed onto a structural model of the hemagglutinin protein (PBD 1RVX). Shown are mutations on the head and stem regions, HA1 and HA2. Non-coding mutations (HA-8) and mutations on the signal peptide (HA-11), splice site (HA-1), and transmembrane domain (HA-40, HA-15, HA-10, HA-20) are not shown. Mutations are color coded as follows according to their relative fitness: lethal mutations are red, 0.6–0.8 are orange, 0.8–1.0 are yellow, and 1.0–1.2 are green. We found no HA mutations with a fitness between 0–0.6.

Fitting mutational fitness effects to probability density functions

We compared the distributions of fitness effects by plotting the fitness of all mutants as histograms and overlaying fitness values for the viable mutants as cumulative distribution functions. We show distributions for the total dataset (n = 128, Fig 3A), the randomly selected genome-wide dataset (n = 95, Fig 3B), the expanded HA/NA dataset (n = 57, Fig 3C), and the comparison “internal” 6 segment dataset (n = 71, Fig 3D). As above, the lethal fraction in each group was ~30%. Of the viable mutants, 47 out of 90 were weakly deleterious or neutral (fitness 0.85–1.05). Seven were weakly beneficial (fitness 1.05–1.15). We are conservative in classifying weakly deleterious and weakly beneficial mutations, since the fitness values for many of these mutants were not statistically different from 1. The mutations in 35 of the 90 viable mutants were more deleterious (fitness < 0.85). The mean and standard deviation for the viable mutants were: 0.82 ± 0.21 in the total, 0.80 ± 0.22 in the genome-wide, 0.88 ± 0.16 in the HA/NA, and 0.78 ± 0.24 in the internal datasets, respectively. While the HA/NA and internal datasets had a similar lethal fraction (Table 1 and associated text), the fitness impact in the viable fraction tended to be lower in the HA/NA-encoding segments relative to the internal segment group, but did not achieve statistical significance (p = 0.08, Mann Whitney U test).

The fitness values of the viable mutants were not distributed normally, and we therefore determined which type of distribution best fit the data. We modeled our empiric data on the deleterious mutations against exponential, gamma, beta, Weibull, and lognormal predictions and determined the best fit based on their Akaike information criteria (AIC, Table 4). For each dataset, the beta model was the best at capturing the distribution of fitness effects. We also fit distributions from a large, recently published dataset of poliovirus mutants [48]. Here, the beta model also provided the best fit. These beta models are described by two shape parameters, α and β, in which the expected value of the beta distribution is α/(α+β). The α and β parameters for each of the random, surface, and internal influenza MFE datasets as well as the poliovirus MFE dataset are similar (Table 4). Some have suggested that deleterious mutational fitness effects may be better explained by more complex models [49]. We therefore attempted to combine the above distributions with uniform distributions, in which we allowed the latter to describe a proportion of our mutants’ fitness effects. However, in the cases tested, adding this uniform distribution did not improve the fit of our models. While the influenza A and poliovirus mutational fitness effects were both best described by a beta distribution, there was no similar consensus among previously characterized viruses [19]. VSV was best described by a lognormal + uniform distribution, TEV by a beta distribution, phiX174 by an exponential distribution, QB by a gamma distribution, and F1 by a log normal distribution. In most of these fits, the AIC of the beta distribution did not differ appreciably from the AIC of the best-fit probability density function.

Fitness in alternate genetic backgrounds

All of our mutations were generated in the context of WSN33, a lab-adapted H1N1 strain. Recent work suggests that epistasis is quite prevalent across the influenza virus genome, and we therefore considered it likely that the same mutations would have distinct fitness effects in other genetic backgrounds [5053]. To understand the degree to which our fitness measurements for specific WSN33 mutations are generalizable, we compared our data to those obtained for analogous mutants in WSN33 and other genetic backgrounds. Bloom and colleagues have used deep mutational scanning (DMS) to broadly sample most of the possible amino acid substitutions in the HA and nucleoprotein open reading frames [35,46,54,55]. Deep sequencing of mutant libraries before and after passage was used to infer site preferences for every amino acid at every site and to calculate site entropy, or inherent tolerance of substitution at a given site. Site entropy will tend to capture the general mutability of a position, while site preference is likely to be more specific for a mutation in a given genetic background. Importantly for our comparison, many of the plasmids had multiple mutations and the site preferences represent the average effect of a mutation in the background of very similar, but distinct, sequences.

We first compared our 43 HA point mutants to DMS data on HA from WSN33 [35,46]. Here, we observed a statistically significant, and reasonably strong, correlation between our fitness values and site entropy for nonsynonymous substitutions (Fig 5A, Spearman r = 0.56, p = 0.0018). Consistent with the fact that both datasets were generated in the WSN33 background, the correlation between the fitness of viruses with our substitutions (nonsynonymous and synonymous) and the site preference for the same substitution was similar (Spearman r = 0.66, p < 0.001). It isn’t clear the degree to which the modest correlations for site entropy and site preference are attributable to differences in experimental set-up or the scale and error of measurement for two very different datasets. Of note, Thyagarajan and Bloom observed a similarly modest correlation (Pearson r = 0.48, p < 10−10) between their DMS data on WSN33 and a DMS separate study by Wu et al. [33], see discussion in [35].

Fig 5. Correlation of fitness values with site entropy and preference.

(A) Fitness of nonsynonymous HA mutants vs. site entropy (top) and fitness of all HA mutants vs. site preference (bottom) as reported in [46]. Unscaled values are shown. Correlations were similar for scaled values, see S3 Table. (B) Fitness of nonsynonymous NP mutants vs. site entropy (top) and fitness of all NP mutants vs. site preference (bottom) as reported for PR8 in [55]. (C) Fitness of nonsynonymous NP mutants vs. site entropy (top) and fitness of all NP mutants vs. site preference (bottom) as reported for A/Aichi/1968 (H3N2) in [55]. Note that scale in (A) for site preference is different as there were synonymous mutants in this dataset, but not in the NP datasets.

We next compared the fitness of our 12 NP mutants to DMS data on NP from influenza A/Puerto Rico/8/1934 (H1N1, PR8) and A/Aichi/1968 (H3N2) [54,55]. In PR8, a closely related lab strain, we found a stronger correlation between fitness and site entropy (Fig 5B, Spearman r = 0.79, p = 0.0037). The correlation between WSN33 fitness and PR8 site preference was weaker (Spearman r = 0.60) but did achieve statistical significance (p = 0.0450). Interestingly, our WSN33 fitness values were correlated with site entropy in the more distantly related Aichi H3N2 strain (Fig 5C, Spearman r = 0.59, p = 0.0465). The correlation with site preference was weaker for the WSN33-Aichi comparison than the WSN33-PR8 comparison (Spearman r = 0.26) and did not achieve significance (p = 0.4031). Together, our fitness values are reasonably correlated with site entropy in distinct strains and subtypes, and these data suggest that a sizeable portion of mutational fitness effects are attributable to the general mutational tolerance at a given site. The observation that fitness was correlated with site preference for HA in WSN33 and NP in PR8 (H1N1), but not in Aichi (H3N2) suggests that the fitness impact of these mutations will vary with genetic background due to intragenic or intergenic epistasis. We are cautious in this interpretation given the differences in the underlying experimental design, sample size, and nature of the datasets, as outlined above. Given that Doud and Bloom found general conservation in site preference between PR8 and A/Aichi/1968, it is possible that our differences could be due to the type and number of amino acid substitutions in our smaller NP dataset.

We further explored the generalizability of our MFE data by querying the influenza research database for H1N1 sequences containing any of our nonsynonymous mutations. We reasoned that of our WSN33 amino acid substitutions, only those with weakly deleterious, neutral, or beneficial fitness effects would be observed in circulating strains. Indeed, we found only two mutations, both in HA, that were present at reasonably high frequencies in more recent, circulating strains (S4 Table, S2 Fig). One was weakly deleterious, Q381L (fitness 0.93 ± 0.06, present in 78.7%) and the other was beneficial, HA M420I (fitness 1.12 ± 0.07, present at 79.7%). Consistent with what one would expect, our deleterious mutations had extremely low prevalence in the database. The prevalence of only two other non-synonymous mutations was above 0.1%, HA N101I (fitness 0.97 ± 0.06, present in 0.247%) and NA D135N (lethal, present in 0.173%). The significance of these mutants in circulating strains is unclear as they represent a relatively small number of sequences in the entire database (54/21,805 and 28/16187, respectively).

Comparison to other viruses

Sanjuan, Elena, and colleagues have used a similar approach to characterize the mutational fitness effects for 5 viruses: VSV, TEV, QB, phiX174, and F1 [13,19,37,40,56]. This set of viruses includes viruses with both DNA and RNA genomes (among which the authors found similar distribution patterns), and influenza is the first segmented virus to be described by this method. We compared the lethal fraction, mean, and variance of our mutational fitness effects to these previously characterized viruses. Because relative fitness values in VSV, QB, phiX174, and F1 were measured as differences in exponential growth rate, we first transformed our serial passage fitness values based on the wild type exponential growth rate in our experimental conditions (S5 Table). A similar transformation has been described for TEV to allow comparison of exponential growth rates across systems (see Methods and [19]). Overall, we found that the lethal fraction and our scaled fitness values closely matched those for these other small DNA and RNA viruses. We also calculated the skewness and kurtosis for each data set to quantitatively describe the shape of the distribution (Table 5). Following the general property that mutations are more likely to be deleterious than beneficial, we found negative values for skewness in each of our data sets. Kurtosis, measuring the “peakness” of a distribution, was above 5 for each of our data sets. A positive kurtosis value indicates that a probability density function has a heavier tail and more values near the mean than predicted by a Gaussian distribution.


We report the first genome-wide study of the mutational fitness effects of single nucleotide mutations in influenza A virus. Unlike other studies of mutational tolerance in influenza, we took great pains to define the lethal fraction, a key parameter in the distribution of MFE. We ensured the relative clonality of nearly all of our stocks by next generation sequencing and used a highly quantitative assay for fitness measurements. Both the lethal fraction and the overall distribution of MFE of influenza A are quite similar to what has been found for ssDNA and other RNA viruses. Consistent with what has been assumed, but to our knowledge never shown, the surface proteins of influenza virus appear to be slightly more tolerant of point mutation than the internal viral proteins. This finding did not achieve statistical significance. These results have important implications for quantitative models of influenza evolution and our general understanding of the mutational robustness of RNA viruses.

The MFE of many viruses were initially explored by mutation accumulation (MA) experiments involving serial plaque to plaque transfers [2529]. While these studies suggested that many mutations were deleterious, MA experiments are generally unable to assign fitness values to individual mutations. Similarly, mutagen sensitivity has been used to define the global impact of mutation and the relative robustness of viral strains or even multiple viral species [23,30,31,57,58]. Neither gives a reliable estimate of the lethal fraction and comparisons across viral families can be confounded by differences in host cells, basal viral mutation rates, and the pleiotropic effects of commonly used mutagens on cellular metabolism. More recently, several groups have used higher throughput assays to measure viral mutational tolerance [33,35,48]. While these approaches allow for impressively large datasets, they are similarly imprecise in assigning fitness values to individual mutations. Because lethal and strongly deleterious mutations will be present at extremely low frequency, they are easily lost from populations with serial passage and variably ascertained by even the best deep sequencing approaches. For example, the impressive characterization of poliovirus MFE by CirSeq could only quantify the average fitness of a mutation across a range of haplotypes, given that each mutation can potentially arise in the setting of a genome that already bears a different mutation [48]. Many mutations were also present at very low frequency, varied significantly across passages, and were presumably subject to clonal interference. Measurement of fitness effects from these data also required extensive modeling of genetic drift. In influenza A virus, two different groups have used random mutagenesis of plasmids and next generation sequencing of recovered viruses to infer the mutational robustness of hemagglutinin [33,35] and nucleoprotein [54,55]. Because many of the plasmids had multiple mutations and the sequencing assays were unable to accurately measure fitness, these important works were nevertheless unable to model the distribution of MFE. The only genome-wide study of influenza A MFE, of which we are aware, utilized transposon-based insertional mutagenesis, and is less informative about the impact of single nucleotide mutation [59].

We chose a lower throughput, and in our opinion more reliable, approach to define the distribution of MFE in influenza A virus. Sanjuan, Moya, and Elena pioneered the use of small libraries of viruses with single nucleotide substitutions to model the MFE of vesicular stomatitis virus [13]. Since this initial study, Sanjuan has used a similar approach to define the MFE of the phages QB, f1, and phiX174 [36,37], and Elena has applied it to tobacco etch virus [40]. The advantage of this approach is that it links specific mutations to fitness values, provides precise fitness measurements, and accurately estimates the lethal fraction. We improved on this method by generating a larger genome-wide library of mutants and by also including sub-libraries to compare the impact of mutations on different viral proteins. Our exhaustive characterization of the lethal fraction by repeated transfection, documentation of the relative clonality of each stock by deep sequencing, and use of a highly quantitative competition assay also ensure the quality of our model of influenza A MFE. An obvious drawback of this approach is that it was only feasible to analyze 128 single nucleotide mutations. While we took great pains to ensure random selection of our mutations, it is possible that our random sample of mutations is biased in some way from the overall set of roughly 39,000 possible SNV in influenza A viruses. We are encouraged that our distribution is similar to what has been found in both high and low throughput studies [19,48].

Fitness values are specific to the environmental conditions in which they are measured [60]. For example, in cell culture they can vary with host cell type, temperature, and multiplicity of infection. We chose to use A549 cells, as they are derived from a respiratory epithelium and support relatively efficient replication of influenza virus. The variability and inefficiency of primary cells present difficulties for a large-scale comparative study such as ours. We chose a relatively low multiplicity to reduce bias in fitness measurements due to complementation and reassortment in multiply infected cells. We note, however, that our relatively low transfer populations in the serial passage experiment could contribute to the observed variability in replicate fitness measurements through genetic drift [43]. While our cell-based assay will capture elements of fitness related to virus interactions with the host cell machinery and evasion of the innate immune response, it does not model the impact of adaptive immunity. While we would argue that many of the deleterious mutations are likely to be so in many environments, some mutations that are only weakly deleterious in cell culture may be subject to stronger purifying selection in nature. Finally, our mutations in WSN33 may not have the same fitness impact in H3N2, or other subtypes, due to intragenic and intergenic epistasis. The observed correlation between site entropy and fitness on a set of a HA and NP mutants suggests that the overall distribution of MFE will be conserved.

Across viral systems, increased genetic variability is often observed in surface or structural proteins relative to internally located, non-structural proteins. This bias could be attributable to local differences in mutation rate or mutational robustness and may have implications for evasion of host antibody [61]. It is therefore interesting that the fitness impact of mutations in the segments coding for hemagglutinin and neuraminidase appear to be slightly less than those in proteins encoded by the other six genomic segments. Host-specific evolutionary rates are also higher for HA and NA relative to the other 6 segments [62]. The 6 internal segments have similar rates, and we were underpowered to distinguish differences in robustness among them. Our finding of a trend toward increased mutational tolerance in the HA protein, and the head in particular, is consistent with the tolerance of HA to transposon insertion [59] and deep mutational scanning experiments [35]. These data suggest that HA is more robust to mutation, possibly as a consequence of this history of strong and repeated selection for antigenic escape. A byproduct of this robustness could be the greater exploration of antigenic sequence space and subsequent immune escape [63]. Here, mutational robustness would increase evolvability, or the capacity for adaptive evolution. This concept is attractive in view of the proposed epochal evolution of influenza A H3N2 on neutral networks [64]. We note that our study was of an H1N1 virus, and there may be important differences in the mutational robustness of H3N2 and H1N1 surface proteins. This is especially important given their different evolutionary rates [65]. As we and others have pointed out, the relationship between mutational robustness and evolvability is complex, and there are clearly situations in which increased robustness will either increase or decrease evolvability [16,6668].

The genome-wide distribution of mutational fitness effects of influenza A virus is similar to those of poliovirus ([48] and this work), VSV, tobacco etch virus, and phages F1, QB, and phiX174 [19,3640]. Because we relied on an assumption of exponential growth over the course of the competition assay to transform our fitness values into exponential growth rates (see Methods), it is possible that we have either over- or underestimated the similarities. Given that this collection includes a negative sense ssRNA virus (VSV), a segmented negative sense RNA virus (influenza A), three positive sense ssRNA viruses, and two ssDNA viruses, it is clear that the type and polarity of the genomic nucleic acid are not major determinants of robustness. Their similarity likely reflects shared constraints due to small and compact genomes [19]. Small genome viruses often have overlapping reading frames or regulatory elements that are more sensitive to genetic disruption. Larger genomes allow for more genetic redundancy and modularity, which could limit the impact of point mutations. Interestingly, the hypersensitivity of these viruses to mutation at an individual level may actually promote robustness at a population level [69]. In large populations under strong purifying selection, deleterious variants are rapidly purged and the wild type sequence preserved. We note that recombination and reassortment in RNA viruses are often considered to be adaptive as they could also increase robustness by purging deleterious mutations from populations [7072]. Given the association between sex and mutation fitness effects in evolutionary theory, it is interesting that we find similar robustness of an asexual virus (VSV), a recombining virus (poliovirus) and a reassorting virus (influenza). While our assay conditions minimized reassortment, a long-term life history of reassortment does not appear to have altered the intrinsic mutational fitness effects of influenza virus.

The evolutionary dynamics of influenza A virus have been intensively studied at the global and molecular level. Our data on mutational fitness effects will be useful in modeling these processes across scales. We expect that further comparative studies in this area might elucidate important similarities and differences between different influenza subtypes and between influenza and other viruses.

Materials and Methods

Viruses and cells

Madin Darby canine kidney cells (MDCK) were provided by Dr. Arnold Monto (University of Michigan) and A549 human lung epithelial cells were provided by Dr. Michael Bachman (University of Michigan). Human embryonic kidney 293T fibroblasts were provided by Dr. Raul Andino (UCSF). Both cell lines were maintained in Dulbecco’s Modified Eagle Medium (DMEM, Invitrogen) supplemented with 10% fetal bovine serum (Gibco and HyClone), 25mM HEPES (Invitrogen), and 0.1875% bovine serum albumin (Life Technologies). Viral infections were performed in DMEM supplemented with 25mM HEPES, 0.1875% bovine serum albumin, and 2μg/ml TPCK-trypsin (Worthington). The 8 plasmid reverse genetic system containing the genomic segments for influenza A/WSN33 (pHW181-PB2, pHW182-PB1, pHW-183-PA, pHW184-HA, pHW185-NP, pHW186-NA, pHW187-M, pHW188-NS) was a kind gift of Robert Webster ([41], St. Jude’s Children’s Research Hospital)

Site directed mutagenesis

An R script was used to select randomly the nucleotide position and base change for all mutants in this study and to design optimal oligonucleotides for mutagenesis by polymerase chain reaction (PCR). Noncoding and promoter regions were included. Due to sequence context and GC content, several primers had to be designed manually (S1 Table). One hundred and eight of the single nucleotide mutants were generated using the QuickChange site directed mutagenesis kit (Agilent Technologies) according to the manufacturer’s protocol. The remaining 21 single nucleotide mutants were generated by overlap extension PCR [73]. Here, the outer primers contained 5’ BsmBI or BsaI restriction sites for subsequent cloning into pHW2000 [41,74]. The barcoded PB1 segment (555) was made using the Quickchange protocol and primers pPolPB1_555f 5’ GATCACAACTCATTTCCAACGGAAACGGAGGGTGAGAGACAAT 3’ and pPolPB1-555r ATTGTCTCTCACCCTCCGTTTCCGTTGGAAATGAGTTGTGATC. In each plasmid clone, the presence of the desired mutation(s) and the absence of second site mutations were verified by sequencing of the entire influenza segment.

Transfection and viral stocks

Equal quantities of MDCK and 293T cells were seeded at a total density of 1,000,000 cells per well of a 6 well plate 24 hours prior to transfection in complete DMEM (see above). Each transfection mixture contained 1μg of the mutant plasmid, 1μg of each of the other 7 wild type plasmids, 16μl of TransIT-LT1 (Mirus) and 250μl Optimem (Gibco). These reagents were incubated together for 45 minutes at room temperature and applied dropwise to the cellular monolayer. After 24 hours, the media was changed to viral infection media (see above). Recombinant passage 0 virus was harvested 48 hours post-transfection, clarified by centrifugation at 200 x g for 3 minutes, and stored in aliquots with 0.5% glycerol at minus 70°C. Passage 1 (P1) stocks were generated by a single passage on MDCK cells at a multiplicity of infection (moi) ≤ 0.001. Virus was applied to cells for one hour and aspirated, and then viral media was added. Passage 1 stocks were harvested at 48 hours post-infection. All stocks were titered by tissue culture infectious dose (TCID50, [75]). We subjected all transfection supernatants with undetectable P0 titers to blind passaging. One or 0.333 ml of virus was applied to a confluent T75 or T25 flask, respectively. These cultures were monitored for cytopathic effect and titered at 48 hours.

Next generation sequencing

Viral RNA was harvested from 200μl of each P1 supernatant using either Purelink Viral RNA (Invitrogen) or Qiamp Viral RNA (Qiagen) kits. Multiplex reverse transcription-PCR amplification of all 8 influenza virus genome segments was performed on RNA samples using Superscript III with HiFi platinum Taq (Invitrogen 12574) and primers Uni12/Inf1 (5’-GGGGGGAGCAAAAGCAGG-3’), Uni12/Inf3 (5’-GGGGGAGCGAAAGCAGG-3’), and Uni13/Inf1 (5’-CGGGTTATTAGTAGAAACAAGG-3’) [76]. Seven hundred fifty nanograms of the each amplified cDNA were sheared to an average size of 300 to 400 bp using a Covaris S220 focused ultrasonicator. Sequencing libraries were prepared using the NEBNext Ultra DNA library prep kit (NEB E7370L), Agencourt AMPure XP beads (Beckman Coulter A63881), and NEBNext multiplex oligonucleotides for Illumina (NEB E7600S). Indexed samples were pooled in equal quantities and sequenced on an Illumina MiSeq instrument with 2 x 250-base paired end reads.

Sequencing reads that passed standard Illumina quality control filters were binned by index and aligned to the reference genome using bowtie [77]. Single nucleotide variants (SNV) were identified and analyzed using DeepSNV [78]. The DeepSNV algorithm relies on a clonal control to estimate the local error rate within a given sequence context and to identify strand bias in base calling. It then applies a hierarchical binomial model based on mutation calls for test and control at each base and position to identify true-positive SNV. The clonal control was a library prepared in an identical fashion from 8 plasmids containing the A/WSN33/H1N1 genome and sequenced in the same flow cell. Code for our implementation of DeepSNV can be found at, and can be accessed anonymously. True positive SNV were identified from the raw output tables (S2 Table) by applying the following filtering criteria in R: (i) Bonferonni corrected p value < 0.01, (ii) average MapQ score on variant reads > 30, (iii) average phred score on variant positions > 35, (iv) average position of variant call on a read > 50 and < 200, (v) variant frequency > 0.02. Application of these criteria to a viral spike-in dataset with equivalent genome copy number inputs resulted in > 95% sensitivity and > 99.99% specificity for SNV present at > 1% of the population [79].

Competition assays

Competitions were performed on A549 cells in 12 well plates, plated at a density of 2.6 x 105 per well 24 hours prior to infection. Cells were infected at a total MOI of 0.01 with an equal TCID50 of WT and a given mutant. Three replicate wells were infected with each pair of viruses. Passage 1 virus was harvested after 48 hours and one replicate was titered by TCID50. This titer was used to calculate the dilution factor necessary to maintain an MOI of 0.01 for subsequent passages. Four passages were performed for each competition. RNA was harvested from each passage using PureLink 96 well RNA mini kits (Invitrogen). Random hexamers were used to prime cDNA synthesis with 1/10 of the RNA. Each cDNA was analyzed by real time PCR using three different primer sets with duplicate PCR reactions for each sample/primer set. The first set, PB1_149f 5’ CAGAAAGGGGAAGATGGACA 3’ and PB1_360r 5’ GTCCACTCGTGTTTGCTGAA 3’, were used to quantify total viral genomic RNA. The second set, pPol1PB1_555f 5’ TCAGAGAAAGAGACGAGTGAG 3’ and pPol1PB1_555r 5’ AAACCCCCTTATTTGCATCC 3’, were used to quantify the amount of mutant viral RNA. The third set, pPol1PB1_555fm 5’ ATTTCCAACGGAAACGGAGGG 3’ and pPol1PB1_555r 5’ AAACCCCCTTATTTGCATCC3’, were used to quantify the amount of barcoded WT viral RNA. We verified that the levels of RNA (Cycle threshold, Ct) were well correlated with the infectious titer (TCID50/ml) for the mutants shown in Fig 1B. Duplicate wells were averaged and values were excluded from subsequent analysis if duplicates wells differed by > 0.5 Ct or any of the Ct were out of the empirically determined linear range for that primer pair. Relative amounts of WT and mutant RNA were determined by normalizing the cycle thresholds for each to those of the common (PB1_149f and PB1_360r) primer set (DCt = CtVirus-CtPB1 total). The normalized values for each virus passages 1–4 were then compared to passage 0 to obtain a ratio relative to P1 (ΔΔCt = CtPX-CtP0). This relative Ct value was converted to reflect the fold change (Δratio = 2-logΔΔCt). The change in ratio of the mutant relative to the change in ratio of the WT as a function of passage is the fitness ([ΔratioMut-ΔratioWT]/time).

Transformation of fitness into relative growth rate

Except for poliovirus, the data on other viruses were extracted from [19]. Here, the relative fitness values for VSV, QB, phiX174, and F1 were calculated as differences in the exponential growth rate. The data on TEV were originally derived by qPCR as in the present study. In this comparative analysis, Sanjuan applied a correction to the TEV data to enable direct comparisons of growth rate. We applied this same correction to our data to transform the values to something resembling an exponential growth rate. The exponential growth rate, r, of influenza A on A549 in a 12 well dish at moi 0.01, is derived from the equation: Nt = N0ert, where t = 2 days (48 hour passage), N0 is 10,000 infectious units (the amount at time 0), and Nt is the titer after time t, or 48 hours. For Nt, we used the mean of three replicates for HA-40 (9.3e5), which has a fitness of 1 (see Fig 1B). This is similar to the scaled output of WSN33 that we have observed at different moi and different well sizes. Solving for r gives a value of 2.26 per day. We then applied the following correction from [19], y = (lnx + r)/r, where y is the relative exponential growth rate of a given mutant, x is the relative fitness measured by qPCR and reported in Tables 1 and 3, and r is the estimate of the WT exponential growth rate under our experimental conditions (2.26 per day). For example, a fitness of 0.8 in our assay is transformed to a relative exponential growth rate of 0.90. The relative exponential growth rates, as calculated, are reported in S5 Table. We note that using a value 2e6 for Nt (which would be roughly consistent with the average dilution factor applied at each passage in many of the competition experiments, gives a growth rate of 2.65 per day and relative exponential growth rate of 0.916 for a virus with a fitness of 0.8.


All secondary analysis and statistical tests were performed in R and GraphPad Prism6. R and Python scripts for analysis of Influenza Research Database data (as of July 8, 2016) and other analyses are available at

Supporting Information

S1 Fig. Distribution of single nucleotide variants in the influenza A genome.

(A) A randomly selected genome-wide dataset (n = 95), in which the number of mutants generated per segment (grey bars) closely matches the expected distribution based on percentage of the genome contained on each segment (black bars). (B) Extra HA and NA mutations were generated to compare larger numbers of mutations on these segments, n = 57, to those on the other 6 (internal) segments, n = 71.


S2 Fig. Frequency of mutations in the Influenza Research Database.

Shown are the frequencies of nonsynonymous amino acid substitutions in our mutant dataset (y-axis) and their fitness values (x-axis).


S2 Table. Secondary mutations as determined by Illumina sequencing.


S3 Table. Site entropy and site preference values for HA and NP mutants.


S4 Table. Raw data on mutation frequency from the Influenza Research Database.


S5 Table. Relative exponential growth rates for all mutants.



We thank Raul Andino, Ashley Acevedo, and Leonid Brodsky for providing the raw data on the poliovirus mutational fitness effects. We thank Matt Pauly for creating the mutation selection script, Anna Berezovsky and Keqin Gong for technical assistance, and Robert Woods for helpful discussion and a critical reading of the manuscript.

Author Contributions

  1. Conceived and designed the experiments: EV SEW JTM WF ASL.
  2. Performed the experiments: EV SEW JTM WF ASL.
  3. Analyzed the data: EV SEW JTM WF ASL.
  4. Contributed reagents/materials/analysis tools: EV SEW JTM WF ASL.
  5. Wrote the paper: EV ASL.


  1. 1. Webster RG, Bean WJ, Gorman OT, Chambers TM, Kawaoka Y. Evolution and ecology of influenza A viruses. Microbiol Rev. 1992;56: 152–179. pmid:1579108
  2. 2. Nelson MI, Holmes EC. The evolution of epidemic influenza. Nat Rev Genet. 2007;8: 196–205. pmid:17262054
  3. 3. Ghedin E, Sengamalay NA, Shumway M, Zaborsky J, Feldblyum T, Subbu V, et al. Large-scale sequencing of human influenza reveals the dynamic nature of viral genome evolution. Nature. 2005;437: 1162–1166. pmid:16208317
  4. 4. Rambaut A, Pybus OG, Nelson MI, Viboud C, Taubenberger JK, Holmes EC. The genomic and epidemiological dynamics of human influenza A virus. Nature. 2008;453: 615–619. pmid:18418375
  5. 5. Sanjuán R, Nebot MR, Chirico N, Mansky LM, Belshaw R. Viral mutation rates. J Virol. 2010;84: 9733–9748. pmid:20660197
  6. 6. Nobusawa E, Sato K. Comparison of the mutation rates of human influenza A and B viruses. J Virol. 2006;80: 3675–3678. pmid:16537638
  7. 7. Parvin JD, Moscona A, Pan WT, Leider JM, Palese P. Measurement of the mutation rates of animal viruses: influenza A virus and poliovirus type 1. J Virol. 1986;59: 377–383. pmid:3016304
  8. 8. Marshall N, Priyamvada L, Ende Z, Steel J, Lowen AC. Influenza Virus Reassortment Occurs with High Frequency in the Absence of Segment Mismatch. Basler CF, editor. PLoS Pathog. 2013;9: e1003421. pmid:23785286
  9. 9. Tao H, Steel J, Lowen AC. Intrahost Dynamics of Influenza Virus Reassortment. J Virol. 2014;88: 7485–7492. pmid:24741099
  10. 10. Russell CA, Fonville JM, Brown AEX, Burke DF, Smith DL, James SL, et al. The potential for respiratory droplet-transmissible A/H5N1 influenza virus to evolve in a mammalian host. Science. 2012;336: 1541–1547. pmid:22723414
  11. 11. Elena SF, Sanjuán R. Adaptive value of high mutation rates of RNA viruses: separating causes from consequences. J Virol. 2005;79: 11555–11558. pmid:16140732
  12. 12. Belshaw R, Gardner A, Rambaut A, Pybus OG. Pacing a small cage: mutation and RNA viruses. Trends Ecol Evol (Amst). 2008;23: 188–193.
  13. 13. Sanjuán R, Moya A, Elena SF. The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virus. Proc Natl Acad Sci USA. 2004;101: 8396–8401. pmid:15159545
  14. 14. de Visser JAGM, Hermisson J, Wagner GP, Ancel Meyers L, Bagheri-Chaichian H, Blanchard JL, et al. Perspective: Evolution and detection of genetic robustness. Evolution. 2003;57: 1959–1972. pmid:14575319
  15. 15. Elena SF, Carrasco P, Darós J-A, Sanjuán R. Mechanisms of genetic robustness in RNA viruses. EMBO Rep. 2006;7: 168–173. pmid:16452927
  16. 16. Lauring AS, Frydman J, Andino R. The role of mutational robustness in RNA virus evolution. Nat Rev Micro. Nature Publishing Group; 2013;11: 327–336.
  17. 17. Wilke CO, Adami C. Evolution of mutational robustness. Mutat Res. 2003;522: 3–11. pmid:12517406
  18. 18. Lauring AS, Acevedo A, Cooper SB, Andino R. Codon Usage Determines the Mutational Robustness, Evolutionary Capacity, and Virulence of an RNA Virus. Cell Host and Microbe. Elsevier Inc; 2012;12: 623–632.
  19. 19. Sanjuán R. Mutational fitness effects in RNA and single-stranded DNA viruses: common patterns revealed by site-directed mutagenesis studies. Philos Trans R Soc Lond, B, Biol Sci. 2010;365: 1975–1982. pmid:20478892
  20. 20. Fonville JM. Expected Effect of Deleterious Mutations on Within-Host Adaptation of Pathogens. Kirkegaard K, editor. J Virol. 2015;89: 9242–9251. pmid:26109724
  21. 21. Koelle K, Rasmussen DA. The effects of a deleterious mutation load on patterns of influenza A/H3N2's antigenic evolution in humans. Traulsen A, editor. eLife. eLife Sciences Publications Limited; 2015;4: e07361.
  22. 22. Bull JJ, Sanjuán R, Wilke CO. Theory of lethal mutagenesis for viruses. J Virol. 2007;81: 2930–2939. pmid:17202214
  23. 23. Graci JD, Gnädig NF, Galarraga JE, Castro C, Vignuzzi M, Cameron CE. Mutational robustness of an RNA virus influences sensitivity to lethal mutagenesis. J Virol. 2012;86: 2869–2873. pmid:22190724
  24. 24. O'Dea EB, Keller TE, Wilke CO. Does Mutational Robustness Inhibit Extinction by Lethal Mutagenesis in Viral Populations? Wahl L, editor. PLoS Comput Biol. 2010;6: e1000811. pmid:20548958
  25. 25. Lázaro E, Escarmís C, Domingo E, Manrubia SC. Modeling viral genome fitness evolution associated with serial bottleneck events: evidence of stationary states of fitness. J Virol. 2002;76: 8675–8681. pmid:12163587
  26. 26. Duarte E, Clarke D, Moya A, Domingo E, Holland J. Rapid fitness losses in mammalian RNA virus clones due to Muller's ratchet. Proc Natl Acad Sci USA. National Academy of Sciences; 1992;89: 6015–6019.
  27. 27. Clarke DK, Duarte EA, Moya A, Elena SF, Domingo E, Holland J. Genetic bottlenecks and population passages cause profound fitness differences in RNA viruses. J Virol. 1993;67: 222–228. pmid:8380072
  28. 28. Eyre-Walker A, Keightley PD. The distribution of fitness effects of new mutations. Nat Rev Genet. 2007;8: 610–618. pmid:17637733
  29. 29. Montville R, Froissart R, Remold SK, Tenaillon O, Turner PE. Evolution of mutational robustness in an RNA virus. PLoS Biol. 2005;3: e381. pmid:16248678
  30. 30. Lee CH, Gilbertson DL, Novella IS, Huerta R, Domingo E, Holland JJ. Negative effects of chemical mutagenesis on the adaptive behavior of vesicular stomatitis virus. J Virol. 1997;71: 3636–3640. pmid:9094637
  31. 31. Crotty S, Cameron CE, Andino R. RNA virus error catastrophe: direct molecular test by using ribavirin. Proc Natl Acad Sci USA. 2001;98: 6895–6900. pmid:11371613
  32. 32. Novella IS, Presloid JB, Beech C, Wilke CO. Congruent Evolution of Fitness and Genetic Robustness in Vesicular Stomatitis Virus. J Virol. 2013;87: 4923–4928. pmid:23408631
  33. 33. Wu NC, Young AP, Al-Mawsawi LQ, Olson CA, Feng J, Qi H, et al. High-throughput profiling of influenza A virus hemagglutinin gene at single-nucleotide resolution. Sci Rep. 2014;4.
  34. 34. Qi H, Olson CA, Wu NC, Ke R, Loverdo C, Chu V, et al. A quantitative high-resolution genetic profile rapidly identifies sequence determinants of hepatitis C viral fitness and drug sensitivity. PLoS Pathog. Public Library of Science; 2014;10: e1004064.
  35. 35. Thyagarajan B, Bloom JD. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. eLife. 2014;3: e03300–e03300.
  36. 36. Peris JB, Davis P, Cuevas JM, Nebot MR, Sanjuán R. Distribution of fitness effects caused by single-nucleotide substitutions in bacteriophage f1. Genetics. Genetics Society of America; 2010;185: 603–609.
  37. 37. Domingo-Calap P, Cuevas JM, Sanjuán R. The fitness effects of random mutations in single-stranded DNA and RNA bacteriophages. PLoS Genet. 2009;5: e1000742. pmid:19956760
  38. 38. Cuevas JM, Domingo-Calap P, Sanjuán R. The fitness effects of synonymous mutations in DNA and RNA viruses. Mol Biol Evol. 2012;29: 17–20. pmid:21771719
  39. 39. Cuevas JM, Duffy S, Sanjuán R. Point mutation rate of bacteriophage PhiX174. Genetics. 2009;183: 747–749. pmid:19652180
  40. 40. Carrasco P, la Iglesia de F, Elena SF. Distribution of Fitness and Virulence Effects Caused by Single-Nucleotide Substitutions in Tobacco Etch Virus. J Virol. American Society for Microbiology; 2007;81: 12979–12984.
  41. 41. Hoffmann E, Neumann G, Kawaoka Y, Hobom G, Webster RG. A DNA transfection system for generation of influenza A virus from eight plasmids. Proc Natl Acad Sci USA. 2000;97: 6108–6113. pmid:10801978
  42. 42. Cuevas JM, Domingo-Calap P, Sanjuán R. The fitness effects of synonymous mutations in DNA and RNA viruses. Mol Biol Evol. Oxford University Press; 2012;29: 17–20.
  43. 43. Voronin Y, Holte S, Overbaugh J, Emerman M. Genetic drift of HIV populations in culture. PLoS Genet. 2009;5: e1000431. pmid:19300501
  44. 44. Jagger BW, Wise HM, Kash JC, Walters K-A, Wills NM, Xiao Y-L, et al. An overlapping protein-coding region in influenza A virus segment 3 modulates the host response. Science. American Association for the Advancement of Science; 2012;337: 199–204.
  45. 45. Watanabe T, Watanabe S, Noda T, Fujii Y, Kawaoka Y. Exploitation of Nucleic Acid Packaging Signals To Generate a Novel Influenza Virus-Based Vector Stably Expressing Two Foreign Genes. J Virol. 2003;77: 10575–10583. pmid:12970442
  46. 46. Doud MB, Bloom JD. Accurate Measurement of the Effects of All Amino-Acid Mutations on Influenza Hemagglutinin. Viruses. 2016;8.
  47. 47. Xu R, Ekiert DC, Krause JC, Hai R, Crowe JE, Wilson IA. Structural basis of preexisting immunity to the 2009 H1N1 pandemic influenza virus. Science. American Association for the Advancement of Science; 2010;328: 357–360.
  48. 48. Acevedo A, Brodsky L, Andino R. Mutational and fitness landscapes of an RNA virus revealed through population sequencing. Nature. 2014;505: 686–690. pmid:24284629
  49. 49. Elena SF, Ekunwe L, Hajela N, Oden SA, Lenski RE. Distribution of fitness effects caused by random insertion mutations in Escherichia coli. Genetica. 1998;102–103: 349–358. pmid:9720287
  50. 50. Bloom JD, Gong LI, Baltimore D. Permissive secondary mutations enable the evolution of influenza oseltamivir resistance. Science. 2010;328: 1272–1275. pmid:20522774
  51. 51. Gong LI, Suchard MA, Bloom JD. Stability-mediated epistasis constrains the evolution of an influenza protein. eLife. 2013;2: e00631. pmid:23682315
  52. 52. Gong LI, Bloom JD. Epistatically Interacting Substitutions Are Enriched during Adaptive Protein Evolution. Weinreich DM, editor. PLoS Genet. 2014;10: e1004328. pmid:24811236
  53. 53. Kryazhimskiy S, Dushoff J, Bazykin GA, Plotkin JB. Prevalence of epistasis in the evolution of influenza A surface proteins. PLoS Genet. 2011;7: e1001301. pmid:21390205
  54. 54. Bloom JD. An experimentally determined evolutionary model dramatically improves phylogenetic fit. Mol Biol Evol. Oxford University Press; 2014;31: 1956–1978.
  55. 55. Doud MB, Ashenberg O, Bloom JD. Site-Specific Amino Acid Preferences Are Mostly Conserved in Two Closely Related Protein Homologs. Mol Biol Evol. 2015;32: 2944–2960. pmid:26226986
  56. 56. Peris JB, Davis P, Cuevas JM, Nebot MR, Sanjuán R. Distribution of fitness effects caused by single-nucleotide substitutions in bacteriophage f1. Genetics. Genetics; 2010;185: 603–609. pmid:20382832
  57. 57. Pauly MD, Lauring AS. Effective lethal mutagenesis of influenza virus by three nucleoside analogs. J Virol. 2015;89: 3584–3597. pmid:25589650
  58. 58. Smith EC, Blanc H, Vignuzzi M, Denison MR. Coronaviruses Lacking Exoribonuclease Activity Are Susceptible to Lethal Mutagenesis: Evidence for Proofreading and Potential Therapeutics. Diamond MS, editor. PLoS Pathog. 2013;9: e1003565. pmid:23966862
  59. 59. Heaton NS, Sachs D, Chen C-J, Hai R, Palese P. Genome-wide mutagenesis of influenza virus reveals unique plasticity of the hemagglutinin and NS1 proteins. Proceedings of the National Academy of Sciences. National Acad Sciences; 2013;110: 20248–20253.
  60. 60. Orr HA. Fitness and its role in evolutionary genetics. Nat Rev Genet. 2009;10: 531–539. pmid:19546856
  61. 61. Cuevas JM, Geller R, Garijo R, López-Aldeguer J, Sanjuán R. Extremely High Mutation Rate of HIV-1 In Vivo. Rowland-Jones SL, editor. PLoS Biol. 2015;13: e1002251–19. pmid:26375597
  62. 62. Worobey M, Han G-Z, Rambaut A. A synchronized global sweep of the internal genes of modern avian influenza virus. Nature. Nature Publishing Group; 2014;: 1–16.
  63. 63. Plotkin JB, Dushoff J. Codon bias and frequency-dependent selection on the hemagglutinin epitopes of influenza A virus. Proc Natl Acad Sci USA. 2003;100: 7152–7157. pmid:12748378
  64. 64. Koelle K, Cobey S, Grenfell B, Pascual M. Epochal evolution shapes the phylodynamics of interpandemic influenza A (H3N2) in humans. Science. 2006;314: 1898–1903. pmid:17185596
  65. 65. Bedford T, Riley S, Barr IG, Broor S, Chadha M, Cox NJ, et al. Global circulation patterns of seasonal influenza viruses vary with antigenic drift. Nature. 2015;523: 217–220. pmid:26053121
  66. 66. McBride RC, Ogbunugafor CB, Turner PE. Robustness promotes evolvability of thermotolerance in an RNA virus. BMC Evol Biol. 2008;8: 231. pmid:18694497
  67. 67. Cuevas JM, Moya A, Sanjuán R. A genetic background with low mutational robustness is associated with increased adaptability to a novel host in an RNA virus. J Evol Biol. 2009;22: 2041–2048. pmid:19702844
  68. 68. Elena SF. RNA virus genetic robustness: possible causes and some consequences. Curr Opin Virol. 2012.
  69. 69. Krakauer DC, Plotkin JB. Redundancy, antiredundancy, and the robustness of genomes. Proc Natl Acad Sci USA. 2002;99: 1405–1409. pmid:11818563
  70. 70. Fonville JM, Marshall N, Tao H, Steel J, Lowen AC. Influenza Virus Reassortment Is Enhanced by Semi-infectious Particles but Can Be Suppressed by Defective Interfering Particles. Lauring A, editor. PLoS Pathog. 2015;11: e1005204–30. pmid:26440404
  71. 71. Xiao Y, Rouzine IM, Bianco S, Acevedo A, Goldstein EF, Farkov M, et al. RNA Recombination Enhances Adaptability and Is Required for Virus Spread and Virulence. Cell Host and Microbe. Elsevier Inc; 2016;19: 493–503.
  72. 72. Simon-Loriere E, Holmes EC. Why do RNA viruses recombine? Nat Rev Micro. 2011;9: 617–626.
  73. 73. Ho SN, Hunt HD, Horton RM, Pullen JK, Pease LR. Site-directed mutagenesis by overlap extension using the polymerase chain reaction. Gene. 1989;77: 51–59. pmid:2744487
  74. 74. Hoffmann E, Stech J, Guan Y, Webster RG, Perez DR. Universal primer set for the full-length amplification of all influenza A viruses. Arch Virol. 2001;146: 2275–2289. pmid:11811679
  76. 76. Zhou B, Donnelly ME, Scholes DT, St George K, Hatta M, Kawaoka Y, et al. Single-reaction genomic amplification accelerates sequencing and vaccine production for classical and Swine origin human influenza a viruses. J Virol. 2009;83: 10309–10313. pmid:19605485
  77. 77. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10: R25. pmid:19261174
  78. 78. Gerstung M, Beisel C, Rechsteiner M, Wild P, Schraml P, Moch H, et al. Reliable detection of subclonal single-nucleotide variants in tumour cell populations. Nature Communications. Nature Publishing Group; 1AD;3: 811–8.
  79. 79. McCrone JT, Lauring AS. Measurements of intrahost viral diversity are extremely sensitive to systematic errors in variant calling. J Virol. American Society for Microbiology; 2016;: JVI.00667–16.