Figure 1.
The three secondary mutations V234M, R222Q, and D344N largely explain the differences in total surface-expressed activity and protein between 1999 and 2007 seasonal H1N1 neuraminidases.
Shown are wildtype (WT) and indicated mutants of the A/New Caledonia/20/1999 neuraminidase, in addition to WT and H274Y neuraminidases from the A/Brisbane/59/2007 (BR07) strain. All neuraminidases contain C-terminal epitope tags, except for the untagged WT and H274Y A/New Caledonia/20/1999 variants. For the measurements, 293T cells were transfected with plasmids encoding the neuraminidase proteins. After 20 hours, the cells were assayed for the total surface-expressed neuraminidase activity (top panel) or protein using an antibody against the epitope tag (bottom panel). Bars show the mean and standard error for at least six replicates.
Figure 2.
PIPS is the most effective computational approach for retrospectively identifying the secondary mutations that increased seasonal H1N1 neuraminidase surface expression and activity.
The histograms show the distribution of predicted effects for all possible single amino-acid mutations to the A/New Caledonia/20/1999 neuraminidase, for each of the four computational approaches (CUPSAT, FOLDX, the consensus approach, and PIPS). The A/Brisbane/59/2007 strain contains nine mutations in the crystallized ectodomain portion of the neuraminidase relative to the A/New Caledonia/20/1999 strain. The three mutations that were experimentally shown to enhance neuraminidase surface expression or activity (R222Q, V234M, and D344N) are indicated with red squares, while the other six mutations are indicated with green circles. The units for the different prediction methods are arbitrary, but in all cases more negative numbers correspond to mutations that are predicted to be more favorable. Shown are one-sided P-values for the hypothesis that the prediction method assigns more negative values to the known enhancing mutations (red squares) than to the other six mutations (green circles), as determined using the Mann-Whitney test. The most successful computational approach appears to be PIPS, which correctly places all three red squares to the left of all six green circles.
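To illustrate the one-sided Mann-Whitney comparison described in this caption, the sketch below computes an exact P-value by enumerating every way of assigning the "enhancing" label to three of nine scores. The scores are hypothetical placeholders, not the predictions plotted in the figure, and the function name is an assumption introduced here. The published analysis may have used a different implementation of the test.

```python
from itertools import combinations


def one_sided_mann_whitney_p(enhancing, others):
    """Exact one-sided P-value that `enhancing` scores tend to be lower than `others`.

    Enumerates every relabeling of the pooled scores, so it is only
    practical for small samples such as 3 versus 6.
    """
    def u_stat(a, b):
        # Number of (a, b) pairs in which a is larger; ties count one half.
        # A small U means the values in `a` tend to be more negative.
        return sum((x > y) + 0.5 * (x == y) for x in a for y in b)

    pooled = list(enhancing) + list(others)
    n = len(enhancing)
    observed = u_stat(enhancing, others)

    at_least_as_extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if u_stat(a, b) <= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Hypothetical scores: all three enhancing mutations score lower than the other six.
enhancing = [-3.1, -2.4, -1.9]
others = [-0.8, -0.2, 0.3, 0.9, 1.4, 2.2]
p = one_sided_mann_whitney_p(enhancing, others)
print(round(p, 4))  # 0.0119
```

With three enhancing mutations and six others there are 84 possible labelings, so perfect separation gives the smallest attainable one-sided P-value, 1/84.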
Table 1.
Top twelve PIPS-predicted mutations to the pandemic H1N1 neuraminidase.
Figure 3.
Several of the predicted secondary mutations partially counteract the decrease that H274Y causes in total surface-expressed activity and protein for the pandemic H1N1 neuraminidase.
Shown are wildtype (WT) and indicated mutants of the A/California/4/2009 neuraminidase. All neuraminidases contain C-terminal epitope tags, except for the untagged WT. For the measurements, 293T cells were transfected with plasmids encoding the neuraminidase proteins. After 20 hours, the cells were assayed for the total surface-expressed neuraminidase activity (top panel) or protein using an antibody against the epitope tag (bottom panel). Bars show the mean and standard error for at least six replicates.
Figure 4.
Combining several secondary mutations can fully counteract the effect of H274Y on surface-expressed pandemic H1N1 neuraminidase activity.
Shown are wildtype (WT) and indicated mutants of the A/California/4/2009 neuraminidase, all containing C-terminal epitope tags. For the measurements, 293T cells were transfected with plasmids encoding the neuraminidase proteins. After 20 hours, the cells were assayed for the total surface-expressed neuraminidase activity (top panel) or protein using an antibody against the epitope tag (bottom panel). Bars show the mean and standard error for at least six replicates.
Figure 5.
Growth in tissue culture of pandemic H1N1 variants carrying neuraminidase mutations.
The plot at left shows growth in media lacking oseltamivir, while the plot at right shows growth in media containing 50 nM oseltamivir. Viruses contain all genes from the A/California/4/2009 strain with the T197A mutation to hemagglutinin, with the exception of the PB1 segment which is engineered to carry GFP. MDCK-SIAT1-CMV-PB1 cells were infected with the viruses at initial multiplicities of infection of infectious particles per cell. At the indicated times, viral supernatants were harvested and titered on fresh cells. Shown are the mean and standard error for four replicates.
Figure 6.
Sites of the mutations mapped onto the neuraminidase protein structure.
Shown in dark green is one monomer from an N1 neuraminidase crystal structure ([37], PDB code 3BEQ). Residue 274 (N2 numbering) is shown in red, and the sites of the secondary mutations (N1 numbering) are shown in blue. Oseltamivir (yellow spheres) is modeled in its binding site based on a related crystal structure ([83], PDB code 2HU0). The other three monomers of the full neuraminidase tetramer are shown in light green, based on modeling from a related crystal structure ([83], PDB code 2HU0). The image was rendered with PyMOL.
Figure 7.
Rationale for assuming that the fixation probability of a mutation depends on its effect on evolutionarily constrained protein properties.
(A) Evolution is assumed to select in a threshold manner for properties such as folding, stability, or expression (approximated by the variable ΔG). A mutation deleterious to ΔG will not be tolerated by a protein that has a marginal value of ΔG (top panel). But the same mutation is tolerated by a protein with an extra buffer in ΔG (bottom panel). (B) Most mutations are deleterious to ΔG, and therefore have positive ΔΔG values. Shown is an example distribution of ΔΔG for all mutations to a protein, taken from [49]. (C) The time-averaged probability distribution of ΔG for an evolving protein will tend towards values just marginally below the threshold. Shown is an example of this distribution, taken from [49]. (D) As a consequence, mutations with negative ΔΔG values will generally be tolerated, but those with positive ΔΔG are less likely to be tolerated. Shown is a plot of the probability that mutating a residue from one amino acid to another will be tolerated as a function of the associated ΔΔG value, as defined in Equation 3.
Figure 8.
An example phylogenetic tree.
This tree shows the sequence data for five sequences at a single site. The amino acid codes at the tips of the branches show the residue identities for the five sequences at this site. The variables at the internal nodes are the amino acid identities at the site for the ancestral sequences, and must be inferred. The numbers next to the nodes are unique identifiers for the nodes. The branch lengths are proportional to the time since the divergence of the sequences.
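Because the residues at the internal nodes are unobserved, one standard way to handle them is to sum over every possible ancestral identity using Felsenstein's pruning algorithm. The sketch below is an illustration under stated assumptions, not necessarily the computation used here: the tree topology, tip residues, and branch lengths are hypothetical, and the equal-rates substitution model is a simple stand-in chosen for brevity.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
K = len(AMINO_ACIDS)


def transition_prob(x, y, t):
    """Probability that residue x becomes y over branch length t.

    Assumes an equal-rates model (all substitutions equally likely,
    one expected substitution per unit branch length).
    """
    decay = math.exp(-K * t / (K - 1))
    if x == y:
        return 1.0 / K + (1.0 - 1.0 / K) * decay
    return 1.0 / K - decay / K


def conditional_likelihoods(node):
    """Likelihood of the tips below `node`, for each possible residue at `node`.

    A tip is a one-letter string; an internal node is a tuple of
    (child, branch_length) pairs.
    """
    if isinstance(node, str):
        return [1.0 if aa == node else 0.0 for aa in AMINO_ACIDS]
    result = [1.0] * K
    for child, t in node:
        child_l = conditional_likelihoods(child)
        for i, x in enumerate(AMINO_ACIDS):
            # Sum over the unobserved residue y at the child node.
            result[i] *= sum(
                transition_prob(x, y, t) * child_l[j]
                for j, y in enumerate(AMINO_ACIDS)
            )
    return result


def site_likelihood(root):
    """Likelihood of the site, with a uniform prior on the root residue."""
    return sum(conditional_likelihoods(root)) / K


# Hypothetical tree: five tips and four internal nodes (root, n1, n2, n3).
n1 = (("A", 0.1), ("A", 0.1))
n3 = (("A", 0.1), ("T", 0.1))
n2 = ((n3, 0.05), ("S", 0.3))
root = ((n1, 0.1), (n2, 0.2))
likelihood = site_likelihood(root)
print(likelihood)
```

Representing each internal node as a tuple of (child, branch length) pairs keeps the recursion short; a production implementation would typically use a dedicated tree library and an empirical amino-acid substitution matrix.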