Fig 1.
Heat-map illustrating the relationship between molecular and antigenic evolution.
Cells are colored by mean log HI titer for each pairing of antiserum and test virus present in the full dataset. Test viruses and reference viruses used to generate post-infection ferret antisera are sorted phylogenetically on the HA gene along the vertical and horizontal axes respectively. Phylogenies are shown to the left for test viruses and above for reference viruses. The color key for HI titers is shown in the histogram at top left along with the number of assays yielding each titer. S1 Fig provides examples of the observed variability in HI titer for the most frequently used virus.
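As a minimal sketch of how each heat-map cell value could be computed, the snippet below averages log-transformed titers for each pairing of test virus and antiserum. The base-2 logarithm, the record layout, and the virus and antiserum names are assumptions for illustration; they are not taken from the dataset.

```python
import math
from collections import defaultdict

def mean_log_titers(assays):
    """Average log2 HI titers for every (test virus, antiserum) pair."""
    grouped = defaultdict(list)
    for virus, antiserum, titer in assays:
        grouped[(virus, antiserum)].append(math.log2(titer))
    return {pair: sum(vals) / len(vals) for pair, vals in grouped.items()}

# Hypothetical assay records: (test virus, antiserum, HI titer).
assays = [
    ("virus_A", "serum_X", 640),
    ("virus_A", "serum_X", 1280),
    ("virus_B", "serum_X", 80),
]
cell_values = mean_log_titers(assays)
```

Pairs that were never assayed simply have no entry, which corresponds to an empty cell in the heat-map.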
Fig 2.
HA positions implicated in antigenic evolution and locations of associated substitutions in HA1 phylogeny.
(A) Amino acid positions that affect antigenic phenotype modeled on the HA structure of A/Puerto Rico/8/34 (Protein Data Bank ID: 1RU7) [24]. Surface representation of the front monomer is shown with HA2 in cyan and HA1 in blue with residues of the receptor-binding site colored pink. Positions with substitutions that can explain antigenic change in multiple locations across the phylogeny are shown in red. Residues adjacent to the position of the K130 deletion are colored orange, and the locations of the co-occurring R43L, F71I and S271P substitutions are colored yellow. Residues are labeled on the front HA1 monomer and shown as spheres on the remaining backbones. (B) HA1 phylogeny showing positions of significant antigenic substitutions. Color changes mark the locations of substitutions associated with changes in antigenic phenotype of at least 0.5 antigenic units. The position of the branch associated with the greatest drop in cross-reactivity is marked (*). Black circles indicate the positions of viruses included in the influenza vaccine over the period of HI data collection and are labeled: A/Bayern/7/95 (V1), A/Beijing/262/95 (V2), A/New Caledonia/20/99 (V3), A/Solomon Islands/3/2006 (V4) and A/Brisbane/59/2007 (V5). Branch length indicates the estimated number of nucleotide substitutions per site.
Table 1.
HA1 amino acid substitutions that correlate with antigenic change.
Fig 3.
Observed and predicted antigenic impact of amino acid substitutions.
The mean antigenic impact of each substitution predicted from modeling (Table 1) is plotted against the mean observed impact averaged across antisera in the panel (S3 Table). 95% confidence intervals are shown for both. Each point shows the observed mean antigenic impact (ΔHA, change in HI titer for a recombinant virus relative to its parent virus) of a particular amino acid substitution (labeled at top) with each antiserum in the panel. Red points indicate that the reference virus lacked the amino acid substitution, so the predicted impact of mutation is a reduction in titer; blue points indicate that the reference virus shared the substitution, so the predicted impact of mutation is an increase in titer. The number of points for each substitution depends on whether it was inserted into one or both parental viruses (Neth93 and Neth93 Δ130) and on the number of antisera used. A negative observed antigenic impact indicates a change in HI titer in the opposite direction to that predicted. Mean titers used to calculate antigenic and non-antigenic effects of substitutions are shown in S4 Table and as a heat-map in S2 Fig.
Table 2.
Comparison of predicted and observed antigenic impacts of HA1 amino acid substitutions assessed by HI.
Fig 4.
Position of substitutions ΔK130 and K141E on an antigenic map.
Map locations are shown for a representative example from a Bayesian multidimensional scaling model that estimates virus location, antiserum location, reference virus immunogenicity and test virus receptor-binding avidity. Gridlines represent single antigenic units, each equivalent to a two-fold dilution in the HI assay. Viruses are shown as colored circles and antisera are shown as grey points. Viruses are colored in relation to the substitutions ΔK130 and K141E: 130K 141K (red, n = 36), Δ130 141K (yellow, n = 273), Δ130 141E (blue, n = 193). Viruses with other amino acid combinations are colored black (n = 4).
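Because one antigenic unit corresponds to a two-fold dilution, a difference between two HI titers can be expressed in antigenic units as the base-2 logarithm of their ratio. The helper below is an illustrative sketch only; the function name and the example titers are hypothetical, and it does not reproduce the Bayesian multidimensional scaling model itself.

```python
import math

def titer_difference_in_units(reference_titer, test_titer):
    """Express the fold-difference between two HI titers in antigenic units.

    One antigenic unit equals one two-fold dilution, so the difference is
    the base-2 logarithm of the titer ratio.
    """
    return math.log2(reference_titer / test_titer)

# Hypothetical example: a titer falling from 1280 to 160 is an 8-fold drop.
units = titer_difference_in_units(1280, 160)  # 3.0 antigenic units
```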
Fig 5.
Sequence-based prediction of antigenic phenotype.
Observed and predicted HI titers plotted on a log2 scale (antigenic units) using representative models trained with data for 90% of the pairs of virus and antiserum. Predictive models contained terms for A) Average titers for each reference virus, B) Antigenic cluster-defining substitutions ΔK130 and K141E, C) All 18 antigenic substitutions shown in Table 1, D) All 18 antigenic substitutions shown in Table 1 with additional terms that estimate differences in test virus receptor-binding avidity (non-antigenic variation in titer associated with each virus). Each model was fitted to the same training dataset comprising 90% of all pairs of virus and antiserum and predictions for the remaining data are shown. Incremental improvements in mean absolute prediction error are shown alongside SEM and 95% upper limits in S5 Table.
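The hold-out evaluation described in this legend can be sketched as follows. The example holds out 10% of virus and antiserum pairs, fits only the simplest kind of baseline (a mean log2 titer per reference virus, in the spirit of model A), and reports the mean absolute prediction error on the held-out pairs. All function names, the random seed, and the titer values are hypothetical; the substitution and avidity terms of models B to D are not implemented here.

```python
import random
from collections import defaultdict

def split_pairs(pairs, test_fraction=0.1, seed=0):
    """Shuffle (virus, antiserum) pairs and hold out a fraction for testing."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_reference_means(train_pairs, log_titers):
    """Baseline model: mean log2 titer for each reference virus (antiserum)."""
    by_reference = defaultdict(list)
    for virus, reference in train_pairs:
        by_reference[reference].append(log_titers[(virus, reference)])
    return {ref: sum(v) / len(v) for ref, v in by_reference.items()}

def mean_absolute_error(test_pairs, log_titers, reference_means, fallback):
    """Mean absolute difference between observed and predicted log2 titers."""
    errors = [
        abs(log_titers[(virus, ref)] - reference_means.get(ref, fallback))
        for virus, ref in test_pairs
    ]
    return sum(errors) / len(errors)

# Hypothetical log2 titers for 20 (test virus, reference virus) pairs.
log_titers = {
    (f"virus_{i}", f"ref_{i % 2}"): 7.0 + (i % 4)
    for i in range(20)
}
train_pairs, test_pairs = split_pairs(log_titers.keys())
reference_means = fit_reference_means(train_pairs, log_titers)
# Fallback prediction for a reference virus absent from the training set.
overall_mean = sum(log_titers[p] for p in train_pairs) / len(train_pairs)
mae = mean_absolute_error(test_pairs, log_titers, reference_means, overall_mean)
```

Keeping the split fixed across models, as the legend describes, means that differences in error reflect the model terms rather than the choice of held-out pairs.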
Fig 6.
Prediction error through time for models used to predict HI titers of viruses isolated in the following year.
The mean absolute difference between observed titers for viruses isolated in a given year and titers predicted using models trained on HI data collected in previous years is shown. Predictive models included terms for cluster-defining substitutions ΔK130 and K141E only (solid blue line) or for all 18 substitutions in Table 1 (solid red line). For each model, shaded areas show the lower 95% credible interval on the absolute prediction error. In each year the blue 95% credible interval extends vertically on the y-axis above the red 95% credible interval. Mean absolute prediction errors averaged across the twelve years are shown as dashed lines.
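The forward-in-time evaluation can be sketched as a loop over years in which the model is refitted to all earlier data and then scored on the viruses of the current year. The record layout and the `fit` and `predict` callables below are hypothetical placeholders, and the toy model (a single mean titer) stands in for the substitution-based models compared in the figure.

```python
def rolling_year_errors(records, fit, predict):
    """Mean absolute prediction error per year, training only on earlier years.

    records: list of (year, features, log2_titer) tuples.
    fit: callable taking training records and returning a model.
    predict: callable taking (model, features) and returning a log2 titer.
    """
    years = sorted({year for year, _, _ in records})
    errors = {}
    for year in years[1:]:  # the first year has no earlier data to train on
        train = [r for r in records if r[0] < year]
        test = [r for r in records if r[0] == year]
        model = fit(train)
        abs_errors = [abs(titer - predict(model, feats)) for _, feats, titer in test]
        errors[year] = sum(abs_errors) / len(abs_errors)
    return errors

# Toy stand-in model: predict the mean log2 titer of the training data.
def fit_mean(train):
    return sum(titer for _, _, titer in train) / len(train)

def predict_mean(model, features):
    return model

# Hypothetical records: (isolation year, features, log2 titer).
records = [
    (1998, None, 8.0),
    (1998, None, 10.0),
    (1999, None, 9.0),
    (2000, None, 7.0),
]
errors_by_year = rolling_year_errors(records, fit_mean, predict_mean)
# errors_by_year == {1999: 0.0, 2000: 2.0}
```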