Antigenic evolution of viruses in host populations

Igor M. Rouzine^{1,2}* (ORCID: http://orcid.org/0000-0002-0744-6705), Ganna Rozhnova^{2,3,4}

^{1} Sorbonne Université, Institute de Biologie Paris-Seine, Laboratoire de Biologie Computationelle et Quantitative, LCQB, F-75004 Paris, France
^{2} Institute of Theoretical Physics, University of Cologne, Germany
^{3} BioISI – Biosystems and Integrative Sciences Institute, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
^{4} Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands

Editor: Marco Vignuzzi, Institut Pasteur, FRANCE

Journal: PLOS Pathogens (PLoS Pathog), ISSN 1553-7366, 1553-7374, Public Library of Science, San Francisco, CA USA
Manuscript: PPATHOGENS-D-18-00995
DOI: 10.1371/journal.ppat.1007291
Article type: Research Article
Short title: Antigenic evolution
Subject areas: Viral evolution; Evolutionary immunology; Evolutionary genetics; Influenza; Traveling waves; Genetic epidemiology; Natural selection; Influenza A virus
Author contributions: Igor M. Rouzine: Conceptualization, Formal analysis, Investigation, Methodology, Validation, Writing – original draft, Writing – review & editing. Ganna Rozhnova: Conceptualization, Investigation, Methodology, Software, Visualization, Writing – original draft.
The authors have declared that no competing interests exist.
* E-mail: igor.rouzine@upmc.fr

Received 16 May 2018; accepted 23 August 2018; published 12 September 2018. PLoS Pathog 14(9): e1007291.

Copyright: 2018 Rouzine, Rozhnova. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract
To escape immune recognition in previously infected hosts, viruses evolve genetically in immunologically important regions. The host’s immune system responds by generating new memory cells recognizing the mutated viral strains. Despite recent advances in data collection and analysis, it remains conceptually unclear how epidemiology, immune response, and evolutionary factors interact to produce the observed speed of evolution and the incidence of infection. Here we establish a general and simple relationship between long-term cross-immunity, genetic diversity, speed of evolution, and incidence. We develop an analytic method fusing the standard epidemiological susceptible-infected-recovered approach and the modern virus evolution theory. The model includes the factors of strain selection due to immune memory cells, random genetic drift, and clonal interference effects. We predict that the distribution of recovered individuals in memory serotypes creates a moving fitness landscape for the circulating strains which drives antigenic escape. The fitness slope (effective selection coefficient) is proportional to the reproductive number in the absence of immunity R_{0} and inversely proportional to the cross-immunity distance a, defined as the genetic distance of a virus strain from a previously infecting strain conferring 50% decrease in infection probability. Analysis predicts that the evolution rate increases linearly with the fitness slope and logarithmically with the genomic mutation rate and the host population size. Fitting our analytic model to data obtained for influenza A H3N2 and H1N1, we predict the annual infection incidence within a previously estimated range, (4 − 7)%, and the antigenic mutation rate of U_{b} = (5 − 8) ⋅ 10^{−4} per transmission event per genome. Our prediction of the cross-immunity distance of a = (14 − 15) amino acid substitutions agrees with independent data for equine influenza.
Author summary
Spread of many RNA viruses in a population represents a competition between host immune responses and viral evolution. RNA viruses accumulate mutations in immunologically important regions to escape immune recognition in hosts previously exposed to infection, while the immune system responds by producing new memory cells. Despite recent advances in data collection and analysis, it remains conceptually unclear how epidemiology, immune response, and evolutionary factors interact to produce the observed speed of evolution and the incidence of infection. By combining the standard epidemiological approach with the modern theory of viral evolution, we predict a general relationship between long-term cross-immunity, antigenic diversity of the virus, its evolution speed, infection incidence, and the time to the most recent common ancestor. We apply these theoretical findings to available data on influenza virus to determine two important parameters of its evolution and confirm the model. Current strategies of vaccination against influenza should take into account the stochastic fluctuations in the fitness effect of mutations predicted by the theory.
Funding: This work has been partly supported by Deutsche Forschungsgemeinschaft grant SFB 680/C2 to Michael Lässig, http://www.dfg.de/, and Agence Nationale de Recherche grant J16R389 to IMR, http://www.agence-nationale-recherche.fr/. The funding agencies had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability: All data in the work are from published studies cited in the text.

Publication update: 2018-10-05

Introduction
Spread of many RNA viruses occurs as a race between host immune responses and rapid viral evolution. The development of treatment and effective preventive measures such as vaccines and therapeutic interference particles [1–3] requires understanding of the mechanics of viral evolution at the scale of a population. To evade immune recognition by hosts previously exposed to infection, in a never-ending chase, viruses accumulate mutations in immunologically relevant regions of the genome [4]. Despite advances in the collection and analysis of epidemiological and genomic data, it remains conceptually unclear how epidemiology, immune response, and evolutionary factors interact to produce the observed evolution speed and the incidence of infection.
Influenza virus infects 5-15% of the world population. The global spread and reinfection of the same individuals is caused by rapid evolution of antibody-binding regions [4]. A large amount of information has been obtained, including world-wide circulation [5–7], genetic maps of virus variants and antibodies, molecular mechanisms, and fitness effect of specific mutations [4, 8–10]. Vigorous data analysis and computer simulation helped to understand many features of influenza virus evolution [7, 11–15]. In particular, recent work [15] offers an inference model to predict short-term evolution of influenza, which is helpful for optimization of vaccination strategy. However, a more general connection between the population-scale parameters of a virus and its evolutionary behavior is still lacking.
The aim of this work is to establish general and simple relationships for the speed of virus evolution, genetic diversity, and annual incidence in terms of population parameters, and to fit them to the available data for influenza virus. We propose a general analytic approach combining a susceptible-infected-recovered (SIR) framework [11, 16] with the stochastic evolution theory [17–25]. Using the experimental observation that phylogenetic trees of influenza virus have a vine-like structure with short branches [4], we focus on virus evolution along the one-dimensional trunk. Analysis demonstrates that the evolution under immune memory occurs in the form of a traveling wave in antigenic space, with the fitness landscape moving together with the wave. The fitness slope (effective selection coefficient) can be expressed in terms of the cross-immunity distance.
We provide analytic predictions for the speed, incidence, and the average time to most recent common ancestor in terms of population parameters, including reproduction number, population size, and cross-immunity distance. Then we discuss how the punctuated nature of influenza evolution alternating small-effect and large mutations [4, 14] may be interpreted from the stochasticity of evolution.
Model and methods
Strain-structured epidemiological model
We start by describing briefly our model and approach. The details are given in S1 Appendix. Standard models of evolution focus on the dynamics of virus strains (variants), while standard epidemiological models study the transmission of a virus in a host population. For viruses that evolve to evade immune memory of previously infected hosts, evolutionary and epidemiological dynamics are tightly coupled [26]. Here we adopt a strain-based formulation of epidemiological models, in which all individuals are infected or recovered. Recovered individuals are classified according to their current ability to respond to various viral strains which represent genetic variants of an antibody-binding region of the virus (e.g., hemagglutinin gene for influenza virus). Each infected individual is assumed to be infected with a single strain denoted by x. We measure the “antigenic coordinate” x as genetic distance in terms of non-synonymous nucleotide substitutions. Infection by a viral strain is cleared in several days or weeks, leaving in the recovered individual immunological memory that provides full protection against the same strain and partial protection against infection by genetically close strains. We assume one-dimensional space, x, that represents the trunk of the phylogenetic tree. For each recovered individual, we keep track only of the memory of the most recent infection [11, 12]. In S1 Appendix, Section 1.3.3, we show that this approximation has a modest effect on the final results.
Let i(x, t) denote the density of individuals currently infected with strain x, and r(x, t) be the density of individuals whose last infection was with strain x and who then recovered. The model is represented by a system of differential equations that describe the dynamics of the distributions i(x, t) and r(x, t):
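$$
\begin{aligned}
\frac{dr(x,t)}{dt} &= -r(x,t)\,R_0\int_{x}^{\infty} K(x-y)\,i(y,t)\,dy + i(x,t),\\
\frac{di(x,t)}{dt} &= i(x,t)\left[R_0\int_{-\infty}^{x} K(y-x)\,r(y,t)\,dy - 1\right] + (\text{mutation term})
\end{aligned}
\tag{1}
$$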
We assume that each individual is either infected or recovered, as given by the normalization condition
$$
\int_{-\infty}^{+\infty}\left[r(x,t)+i(x,t)\right]dx = 1.
$$
The treatment of mutations, which are assumed to be rare, will be described below in subsection Mutation.
Eq 1 describes the following epidemiological processes. Firstly, individuals recovered from strain x can be infected with strain y, and their susceptibility is proportional to the cross-immunity function K(x − y), which depends on the genetic distance between strains x and y, so that K(x − y) > 0 for y > x; K(x − y) ≡ 0 for y < x; and K(−∞) = 1.
Here we assume that individuals recovered from strain x can be infected only by strains y ahead of x in time, y > x, so that K(u) is zero when its argument u is zero or positive (Fig 1, blue curve). In fact, there is a narrow region at the leading edge, where the backward infection could be possible. However, since the edge region is very narrow in the parameter range of interest, this process has a minor effect on the results (see the details in S1 Appendix, Section 1.3.2).
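As an illustration of these conditions, the sketch below codes the specific cross-immunity function listed in Table 1, K(u) = |u|/(a + |u|) for u < 0 and K(u) = 0 otherwise. The function name and argument checks are choices of this sketch, not part of the original model description.

```python
def cross_immunity(u: float, a: float) -> float:
    """Susceptibility K(u) = |u| / (a + |u|) for u < 0, and 0 for u >= 0.

    u is the antigenic coordinate of the memory strain minus that of the
    challenging strain, so u < 0 means the challenger is ahead of the memory.
    """
    if a <= 0:
        raise ValueError("cross-immunity distance a must be positive")
    if u >= 0:
        return 0.0  # full protection against the same or older strains
    distance = -u
    return distance / (a + distance)


if __name__ == "__main__":
    a = 9.0  # example value used in Fig 1
    for u in (0.0, -1.0, -a, -10.0 * a):
        print(f"u = {u:7.1f}   K(u) = {cross_immunity(u, a):.3f}")
```

At a genetic distance equal to a the function returns 0.5, which is the defining property of the cross-immunity distance.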
Fig 1. One-dimensional epidemiological model predicts a steady traveling wave along fitness axis. (doi: 10.1371/journal.ppat.1007291.g001)
A) Frequencies of recovered individuals (black curve) and the infected individuals (red histogram) in population in the reference frame moving with the wave. Here X axis plots the antigenic coordinate in that reference frame, u = x − ct. Black solid line shows analytic prediction for r(u) (Eq 3). Histograms show the result of a full stochastic simulation of the epidemiological model, Eq 1. Blue line is cross-immunity function K(u) (Table 1). Parameters (example): R_{0} = 2, a = 9, U_{b} = 5.8 × 10^{−6}, N = 10^{8}. Units of the values on the axes are given in Table 1 and Eq 1. A wave in the rest frame of reference is shown in S2B Fig.
Secondly, infected individuals with the density i(x, t) recover. Thirdly, individuals infected with a strain x may produce a mutant strain x′ with a small probability, as explained below (Mutation). We measure time in the units of infectious period, t_{rec}, so that recovery rate is 1, and transmission rate equals the basic reproduction number, R_{0}, defined as the reproduction number in a population of previously uninfected individuals.
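Because rates are measured per infectious period, comparison with annual data requires a unit conversion. A minimal sketch, assuming a 365-day year (the helper name is hypothetical):

```python
DAYS_PER_YEAR = 365.0


def per_period_to_per_year(rate_per_period: float, t_rec_days: float) -> float:
    """Convert a rate per infectious period into a rate per year."""
    if t_rec_days <= 0:
        raise ValueError("t_rec_days must be positive")
    return rate_per_period * DAYS_PER_YEAR / t_rec_days


if __name__ == "__main__":
    t_rec = 5.0  # days, as in Table 1
    for label, rate in (("substitution rate c", 0.036), ("mutation rate U_b", 5e-4)):
        print(f"{label}: {rate} per t_rec = {per_period_to_per_year(rate, t_rec):.4f} per year")
```

With t_{rec} = 5 days, one year contains 73 infectious periods, so the paired entries in Table 1 (for example 0.036 per t_{rec} and 2.6 per year) differ by that factor.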
Mutation
So far we have considered only dynamics of strains x which already exist. What drives the antigenic evolution is the emergence of new viral strains. Each strain x occasionally and accidentally undergoes a mutation event which changes its ability to be recognized by antibodies (phenotype). We describe this as a variable change in its antigenic coordinate Δx > 0. The new influenza strain with a new antigenic coordinate, x + Δx, is either cleared from the individual or (with small probability) transmitted to another person. The model parameters describing random mutations are the average rate U_{b} per genome per infectious period (Table 1) and the distribution of Δx. The actual distribution may be quite complex [27]; here, we consider a class of exponential distributions [23]. Specifically, we assume that with each mutation, the value of Δx is drawn randomly with the following probability density
$$
\rho(\Delta x) = \frac{e^{-(\Delta x)^{\beta}}}{\Gamma(1+1/\beta)}, \tag{2}
$$
where β is a fixed parameter.
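One way to draw Δx from this density is sketched below. It relies on the change of variables Y = (Δx)^{β}, under which Y follows a Gamma distribution with shape 1/β and unit scale. This sampling route is an assumption of the sketch and is not necessarily the method used in the original simulations.

```python
import math
import random


def sample_jump(beta: float, rng: random.Random) -> float:
    """Draw dx > 0 with density exp(-dx**beta) / Gamma(1 + 1/beta)."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    y = rng.gammavariate(1.0 / beta, 1.0)  # Y ~ Gamma(shape=1/beta, scale=1)
    return y ** (1.0 / beta)


def mean_jump(beta: float) -> float:
    """Exact mean of the density: Gamma(2/beta) / Gamma(1/beta)."""
    return math.gamma(2.0 / beta) / math.gamma(1.0 / beta)


if __name__ == "__main__":
    rng = random.Random(0)
    beta = 2.0  # half-Gaussian case used for data fitting
    draws = [sample_jump(beta, rng) for _ in range(20000)]
    print("sample mean:", sum(draws) / len(draws))
    print("exact mean: ", mean_jump(beta))
```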
Table 1. Model parameters: Input (upper rows) and output (lower rows). (doi: 10.1371/journal.ppat.1007291.t001)

Notation                    Name                           Unit                H3N2                        H1N1
R_{0}                       Basic reproduction number      1^{b}               1.8^{a}                     1.46^{a}
t_{rec}                     Recovery time                  day                 5^{a}                       5^{a}
U_{b}                       Mutation rate per genome       1/t_{rec} | 1/yr    5 × 10^{−4} | 0.036^{c}     8 × 10^{−4} | 0.058^{c}
a                           Cross-immunity distance        AA                  15^{c}                      14^{c}
K(u)                        Cross-immunity function        1                   |u|/(a + |u|)               |u|/(a + |u|)
N                           Population size                1                   10^{8}                      10^{8}
β                           Mutation distrib. parameter    1                   2                           2
σ                           Average selection coefficient  1                   0.048^{d}                   0.028^{d}
365 N_{inf}/(t_{rec} N)     Annual incidence               1/yr                0.07^{d}                    0.04^{d}
c                           Substitution rate              1/t_{rec} | 1/yr    0.036 | 2.6^{a}             0.031 | 2.26^{a}
T_{MRCA2}                   Pairwise coalescent time       yr                  3.03^{a}                    4.59^{a}

^{a} Known from published data for influenza A strains H3N2 and H1N1 [7, 13, 30, 31].
^{b} Unit “1” stands for “dimensionless”.
^{c} Input parameter of the model which was adjusted to fit published data.
^{d} A value predicted for the best-fit parameter set.
Genetic drift
Below in Results, we introduce the critically important factor of random genetic drift [28, 29] by allowing the number of new infections to vary randomly among the sources of transmission. The model parameters and their estimates used in the analysis are summarized in Table 1.
Results
The model described in the previous section establishes a general analytic relationship between immunological, epidemiological, and evolutionary properties of a virus causing non-chronic infection. Below we use it to predict the evolution speed, the incidence of influenza in a population, and the time to the most recent common ancestor. Then, we test the analytic results against stochastic simulation and compare them to available data on influenza A strains H3N2 and H1N1.
Recovered individuals and the traveling wave
Below we analyze epidemiological dynamics in two steps. First, we assume that, in the realistic parameter range, a ≫ 1, the frequency of infected individuals, i(x, t) represents a solitary peak, much more narrow in genetic distance x than the frequency of recovered individuals, r(x, t). Using this fact, we find analytically the form of r(x, t). Second, we apply the well-developed theory of asexual evolution [18–21, 23] to obtain parameters of the distribution of infected individuals i(x, t). Details are given in S1 Appendix; here we present the main steps of the derivation.
We start our analytic derivation by noting that, in the limit of small mutation rates, the main role of mutation is to form new strains with antigenic coordinate x larger than for already existing strains. For already existing strains, mutation is negligible. This assumption is intuitively clear and is verified in the relevant parameter range, using estimates of mutation rate U_{b} (Table 1).
Neglecting the mutation term in Eq 1, we seek a traveling wave solution of the form r(x, t) = r(x − ct) and i(x, t) = i(x − ct), where u ≡ x − ct is the relative antigenic coordinate of a strain and c = d〈x〉/dt is the wave speed, defined as the average number of non-synonymous nucleotide substitutions per unit time. Without loss of generality, we choose the peak of the infected wave i(u) to be at u = 0, [di(u)/du]_{u=0} = 0. The traveling wave solution of Eq 1 for infected and recovered individuals then reads
$$
i(u) = A c f(u), \qquad
r(u) \approx
\begin{cases}
A\exp\left[-A R_0\int_{u}^{0} K(v)\,dv\right], & u<0,\\
0, & u>0,
\end{cases}
\tag{3}
$$
where A is a constant found from the normalization condition $\int_{-\infty}^{+\infty}[r(u)+i(u)]\,du = 1$, and f(u) is a narrow peak with unit area and a width much less than the width of the recovered distribution, r(u). The wave speed c and the shape of the infected density f(u) are to be determined later on.
At large R_{0}, K(v) in Eq 3 can be expanded linearly near zero, so that density of the recovered becomes a half of a Gaussian
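$$
r(u) \approx
\begin{cases}
\dfrac{2R_0}{\pi a}\exp\left[-\left(\dfrac{R_0 u}{a\sqrt{\pi}}\right)^{2}\right], & u<0,\\
0, & u>0
\end{cases}
\tag{4}
$$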
and A = 2R_{0}/(πa). The fraction of infected individuals in population
$$
\frac{N_{inf}}{N} = \int_{-\infty}^{\infty} i(u)\,du = A c = \frac{2R_0 c}{\pi a} \tag{5}
$$
is assumed to be much smaller than 1. Then the annual incidence of infection is expressed in terms of cross-immunity distance, evolution speed, and basic reproduction number as
$$
\text{Annual incidence} = \frac{2R_0 c}{\pi a}\,\frac{365}{t_{rec}}, \tag{6}
$$
which is a directly testable prediction.
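The arithmetic of Eqs 5 and 6 can be sketched as follows. The functions take the amplitude A as an argument, so that either the large-R_{0} value A = 2R_{0}/(πa) or a value obtained by normalizing Eq 3 numerically can be supplied. The function names and the value of c in the example are hypothetical. For R_{0} of order 1 to 2, the large-R_{0} amplitude may be a rough approximation, so the output should not be expected to reproduce Table 1.

```python
import math


def wave_amplitude_large_R0(R0: float, a: float) -> float:
    """Amplitude A = 2*R0/(pi*a) of the half-Gaussian recovered density (Eq 4)."""
    return 2.0 * R0 / (math.pi * a)


def infected_fraction(c: float, A: float) -> float:
    """N_inf/N = A*c (Eq 5), with c measured per infectious period."""
    return A * c


def annual_incidence(c: float, A: float, t_rec_days: float) -> float:
    """Fraction of the population infected per year (Eq 6)."""
    return infected_fraction(c, A) * 365.0 / t_rec_days


if __name__ == "__main__":
    R0, a = 2.0, 9.0   # example parameters of Fig 1
    c = 0.03           # substitutions per infectious period (illustrative value)
    A = wave_amplitude_large_R0(R0, a)
    print("A =", A)
    print("infected fraction =", infected_fraction(c, A))
    print("annual incidence  =", annual_incidence(c, A, t_rec_days=5.0))
```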
The analytic solution, Eqs 3 and 4, is based on the assumption that the infected wave i(u) is much narrower than the recovered wave r(u). To verify the validity of this approximation, we compare Eq 3 with Monte-Carlo simulation based on Eq 1. The simulation confirms the existence of a steady traveling wave with two linked components moving to the right in antigenic coordinate (Fig 1). The infected wave i(u) is, indeed, a narrow peak. The time-averaged solution for recovered individuals obtained from simulation agrees fairly well with the analytic prediction (black line). The recovered wave r(u) displays a sharp increase near the maximum of i(u) and a slowly decaying tail at u < 0. The sharp increase is due to continuous recovery of infected individuals. The decaying tail is caused by reinfection of recovered individuals once they become genetically remote from the moving front of wave r(u). This derivation captures only the shape of the recovered peak, leaving the narrow infected peak undefined.
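The step from the general recovered density to its half-Gaussian form at large R_{0} can be spelled out as follows. This is a sketch that assumes the linearized cross-immunity function K(v) ≈ |v|/a with a ≡ 1/|K′(0)| and neglects the small infected fraction in the normalization.

```latex
\begin{aligned}
K(v) &\approx \frac{|v|}{a}, \qquad |v| \ll a, \\
\int_{u}^{0} K(v)\,dv &\approx \frac{u^{2}}{2a}, \qquad u<0, \\
r(u) &\approx A\exp\left(-\frac{A R_0 u^{2}}{2a}\right), \\
1 &\approx \int_{-\infty}^{0} r(u)\,du
   = \frac{A}{2}\sqrt{\frac{2\pi a}{A R_0}}
   \quad\Longrightarrow\quad A = \frac{2R_0}{\pi a}, \\
r(u) &\approx \frac{2R_0}{\pi a}
   \exp\left[-\left(\frac{R_0 u}{a\sqrt{\pi}}\right)^{2}\right], \qquad u<0 .
\end{aligned}
```

The resulting width of r(u) is of order a√π/R_{0}, which is small compared with a only when R_{0} is large. This is the condition under which the linear expansion of K(v) is self-consistent.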
Moving fitness landscape
In order to determine the infected individual distribution, i(u), we use standard traveling wave theory [18–23]. The interesting feature of the selection due to immune escape is that the fitness landscape which controls the traveling wave travels with the wave. Moreover, it is the wave itself which creates its own landscape, as follows: the recovered create a landscape for the infected evolution, which moves the recovered distribution forward in x, and so on.
To derive the form of the landscape at the level of the human population, we use the standard definition of viral fitness as the average number of secondary infections caused by an infected individual [28, 32–34]. (This reproductive number must not be confused with the basic reproductive number R_{0}, which is its maximum value, i.e., the value in a totally susceptible population.) Here we choose to define fitness w(x, t) as the reproductive number minus 1, i.e., the exponential expansion rate of the density of infected individuals i(x, t) measured per infectious period:
$$
w(x,t) = \frac{\partial \ln i(x,t)}{\partial t} = R_0\int_{-\infty}^{x} K(y-x)\,r(y,t)\,dy - 1. \tag{7}
$$
The form of w(u) obtained from Eqs 7 and 3 is shown in Fig 2 (red line).
Fig 2. Traveling fitness landscape and its linear approximation near the infected peak. (doi: 10.1371/journal.ppat.1007291.g002)
Red curve: analytic result (Eq 7). Gray circles: Monte-Carlo simulation based on Eq 1. Black line: linear approximation with the average selection coefficient σ = 0.066 (Eq 8). Parameters as in Fig 1: R_{0} = 2, a = 9, U_{b} = 5.8 × 10^{−6}, N = 10^{8}. For the accuracy of linear approximation, see S1 Fig.
The asymptotic cases of the fitness landscape w(u) are
$$
w(u) \approx
\begin{cases}
R_0 - 1, & u \gg a,\\
\sigma u, & |u| \ll a,\\
-1, & u<0,\ |u| \gg a,
\end{cases}
\tag{8}
$$
where
$$
\sigma = -R_0\int_{-\infty}^{0}\frac{dK}{du}\,r(u)\,du \tag{9}
$$
has the meaning of the fitness landscape slope, or the average selection coefficient. According to Eq 8, w(u) is positive for u > 0 and negative for u < 0, indicating that viruses are selected for in front of the infected peak and selected against in the wake of the wave. For large positive or negative u, |u| ≫ a, we predict saturation of w(u). At u = 0, w(0) = 0, which is equivalent to the fact that the actual reproduction number is exactly 1 at the peak of the wave. Within the range |u| ≪ a, where the narrow peak of the infected individuals is located, the fitness landscape can be expanded linearly with slope σ > 0, which represents the average selection coefficient of a mutation event. For sufficiently large R_{0}, from Eqs 4 and 9, σ can be approximated by a series in 1/R_{0}
$$
\sigma(a, R_0) = \frac{1}{a}\left[R_0 - 2 + \frac{3\pi}{2R_0} + O\left(\frac{1}{R_0^{2}}\right)\right], \tag{10}
$$
where a ≡ 1/|K′(0)|, and the second and third terms are small corrections to the first term. Thus, the average selection coefficient σ of the traveling fitness landscape is inversely proportional to the cross-immunity distance a. It also increases linearly with the basic reproduction ratio R_{0} when R_{0} is large. The two correction terms in Eq 10 depend on the form of the cross-immunity function in Table 1. For an alternative form K(x) = 1 − exp(−x/a), they are smaller by factors of 2 and 6, respectively. The overall agreement for the entire landscape w(u) between the analytic prediction and simulation is quite good (Fig 2).
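Away from the large-R_{0} limit, the slope σ can be evaluated numerically from Eqs 3 and 9. The sketch below does this for the cross-immunity function of Table 1, K(u) = |u|/(a + |u|), under several assumptions of its own: the amplitude A is fixed by requiring that the recovered density integrates to 1 (the small infected fraction is neglected), integrals are computed with a composite Simpson rule, and the closed form ∫_{0}^{s} v/(a + v) dv = s − a log(1 + s/a) is used. It is not the authors' code.

```python
import math


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


def recovered_density(s, A, R0, a):
    """Eq 3 at distance s = |u| >= 0 behind the infected peak."""
    return A * math.exp(-A * R0 * (s - a * math.log1p(s / a)))


def tail_cutoff(A, R0):
    """Distance beyond which r is negligible (it decays roughly like exp(-A*R0*s))."""
    return 60.0 / (A * R0)


def solve_amplitude(R0, a, iterations=50):
    """Bisection on A so that the recovered density integrates to 1 (needs R0 > 1)."""
    def mass(A):
        return simpson(lambda s: recovered_density(s, A, R0, a), 0.0, tail_cutoff(A, R0))

    lo, hi = 1e-3 / a, R0 / a  # bracket: mass(lo) < 1 < mass(hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mass(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fitness_slope(R0, a):
    """sigma from Eq 9, using -dK/du = a/(a + |u|)**2 for u < 0."""
    A = solve_amplitude(R0, a)
    integrand = lambda s: a / (a + s) ** 2 * recovered_density(s, A, R0, a)
    return R0 * simpson(integrand, 0.0, tail_cutoff(A, R0))


def fitness_slope_series(R0, a):
    """Large-R0 series of Eq 10, truncated after the 1/R0 term."""
    return (R0 - 2.0 + 1.5 * math.pi / R0) / a


if __name__ == "__main__":
    R0, a = 2.0, 9.0  # parameters of Figs 1 and 2
    print("A      =", solve_amplitude(R0, a))
    print("sigma  =", fitness_slope(R0, a))        # compare with the value quoted in Fig 2
    print("series =", fitness_slope_series(R0, a)) # only expected to hold at large R0
```

For the parameters of Fig 2 the numerical slope can be compared with σ = 0.066 quoted in the caption. The truncated series is an asymptotic statement for large R_{0} and need not be close at R_{0} = 2.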
Fig 3. Analytic results for the evolution speed are confirmed by stochastic simulation. (doi: 10.1371/journal.ppat.1007291.g003)
Simulation is performed at fixed parameters R_{0} = 2, a = 9; U_{b} and β as shown. Solid and dashed lines are analytic results for the wave speed, c (Eq 6, S14, S16-S18) at two values of mutation rate U_{b} which define the broadest range of interest for RNA viruses, and two values of parameter β to test sensitivity to the density of selection coefficient distribution. Symbols show results obtained either by full stochastic simulation of the SIR model (Eq 1) or by a reduced simulation with σ = 0.066 (S1 Appendix).
Antigenic diversity and the speed of evolution
We get further insight into the dynamics of the model by predicting the speed of viral evolution c. So far, we have left this value undetermined because it weakly affects the shape of the density of recovered individuals r(x, t), Eq 3. In contrast, the density of infected individuals i(u), which is much more narrow, needs to be determined simultaneously with c. Our result for the average selection coefficient σ, Eq 10, reduces the problem of epidemiological evolution to models of asexual populations with many diverse sites where the speed was derived previously in terms of population size, selection coefficient and mutation rate ([18–23]). We consider a case with randomly distributed selection coefficient s = σΔx, where mutational distance Δx is sampled from distribution in Eq 2 with large parameter β.
This section contains the central result of our analysis: Antigenic diversity Var[x] = < (Δx)^{2} > and adaptation rate v defined as the average rate of fitness increase (“fitness flux”) depend on crossimmunity range a and other parameters [23]
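$$
\mathrm{Var}[x] = \frac{2\log(N_{inf}\,\sigma)}{\log(\beta\sigma/U_b)} \tag{11}
$$
$$
v = \sigma^{2}\,\mathrm{Var}[x] \tag{12}
$$
Another measure of evolution rate is the average substitution rate c
$$
c = (\sigma^{2}/s^{*})\,\mathrm{Var}[x] \tag{13}
$$
$$
s^{*} = \sigma\left[\frac{2}{\beta}\log\frac{\sigma}{U_b}\right]^{\frac{1}{\beta-1}} \tag{14}
$$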
where s* represents the most probable fitness gain of a mutation established in a population [23]. Note that s* is larger than the average selection coefficient σ. The expressions for Var[x] and s* are approximate, within the accuracy of logarithms inside the large logarithms. For more accurate expressions, see S1 Appendix.
To apply these results to our case of antigenic evolution, we substitute average selection coefficient σ from Eq 10 and infected population size N_{inf} from Eq 5. Then the metrics of evolution speed c, v are expressed in terms of a and epidemiological parameters (Table 1). In the limit of very large β, Eqs 11–14 match results of a model with constant selection coefficient σ [18, 20].
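The chain of substitutions described above can be sketched numerically. The code reads Eqs 11–14 as Var[x] = 2 log(N_{inf}σ)/log(βσ/U_{b}), v = σ²Var[x], s* = σ[(2/β) log(σ/U_{b})]^{1/(β−1)} and c = (σ²/s*)Var[x], with natural logarithms; this reading, the function names, and the value of N_{inf} in the example are assumptions of the sketch. Because these are the approximate expressions, the numbers need not coincide with Table 1.

```python
import math


def antigenic_variance(N_inf: float, sigma: float, U_b: float, beta: float) -> float:
    """Var[x] (Eq 11)."""
    return 2.0 * math.log(N_inf * sigma) / math.log(beta * sigma / U_b)


def adaptation_rate(N_inf: float, sigma: float, U_b: float, beta: float) -> float:
    """Fitness flux v = sigma^2 * Var[x] (Eq 12), per infectious period squared."""
    return sigma ** 2 * antigenic_variance(N_inf, sigma, U_b, beta)


def typical_fixed_effect(sigma: float, U_b: float, beta: float) -> float:
    """Most probable fitness gain s* of an established mutation (Eq 14); needs beta > 1."""
    return sigma * ((2.0 / beta) * math.log(sigma / U_b)) ** (1.0 / (beta - 1.0))


def substitution_rate(N_inf: float, sigma: float, U_b: float, beta: float) -> float:
    """Average substitution rate c = v / s* (Eq 13), per infectious period."""
    return adaptation_rate(N_inf, sigma, U_b, beta) / typical_fixed_effect(sigma, U_b, beta)


if __name__ == "__main__":
    sigma, U_b, beta = 0.048, 5e-4, 2.0  # H3N2 column of Table 1
    N_inf = 1e5                          # illustrative infected population size
    print("Var[x] =", antigenic_variance(N_inf, sigma, U_b, beta))
    print("v      =", adaptation_rate(N_inf, sigma, U_b, beta))
    print("s*     =", typical_fixed_effect(sigma, U_b, beta))
    print("c      =", substitution_rate(N_inf, sigma, U_b, beta))
```

The logarithmic dependence on N_{inf} and U_{b} is visible directly: increasing N_{inf} tenfold changes Var[x], and hence c, only by an additive term of order log 10 in the numerator.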
We verified analytic results for wave speed c by Monte-Carlo simulation in a wide range of N and U_{b} (Fig 3). We used two methods: full simulation of initial Eq 1 with randomly distributed mutational effects, and a reduced Moran algorithm with linearized fitness landscape (symbols in Fig 3). We observe that our analytic prediction of a logarithmic increase of c with N and U_{b} follows simulation quite well, except at smallest U_{b} and N explored in our study. Logarithmic dependencies are characteristic for asexual evolution models ([18–23, 35, 36]). Abbreviations IS, CI, MM near symbols indicate different regimes regarding the number of genomic sites evolving within the same time frame: selection sweeps at isolated sites (IS), pairwise clonal interference (CI) [23, 35, 36], and multiple-mutation regime (MM) [18–21, 23]. The traveling wave models are designed for MM regime, which explains the discrepancy at smallest U_{b} and N. We also observe that the steepness of the selection coefficient distribution, β, weakly affects the predicted speed.
Our analysis predicts that substitution rate of antigenic mutations c, Eq 13, is inversely proportional to the cross-immunity distance a and increases logarithmically with host population size and mutation rate. The average selection coefficient at the population level, σ, is also inversely proportional to a, Eq 10. An alternative measure of the evolution speed, the adaptation rate v, Eq 12, is inversely proportional to a^{2}. The annual incidence of infection, Eq 6 also scales as 1/a^{2}.
Time to the most recent common ancestor
Taking advantage of recent theoretical progress in asexual phylogeny [24, 25, 38], we also calculated an important observable quantity, the time to the most recent common ancestor of two co-existing viruses (S1 Appendix, Eqs S20-S21).
$$
T_{MRCA2} = z\,\sqrt{\frac{2\log(N\sigma)}{v}} \tag{15}
$$
Here the numeric factor z depends on the distribution of the mutational effect Δx [24, 25]. The predicted values are z = 1.5 in the case of a fixed mutational effect Δx, and z = 3 in the case of the Gaussian distribution of Δx (Eq 2 with β = 2). Because the Gaussian case is more realistic, and because we are not aware of any results for T_{MRCA2} for other forms of distribution, below we choose the value β = 2 for data fitting.
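A short numerical sketch of the coalescent time follows. It reads Eq 15 as T_{MRCA2} = z√(2 log(Nσ)/v), which has the dimension of time and scales as 1/σ; this reading, the natural logarithm, the helper names, and the value of v in the example are assumptions of the sketch.

```python
import math


def pairwise_coalescent_time(N: float, sigma: float, v: float, z: float = 3.0) -> float:
    """T_MRCA2 in units of the infectious period: z * sqrt(2*log(N*sigma)/v)."""
    if v <= 0 or N * sigma <= 1:
        raise ValueError("need v > 0 and N*sigma > 1")
    return z * math.sqrt(2.0 * math.log(N * sigma) / v)


def periods_to_years(t_periods: float, t_rec_days: float) -> float:
    """Convert a duration in infectious periods into years (365-day year)."""
    return t_periods * t_rec_days / 365.0


if __name__ == "__main__":
    N, sigma = 1e8, 0.048  # H3N2 column of Table 1
    v = 0.0074             # adaptation rate per infectious period squared (illustrative)
    t = pairwise_coalescent_time(N, sigma, v, z=3.0)
    print("T_MRCA2 =", t, "infectious periods")
    print("T_MRCA2 =", periods_to_years(t, t_rec_days=5.0), "years")
```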
Comparison with influenza A data
To test the model, we compared its predictions with available data on influenza A H3N2 and H1N1, as follows. The input parameters of the model and the output (predicted) parameters are summarized in Table 1. The values of input parameters such as population size N, reproduction ratio in the absence of immune recognition R_{0} (during a major pandemic caused by antigenic shift), and recovery time t_{rec} have been measured [7, 13, 30, 31]. In contrast, parameters a and U_{b} result from biological interactions at multiple biological scales (cell, host, population) and are hard to come by. On the other hand, data on two parameters predicted by the model, T_{MRCA2} and substitution rate c, are available. Therefore, we opted to adjust the unknown input parameters a and U_{b} to fit available data for the two predicted parameters (Fig 4A). We assumed a total susceptible population of N = 10^{8} individuals, which corresponds to a large country.
Fig 4. For influenza A virus, the model predicts annual incidence and cross-immunity which agree with observations. (doi: 10.1371/journal.ppat.1007291.g004)
Shown is the best-fit to combined immunological, epidemiological, and evolutionary data available on influenza A strains H3N2 (red and blue colors) and H1N1 (magenta and cyan colors). (A) X and Y-axis are the cross-immunity scale, a, and the mutation rate per genome per transmission event, U_{b}, respectively. Analytic predictions for the evolution speed c (red and magenta curve, Eq 13) and T_{MRCA2} (blue and cyan, Eq 15 with z = 3) are shown as contours of constant heights taken from data [7] (Extended Data Table 1 and refs). Population size is estimated N ∼ 10^{8} [31]. Dashed lines show the intersection points where both parameters fit experimental values. (B) Solid curves: The same three quantities for H3N2 as a function of population number N at the best-fit values of a and U_{b}. Dashed lines correspond to N = 10^{8}. (A and B) Input from data [7, 31]: R_{0} = 1.8, c = 2.6 AA/year, T_{MRCA2} = 3.0 years for H3N2 and R_{0} = 1.46, c = 2.3 AA/year, T_{MRCA2} = 4.6 years for H1N1. Infection cycle time t_{rec} = 5 days. Predicted annual incidence of infection of (4 − 7)% and the cross-immunity scale a = (14 − 15) AA are in good agreement with independent data [37].
It is evident that strain H3N2 has a faster evolution rate and a shorter time T_{MRCA2} than strain H1N1 due to a larger value of R_{0} causing, in turn, a larger average selection coefficient σ. The values of U_{b} and a for the two strains are similar (Fig 4A).
The best-fit values for the cross-immunity distance, a = 14 − 15, agree very well with independent data on equine influenza [37], which represents a direct confirmation of the model. The predicted annual incidence in humans of (4 − 7)% also falls within the experimentally observed range and previous modeling estimates [12, 13, 15]. Interestingly, the model explains the inverse correlation between T_{MRCA2} and evolution rate c reported previously for H3N2, H1N1 and two strains of influenza B [7]. Indeed, the predicted evolution rate c is linearly proportional to the effective selection coefficient σ ∝ R_{0}/a, while T_{MRCA2} is inversely proportional to σ. The dependence of c and T_{MRCA2} on the other parameters, U_{b} and N, is logarithmically slow.
To generalize our results for epidemics occurring on larger or smaller scales, we calculated the dependence of c, T_{MRCA2}, and the annual incidence on population size N (Fig 4B). The sensitivity of our predictions to input parameters U_{b}, a, and R_{0} has also been tested (S1 Appendix, S3 and S4 Figs). Thus, traveling wave theory with modest selection predicts logarithmic dependence of the speed on population size (Fig 4B).
Results are robust to the existence of additional dimensions of antigenic space
Epidemiological data demonstrate that, a priori, antigenic space is not one-dimensional but has fractal nature and fractal dimensionality more than 1 [8, 31]. To demonstrate the weak sensitivity of our model to the existence of additional dimensions, we extended our model to a discrete random tree of epitope variants and solved it numerically (S1 Appendix, S6 Fig). Phylogeny demonstrates quasi-1D behavior comprising a long trunk of permanently fixed mutations and short branches representing transient virus variants and resembling the actual influenza H3N2 phylogeny [4, 12, 13, 15]. We also confirmed the formation of a 1D traveling wave for two-dimensional genetic space (S5 Fig).
Discussion
We investigated the stochastic evolutionary dynamics of a virus driven by the pressure to escape immune recognition in previously infected individuals. We mapped this problem to an evolutionary model with a fitness landscape expressed in terms of the cross-immunity function K(x) (Fig 2). Stochastic evolution occurs as a traveling wave with two population components structured in the antigenic variant space x, the recovered individuals and the currently infected individuals, which differ in width and total count (Fig 1). The recovered distribution is broad and large. The infected distribution forms a small, narrow peak at the front of the recovered distribution. We expressed several observable parameters, including the speed of viral evolution, the annual incidence of infection, and the average time to the most recent common ancestor, in terms of the model parameters N, U_{b}, R_{0}, K(x) (Table 1). The analytic predictions agree with simulation and correctly estimate important parameters of viral evolution in host populations, as we illustrated using genomic data on influenza.
One of the puzzling aspects of influenza virus evolution is its punctuated nature [4]. While most mutations are almost neutral or have a modest phenotypic (fitness) effect, some represent large jumps in antibody recognition [14]. Our results interpret these jumps as a natural consequence of the stochastic nature of traveling wave models. The leading edge of the wave advances by the addition of rare, best available escape alleles. Asexual evolution theory with a variable fitness effect of mutations demonstrates that most fixed mutations have a fitness effect in excess of the average fitness effect [23]. Good et al show that the most likely selection coefficient s* that drives the wave depends on the model parameters σ, N, U_{b}, mapping the results either onto the multiple-mutation (MM) model with fixed s [18–21] or the two-site clonal interference (CI) model [35, 36]. The present work demonstrates that influenza virus evolves within the MM regime near the border with the CI regime (Fig 3). In this region, the fitness effect of a fixed allele is predicted to fluctuate strongly around the most likely value s*, which offers a possible explanation of the punctuated effect.
An SIR model with immune memory and 1D antigenic space (Eq 1) has been previously proposed by Lin et al [11]. Their analysis differs from ours in two critical aspects. Firstly, their approach to viral evolution was completely deterministic, i.e., it assumed an infinite population size. In fact, clonal interference acting in a finite population diminishes the antigenic return on additional mutations. Secondly, their mutation term in Eq 1 had a diffusion form proportional to the second derivative of the infected individual density, ∂^{2}i(x, t)/∂x^{2}. This approximation would be correct if the front edge of the wave were smooth. As we discuss in S1 Appendix, neither approximation holds at low mutation rates, U_{b} ∼ 10^{−4}. As a result, the approach of Lin et al predicts evolution speeds far below simulation results. The traveling wave approach employed here naturally accounts for both the stochastic effects and the steepness of the leading edge. Future development of this model requires inclusion of a finite mutation cost [39].
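Why the diffusion form fails at a steep front can be seen in a minimal numerical sketch. It assumes, purely for illustration, a symmetric mutation step of one unit in x and an exponential front i(x) = exp(−kx); neither assumption is taken from Eq 1, and the steepness k is a hypothetical parameter. The function returns the ratio of the discrete second difference, i(x+1) + i(x−1) − 2i(x), to the second derivative ∂^{2}i/∂x^{2}, which for this profile does not depend on x.

```python
import math

def discrete_to_diffusion_ratio(k):
    """Ratio of the unit-step second difference to the second derivative
    for the exponential profile i(x) = exp(-k * x).
    Second difference: exp(-k x) * (2 cosh(k) - 2); derivative: k^2 * exp(-k x)."""
    return (2.0 * math.cosh(k) - 2.0) / k**2

# k << 1 is a smooth front; k of order 1 or larger is a steep front.
for k in (0.1, 1.0, 5.0):
    print(f"k = {k}: ratio = {discrete_to_diffusion_ratio(k):.3f}")
```

For a smooth front the ratio stays close to 1, so the diffusion term is adequate. For a steep front the ratio grows well above 1, meaning that the diffusion term underestimates the flux of mutants ahead of the wave.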
Our analytic results agree with the numeric results of a previous simulation by Bedford et al [12]. Using a similar model, they predicted the same incidence range for influenza A and the same range for the evolution speed, and interpreted the quasi-one-dimensional trajectory in genetic space that we also observe (S5 and S6 Figs). As starting parameters, they assumed a mutation rate U_{b} ∼ 10^{−4} and set the cross-immunity distance to a = 1/0.07 based on equine flu data [37]. By comparison, here we determine both U_{b} and a a posteriori by fitting human H3N2 and H1N1 data on c and T_{MRCA2} from the cited work [7]. We then test the model by comparing our prediction with the experimental value of a [37].
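As a quick consistency check on the numbers quoted in the text, the equine-based value assumed by Bedford et al can be compared with the best-fit range obtained here. The subscripts are labels introduced for this comparison only:

```latex
a_{\mathrm{equine}} = \frac{1}{0.07} \approx 14.3,
\qquad
14 \le a_{\mathrm{fit}} \le 15 .
```

The independently measured value therefore lies inside the fitted interval.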
Conclusion
Merging the standard epidemiological approach with modern traveling wave theory, we develop a general analytic approach that connects epidemiological and immunological parameters to the observed parameters of influenza evolution. We demonstrate that the distribution of recovered individuals in genetic space effectively creates a fitness landscape for the distribution of infected individuals, and that both distributions move together along a quasi-one-dimensional path. Our predictions agree well with data on influenza A H3N2.
Supporting information
S1 Appendix. Mathematical appendix.
(PDF)
S1 Fig. Theory of clonal interference with relative fitness linear in antigenic coordinate is accurate at small mutation rates and approximately correct at intermediate rates.
(TIFF)
S2 Fig. Finite population size N eliminates the artifact of “mirror wave”.
(TIFF)
S3 Fig. Dependence of the wave speed and incidence on the population size.
(TIFF)
S4 Fig. Dependence of the wave speed and incidence on the cross-immunity scale.
(TIFF)
S5 Fig. Two-dimensional influenza model predicts spontaneous development of a stable 1D-like traveling wave starting from a flat front.
(TIFF)
S6 Fig. Phylogenetic tree of virus strains existing at different times in a multi-dimensional antigenic space projected onto 2D.
(TIFF)
This work was initiated in extensive discussions with Michael Lässig. I.M.R. is grateful to Eric Brunet for valuable suggestions and discussions.
References
1. Metzger VT, Lloyd-Smith JO, Weinberger LS. Autonomous targeting of infectious superspreaders using engineered transmissible therapies.
2. Rouzine IM, Weinberger LS. Design requirements for interfering particles to maintain coadaptive stability with HIV-1.
3. Rouzine IM, Weinberger LS. Reply to “Coadaptive stability of interfering particles with HIV-1 when there is an evolutionary conflict”.
4. Smith DJ, Lapedes AS, de Jong JC, Bestebroer TM, Rimmelzwaan GF, Osterhaus AD, et al. Mapping the antigenic and genetic evolution of influenza virus.
5. Rambaut A, Pybus OG, Nelson MI, Viboud C, Taubenberger JK, Holmes EC. The genomic and epidemiological dynamics of human influenza A virus.
6. Russell CA, Jones TC, Barr IG, Cox NJ, Garten RJ, Gregory V, et al. Influenza vaccine strain selection and recent studies on the global migration of seasonal influenza viruses.
7. Bedford T, Riley S, Barr IG, Broor S, Chadha M, Cox NJ, et al. Global circulation patterns of seasonal influenza viruses vary with antigenic drift.
8. Koel BF, Burke DF, Bestebroer TM, van der Vliet S, Zondag GC, Vervaet G, et al. Substitutions near the receptor binding site determine major antigenic change during influenza virus evolution.
9. Fonville JM, Wilks SH, James SL, Fox A, Ventresca M, Aban M, et al. Antibody landscapes after influenza virus infection or vaccination.
10. Neher RA, Bedford T, Daniels RS, Russell CA, Shraiman BI. Prediction, dynamics, and visualization of antigenic phenotypes of seasonal influenza viruses.
11. Lin J, Andreasen V, Casagrandi R, Levin SA. Traveling waves in a model of influenza A drift.
12. Bedford T, Rambaut A, Pascual M. Canalization of the evolutionary trajectory of the human influenza virus.
13. Strelkowa N, Lassig M. Clonal interference in the evolution of influenza.
14. Bedford T, Suchard MA, Lemey P, Dudas G, Gregory V, Hay AJ, et al. Integrating influenza antigenic dynamics with molecular evolution.
15. Luksza M, Lassig M. A predictive fitness model for influenza.
16. Gog JR, Rimmelzwaan F, Osterhaus ADME, Grenfell BT. Population dynamics of rapid fixation in cytotoxic T lymphocyte escape mutants of influenza A.
17. Tsimring LS, Levine H, Kessler DA. RNA virus evolution via a fitness-space model.
18. Rouzine IM, Wakeley J, Coffin JM. The solitary wave of asexual evolution.
19. Desai MM, Fisher DS. Beneficial mutation selection balance and the effect of linkage on positive selection.
20. Rouzine IM, Brunet E, Wilke CO. The traveling-wave approach to asexual evolution: Muller’s ratchet and speed of adaptation.
21. Brunet E, Rouzine IM, Wilke CO. The stochastic edge in adaptive evolution.
22. Hallatschek O. The noisy edge of traveling waves.
23. Good BH, Rouzine IM, Balick DJ, Hallatschek O, Desai MM. Distribution of fixed beneficial mutations and the rate of adaptation in asexual populations.
24. Desai MM, Walczak AM, Fisher DS. Genetic diversity and the structure of genealogies in rapidly adapting populations.
25. Neher RA, Hallatschek O. Genealogies of rapidly adapting populations.
26. Grenfell BT, Pybus OG, Gog JR, Wood JL, Daly JM, Mumford JA, et al. Unifying the epidemiological and evolutionary dynamics of pathogens.
27. Acevedo A, Brodsky L, Andino R. Mutational and fitness landscapes of an RNA virus revealed through population sequencing.
28. Poulin R.
29. Rouzine IM, Rodrigo A, Coffin JM. Transition between stochastic evolution and deterministic evolution in the presence of selection: general theory and application to virology [review].
30. Carrat F, Vergu E, Ferguson NM, Lemaitre M, Cauchemez S, Leach S, et al. Time lines of infection and disease in human influenza: a review of volunteer challenge studies.
31. Biggerstaff M, Cauchemez S, Reed C, Gambhir M, Finelli L. Estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature.
32. Astier S.
33. Nowak MA.
34. Rice SH.
35. Gerrish PJ, Lenski RE. The fate of competing beneficial mutations in an asexual population.
36. Schiffels S, Szollosi GJ, Mustonen V, Lassig M. Emergent neutrality in adaptive asexual evolution.
37. Park AW, Daly JM, Lewis NS, Smith DJ, Wood JL, Grenfell BT. Quantifying the impact of immune escape on transmission dynamics of influenza.
38. Brunet E, Derrida B, Mueller AH, Munier S. Effect of selection on ancestry: an exactly soluble case and its phenomenological generalization.
39. Batorsky R, Sergeev RA, Rouzine IM. The Route of HIV Escape from Immune Response Targeting Multiple Sites Is Determined by the Cost-Benefit Tradeoff of Escape Mutations.