Estimating the Lineage Dynamics of Human Influenza B Viruses

The prediction of the lineage dynamics of influenza B viruses for the next season is one of the largest obstacles for constructing an appropriate influenza trivalent vaccine. Seasonal fluctuation of transmissibility and epidemiological interference between the two major influenza B lineages make the lineage dynamics complicated. Here we construct a parsimonious model describing the lineage dynamics while taking into account seasonal fluctuation of transmissibility and epidemiological interference. Using this model we estimated the epidemiological and evolutional parameters with the time-series data of the lineage specific isolates in Japan from the 2010–2011 season to the 2014–2015 season. The basic reproduction number is similar between Victoria and Yamagata, with a minimum value during one year as 0.82 (95% highest posterior density (HPD): 0.77–0.87) for the Yamagata and 0.83 (95% HPD: 0.74–0.92) for Victoria, the amplitude of seasonal variation of the basic reproduction number is 0.77 (95% HPD:0.66–0.87) for Yamagata and 1.05 (95% HPD: 0.89–1.02) for Victoria. The duration for which the acquired immunity is effective against infection by the Yamagata lineage is shorter than the acquired immunity for Victoria, 424.1days (95% HPD:317.4–561.5days). The reduction rate of susceptibility due to immune cross-reaction is 0.51 (95% HPD: 0.084–0.92) for the immunity obtained from the infection with Yamagata against the infection with Victoria and 0.62 (95% HPD: 0.42–0.80) for the immunity obtained from the infection with Victoria against the infection with Yamagata. Using estimated parameters, we predicted the dominant lineage in 2015–2016 season. The accuracy of this prediction is 68.8% if the emergence timings of the two lineages are known and 61.4% if the emergence timings are unknown. Estimated seasonal variation of the lineage specific reproduction number can narrow down the range of emergence timing, with an accuracy of 64.6% if the emergence times are assumed to be the time at which the estimated reproduction number exceeds one.


Introduction
The influenza virus is one of the most common respiratory viruses and causes a high disease burden worldwide [1]. The influenza viruses co-circulating among humans can be classified as influenza A viruses (IAV) and influenza B viruses (IBV). Approximately 75% of confirmed cases of influenza are infections by the IAV [2]. IAV shows high antigenic diversity and rapid change in antigenicity, and appropriate intervention against IAV epidemics is quite difficult in terms of vaccine strain selection. The disease burden of IBV is also high, and 25% of confirmed cases of influenza virus infection and 22-44% of pediatric influenza related deaths in the US are caused by influenza B [2,3]. The number of major lineages of IBV is relatively low compared to type A. There are only two major genetically and antigenically distinct lineages; the Yamagata lineage and the Victoria lineage.
Trivalent vaccines against influenza include one of those two lineages. The selection of the correct vaccine lineage is essential for high vaccine efficacy against influenza B infections. Despite the limited number of existing IBV lineages, vaccine strain selection is still difficult because the dominant lineage changes over time and the switching time of the dominant lineage is difficult to predict. Although the quadrivalent vaccine includes both IBV lineages, Höpping et al. 2016 pointed out the necessity of vaccine strain selection because the use of trivalent vaccines is still common worldwide and the cost-effectiveness of quadrivalent vaccines is under debate [4].
To predict an effective vaccine strain for IBV, a good model capturing the mechanisms of its complex dynamics is needed. Important factors to consider regarding the complex epidemic dynamics of the Yamagata and Victoria lineages are i) the seasonal variation of transmissibility, ii) epidemiological interference between the two lineages, and iii) time series changes of antigenicity due to the evolution of the pathogens. The incidence of IBV shows seasonal fluctuation which can be explained by the seasonality of the transmissibility of IBV. This transmissibility seasonality has been shown to be determined by seasonal variation of absolute humidity [5,6]. Previous theoretical studies have shown that seasonal variation in transmissibility can induce rich epidemic dynamics, such as periodic or chaotic behavior [7][8][9], therefore a model capturing seasonal fluctuation of transmissibility is essential to predicting epidemic dynamics. Epidemiological interference is also a known factor in complex epidemic dynamics [10][11][12][13]. The time series of confirmed cases between two lineages are negatively correlated (Fig  1), implying epidemiological interference between the two lineages. Moreover, vaccine efficacy studies also imply the existence of immune cross-reaction between lineages [4]. The epidemic dynamics of a lineage are affected by that of the other lineage assuming the existence of immune cross-reaction. The change of antigenicity of influenza across seasons is one major obstacle for prediction. Especially in the case of IAV, the large variety of lineages and epidemic interference between these lineages makes it difficult to predict the dynamics. To analyze these complex dynamics, models taking into account these complex evolutionary dynamics have been proposed so far [14][15][16][17][18][19]. Although the number of co-circulating IBV lineages is limited compared to IAV the large genetic diversity within the lineages and the changing antigenicity over time, especially for the Victoria lineage [20], makes dominant lineage prediction difficult. Modelling the complex evolutionary dynamics is essential to predicting future changes in IBV.
In this paper, we assess the predictability of the dominant lineage of IBV using a mathematical model describing IBV epidemics. To this end, we construct a parsimonious mathematical model that takes into account the seasonal variation of transmissibility, the epidemiological interference between lineages, and the time series change of antigenicity. We first estimate the parameters of our model, including these three factors, from the time series of IBV confirmed cases per lineage and time series of specific humidity. Using these estimated parameters we assess the predictive potential of the dominant lineage in the next season.

Data
We analyzed the weekly reports of the number of cases of human IBV in Japan from the 2010-2011 season to the 2015-2016 season, collected by the National Institute of Infectious Diseases, Japan (http://www.nih.go.jp/niid/en/influenza-e.html). The following analyses are based on the data which we accessed on 12th April 2016. Cases where the lineage was not available were excluded.

Mathematical model describing natural history of IBV
We employed individual-based Monte Carlo simulation (IBM) with host populations of 10,000. The host population was determined based on a sensitivity analysis of posterior distributions to the host population size (S1 Fig). We described the transmission process of the Yamagata lineage and the Victoria lineage using the compartmental SEIRS model. Based on the natural history of IBV, we classified the host population into four classes by infection state against each lineage; susceptible S, latent E, infectious I, and recovered R. A total of 4 2 = 16 infection states were considered. We denote these infection states by XY where X is the infection state against Victoria and Y is that for Yamagata. For example SE denotes S for Victoria and E for Yamagata.
For example, the susceptibility against Victoria is 1 if the host is susceptible to Victoria (the infection state for Victoria is S) and does not have any immunity against Yamagata (the infection state for Yamagata is S or E or I). If the host is susceptible to Victoria and has immunity against Yamagata (SR), the susceptibility to Victoria decreases to 1−α Yamagata!Victoria by crossimmune reaction. The force of infection at time t, λ(t), is determined by the number of infected hosts I(t) and the specific humidity h(t) at time t as follows, Here N denotes the host population size, γ is the lineage specific recovery rate, the term 1 +exp(a n -b n h(t)) describes the transmission coefficient determined by humidity, where h is specific humidity, and a and b are lineage specific parameters. We followed Shaman et al. 2010 [5] regarding the model of the relationship between specific humidity and transmissibility; i.e. specific humidity shows seasonal fluctuation resulting in seasonal fluctuation in transmissibility (and this fluctuation is reflected in parameter b). We extrapolated h(t) using the daily observed data of specific humidity in Tokyo, Japan, collected by the Japan Meteorological Agency (http://www.jma.go.jp/jma/indexe.html). I n (t) denotes the number of infected hosts with lineage n, for example, I victoria (t) = IS(t)+ IE(t)+ II(t) + IR(t). The host infection state becomes E after infection, and the host obtains infectiousness and the infection state becomes I with probability ε. I recovers with probability γ and obtains immunity. The immune response wanes due to the evolution of antigenicity. The emergence of new distinct lineages from existing lineages is not observed for a long time [20]. We assume that the evolutionary dynamics of IBV within the same lineage is stable and the probability of waning immunity is constant over time, κ. The parameters (α, a, b, ε, γ, and κ) are lineage specific.

Estimation of parameters
Using the time-series data of the number of laboratory-confirmed IBV cases per lineage and specific humidity, we estimated the parameters (α, a, b, and κ) in the model described in the last section. Based on previous study [21] we parameterized ε as 1/ε = 0.6 day and γ as 1/γ = 4.0 day. These parameters are estimated for each lineage. We also estimated the herd immunity against each lineage at the beginning of the period explored in this study, i. To estimate these parameters we implemented Approximate Bayesian Computation (ABC) using our model [22]. Models describing the interaction between nonlinear dynamics, i.e. epidemiological interference, are difficult to solve analytically. As a result, model-based inference is complicated to implement due to the difficulty of obtaining an analytical solution for the likelihood function. In such case ABC gives us a good approximation of the posterior distribution. The procedure of ABC that we conducted is, i) we simulated IBM with a parameter set, {α, a, b, κ, SS(0), SR(0), RS(0), RR(0), p}, determined by prior distribution of each parameter, ii) the simulation results were compared to the time-series data of lineage specific confirmed cases and we recorded the parameter sets, {α, a, b, κ, SS(0), SR(0), RS(0), RR(0), p}, if the distance between the simulation results and the observed data was smaller than a threshold, iii) we estimated prior distribution of each parameter (each element of parameter set) from the recorded parameter sets, {α, a, b, κ, SS (0) where I sim denotes the simulation result of the number of infected individuals and I obs denotes the eld data. The parameter p is used for the adjustment of the population size. The parameter sets were accepted when the distance, D, is smaller than 0.44. We assume that the sampling probability of cases for laboratory testing and the host population size are constant over time and conrmed cases are proportional to the number of infected hosts. We set the prior distributions as uniform distributions for all parameters, the ranges of the priors are We introduce the infected people at the beginning of each epidemic season as the initial condition during the IBM simulation process. We dened the beginning of the epidemic season as the time when the number of isolations exceeds 7. The number of infected people in the beginning of an epidemic season was adjusted by p.

Prediction of dominant lineage
Using the posterior distributions obtained by ABC, we simulate IBM for one epidemic season and compare the results with empirical data of lineage specific confirmed cases. To measure the accuracy of the prediction, we conducted IBM 1,000 times and counted the number of simulations that showed the same dominant lineage as the empirical data. The average specific humidity at specific time points over the epidemic season was used for prediction. For prediction we consider three scenarios, i) predict using empirical data for the emergence dates of lineages, ii) predict without using empirical data for the emergence dates of any lineage, and iii) estimate the emergence dates of both lineages and predict the dominant lineage with the estimated emergence dates. In scenario ii), we simulated IBM while varying the emergence timing of both lineages from the beginning to the end of the epidemic season. Regarding iii), we assume the emergence timing for a lineage is equivalent to the time when the lineage specific basic reproduction number R 0,n exceeds one. In our model, R 0,n can be described by: We estimate the timing when R 0,n exceeds one using estimated parameters.

Results
Our model captured the lineage dynamics of both Victoria and Yamagata from the 2010-2011 season to the 2014-2015 season well (Fig 3). Table 1 summarizes the estimated values of parameters; with most parameters being similar between Yamagata and Victoria, with the exception of b and κ. The amplitude of seasonal fluctuation of transmission rate for the   We showed that understanding the emergence timing is crucial for the prediction of the dominant lineage. We also tried to narrow down the considerable range of the emergence timing using the lineage specific basic reproduction number R 0,n . Fig 6 shows R 0,n during 2015-2016 season. The calendar week when the R 0,n exceeds one is the 46 th week (95% highest posterior density (HPD): 43 rd -48 th week) for Victoria and the 47 th week (95%HPD: 44 th -50 th week) for Yamagata. The actual emergence time, determined as the time when the weekly isolation number exceeds 6, is the 46 th and 43 rd week for Victoria and Yamagata, respectively. Estimated emergence timing by R 0,n can improve the accuracy of prediction, 64.6 percent of 1000 simulation runs with estimated timing shows the correct dominant strain.
Due to the similar number of isolates between Yamagata and Victoria in the 2015-2016 season, it was difficult to determine the dominant strain. We also compared our prediction to the frequency of Victoria lineage isolates among all the IBV isolates ( Fig 5B). Our model can predict the frequency of lineage as well. The predicted frequency of the Victoria lineage using the observed emergence timing by R 0,n is 0.58 (95%HPD: 0.01-0.99), the predicted frequency using estimated emergence timing by R 0,n is 0.61 (95%HPD: 0.01-0.99), the average predicted frequency of Victoria among varied emergence timings is 0.64 (95%HPD: 0.12-0.97), and the observed frequency of Victoria was 0.51 from the field data.

Discussion
In this paper we estimated the dynamics of two major IBV lineages using a parsimonious mathematical model. Although the prediction of the dominant lineage of IBV is important for vaccine strain selection, the complex lineage dynamics of IBV makes this prediction difficult. Our result suggests that the prediction of the dominant lineage of IBV may be possible if the epidemiological interference between lineages was quantified.
Our estimates of the waning rate of immunity suggest that immunity against the Yamagata lineage is shorter than immunity to Victoria. This tendency is robust even when the seasons of the data for the estimation were changed  Table 2). There are two possible interpretations of this result: i) the antigenicity of Yamagata changes much faster than Victoria or ii) the immunity against Yamagata wanes faster than Victoria. The results of phylogenetic analysis and the antigenicity measured by hemagglutination inhibition assay suggest that the Victoria lineage is under stronger selection pressure due to host immunity, and the antigenicity of Victoria changes faster than that of Yamagata [20]. Therefore hypothesis i) is not likely to be true. On the other hand, the hypothesis ii) is not rejected by the result of phylogenetic and antigenicity analyses. If the immunity itself wanes rapidly, the lineage cannot be under selection pressure by host immunity and the change of antigenicity is not essential for the persistence of the lineage. Furthermore, the broader age-distribution of infection of Yamagata than Victoria implies frequent re-infection with the Yamagata lineage [20], supporting the possibility of rapid waning immunity against the Yamagata lineage. The clinical trial of vaccines against the Yamagata lineage showed stronger immune reaction to Yamagata than to Victoria [23], but this immunity may persist temporarily and wane rapidly. To conclude, whether the immunity against the Yamagata lineage is of long or short duration will require further study. Our estimates of lineage-specific reproduction numbers agree with phylodynamic analysis [20]; the average reproduction number of Victoria is larger than that of Yamagata. The seasonal fluctuation of the reproduction number of Victoria is also larger than that of Yamagata. Our estimate of the reproduction number takes into account both cross-reactivity of immunity between Victoria and Yamagata and waning immunity. If we misestimated these two factors, estimated values would be far from the estimate by phylodynamic analysis.
Our estimation suggests that the cross-immunity between Victoria and Yamagata is high enough that infections by one lineage suppress infections to the other lineage at the population level. However, cross-immunity between the two lineages cannot protect individuals from infection. This highlights the importance of the selection of the IBV vaccine lineages in the countries where trivalent influenza vaccines are used.
Prediction of strain dynamics requires long time series data of lineage specific isolates. In fact, prediction using data from only one year cannot capture the strain dynamics, and the opposite lineage of a year's dominant lineage, as shown in the data, was selected as the dominant lineage each time (the data is not shown in this paper). Surveillance with an appropriate and consistent study design is essential for predicting the lineage dynamics for vaccine strain selection.
For simplicity, our model assumed constant antigenic evolution within each lineage. This assumption is based on the fact that no new lineage has emerged after the branching of IBV to Yamagata and Victoria, which suggests it has relatively stable evolutionary dynamics compared to IAV. Our model is sufficient for the short-term prediction of lineage dynamics, however, when we predict the emergence of new lineages and the extinction of current lineages, the evolutionary dynamics at the quasi-species level would need to be taken into account.
Even though the dominant lineage can be predicted, selection of the specific strain is still required. Especially, the antigenicity of the Victoria lineage changes rapidly, so further understanding of the evolution of the Victoria lineage is necessary. Phylodynamic studies showed that the time series change of the genetic diversity of Victoria is similar to influenza A H3N2 and H1N1, so the model must take into account evolution at the strain level [16,17,24,25].  In conclusion, we developed a parsimonious mathematical model describing the lineage dynamics of IBV. Using the weekly number of lineage specific isolates we estimated the reproduction number, the waning rate of immunity, and the strength of cross-immune reaction. Our prediction suggests that models taking into account epidemiological interference due to cross-immune reaction and the seasonality of transmission can predict the lineage dynamics of IBV for the next year.