## Abstract

### Background

Seasonal influenza outbreaks are a serious burden for public health worldwide and cause morbidity to millions of people each year. In the temperate zone influenza is predominantly seasonal, with epidemics occurring every winter, but the severity of the outbreaks varies substantially between years. In this study we used a highly detailed database, which gave us both temporal and spatial information on influenza dynamics in Israel in the years 1998–2009. We used a discrete-time stochastic epidemic SIR model to find estimates and credible confidence intervals of key epidemiological parameters.

### Findings

Despite the biological complexity of the disease we found that a simple SIR-type model can be fitted successfully to the seasonal influenza data. This was true both at the national level and at the scale of single cities. The effective reproductive number *R*_{e} varies between the different years both nationally and among Israeli cities. However, we did not find differences in *R*_{e} between different Israeli cities within a year. *R*_{e} was positively correlated to the strength of the spatial synchronization in Israel. In those years in which the disease was more “infectious”, outbreaks in different cities tended to occur with smaller time lags. Our spatial analysis demonstrates that both the timing and the strength of the outbreak within a year are highly synchronized between the Israeli cities. We extend the spatial analysis to demonstrate the existence of high synchrony between Israeli and French influenza outbreaks.

### Conclusions

The data analysis combined with mathematical modeling provided a better understanding of the spatio-temporal and synchronization dynamics of influenza in Israel and between Israel and France. Altogether, we show that despite major differences in demography and weather conditions, intra-annual influenza epidemics are tightly synchronized in both their timing and magnitude, while they may vary greatly between years. The predominance of a similar main strain of influenza, combined with population mixing, serves to enhance local and global influenza synchronization within an influenza season.

**Citation: **Huppert A, Barnea O, Katriel G, Yaari R, Roll U, Stone L (2012) Modeling and Statistical Analysis of the Spatio-Temporal Patterns of Seasonal Influenza in Israel. PLoS ONE 7(10): e45107. https://doi.org/10.1371/journal.pone.0045107

**Editor: **Maciej F. Boni, University of Oxford, Viet Nam

**Received: **April 10, 2012; **Accepted: **August 14, 2012; **Published: ** October 8, 2012

**Copyright: ** © Huppert et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Funding: **This work was supported by the EU-FP7 Epiwork grant, the Maccabi Institute for Health Services Research Grant, the Israel Science Foundation and the Israel Ministry of Health. UR is supported by the Adams Fellowship Program of the Israel Academy of Sciences and Humanities. RY is supported by the Israel National Institute for Health Policy and Health Services Research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** The authors have declared that no competing interests exist.

## Introduction

Seasonal influenza outbreaks are a serious burden for public health worldwide. Seasonal influenza is mainly a self-limiting disease, and in most patients results in only moderate illness without need for medical treatment. Nevertheless, it is estimated to cause morbidity to millions of people each year. In addition, influenza poses a major risk to chronic patients of all ages especially the elderly, to whom it causes more severe morbidity and is associated with a higher death rate [1]–[3]. The global mortality from the disease is estimated at 250,000 to 500,000 cases annually [4], [5]. Furthermore, the economic burden of seasonal influenza is estimated to be 11 billion US dollars a year in the US alone [6]. This includes morbidity, mortality, hospitalizations and absenteeism from work and school. As early as 1952 the WHO established the Global Influenza Surveillance Network. However there are major problems regarding the reliability of influenza data [7], [8]. The major difficulty is that the disease has no clear-cut clinical signs and can be easily confused with other respiratory illnesses having similar symptoms [9]. In addition, influenza illness is often slight or moderate and a significant number of infected individuals are asymptomatic, so that many individuals with the disease do not seek any medical care. In order to better estimate the burden of influenza, different sources of influenza data have been used to study the disease, such as Influenza-Like Illness (ILI) diagnoses, virus isolations, death records from pneumonia and influenza, physician visits and web search queries [2], [4], [10]–[12].

Influenza is predominantly seasonal, with epidemics occurring every winter, but the severity of the epidemic and the exact timing of the outbreak vary substantially between years (see figure 1). Influenza has also been studied with respect to its geographic spread at different scales. Viboud et al. (2006) have studied the spread of influenza across the United States and found that there is higher synchrony between more populous states [12]. In other studies Viboud et al. (2004) and Chowell et al. (2008) analyzed the synchrony on a global scale comparing outbreaks in the US, France and Australia [13], [14] and found high synchrony between the US and France and no synchrony between the Northern and Southern hemispheres (even after adjusting for the two hemispheres being out of phase). In the smaller spatial scale of France, Bonabeau et al. (1998) demonstrated how the disease rapidly spreads across the country, and concluded that when modeling the initial spread of influenza it is sufficient to assume global homogeneous mixing to capture the dynamics. Geographic heterogeneities and density dependence affect the dynamics of the disease around local outbreak peaks [15].

(A) Weekly number of ILI cases per 100,000 Maccabi Health Services members in Israel, Tel Aviv and Jerusalem. The text next to the peaks indicates dominant subtype (NDS = no dominant subtype). (B) Weekly number of ILI cases per 100,000 people in Israel (Maccabi Health Services members) and France (sentinel clinics patients). Note different scales for each time series, as French incidence is ∼3 times higher than Israeli incidence.

Israel provides a unique opportunity to study the spatial spread of influenza at a small scale since the country is only approximately 22,000 km^{2}. We use an influenza database that is highly detailed both temporally and spatially (see data section), collected in Israel for over 11 years. The data are composed of ILI diagnoses in a subset of some 23.8% of the Israeli population. We initially examine the ILI data on a national level (i.e. aggregated data) to understand the year-to-year variability of influenza epidemics in both their timing and magnitude. In the second part of the paper we examine spatial aspects of influenza in Israel with emphasis on the spatial synchrony. The following patterns are of particular interest: i) differences in ILI dynamics and key epidemiological parameters between the major cities in Israel (in time and space); ii) synchrony between different Israeli cities within a year; iii) a comparison of the “local” Israeli synchrony with the intercontinental synchrony of influenza outbreaks, tested on the time series of Israel and France.

### Modeling Approach

Figure 1 displays the time series of influenza outbreaks (ILI cases) over eleven years when aggregated spatially over all of Israel. There is considerable variation between years in the peak height of the outbreak, the total attack rate, and the times at which the epidemics reach their maxima during the winter months (December to March; figure 2A). The high quality of the entire dataset, both in terms of temporal and spatial resolution (see data section), is a key feature that motivates the following modeling analysis. The data were analyzed and modeled both i) at the national level, using the total number of ILI cases aggregated over the entire country (figure 1), and ii) in the seven largest cities of the database (in terms of population size). These cities provide a picture of the spatial variation of influenza across the country.

Top (A): superimposition of eleven seasons of daily ILI incidence in Israel. Starting date is June 1^{st}. Middle (B): superimposition of ILI incidence in seven cities in the 1998–1999 season where *R*_{e} = 1.2 was low and the correlation between the different cities is relatively weak. Bottom (C): ILI incidence in the same seven cities during the 2006–2007 season where *R*_{e} = 1.6 was high and the correlation between the different cities was strong.

A discrete-time age-of-infection SIR epidemic model, as formulated in Katriel et al. (2011) [16], is used to fit the Israeli ILI data and estimate epidemiological parameters in order to gain a better understanding of the influenza dynamics (see Methods for model description). The SIR framework assumes that individuals within a population can be divided into three categories or compartments: Susceptible, Infected or Recovered. Disease transmission within the population is modeled by tracking the changes in numbers of individuals within these compartments. Although SIR-type models have become almost the gold standard for modeling bacterial and viral respiratory infectious diseases, they are to some degree oversimplified for influenza because the virus's ability to mutate rapidly gives rise to its characteristic antigenic drift [17], [18]. Earn et al. (2002) argue that “it seems impossible to avoid a much greater degree of model complexity [than the SIR]. The primary obstacle to simple compartmental modeling of flu is antigenic drift.” [7] In order to bypass the difficulties of modeling continuous evolutionary changes in influenza we have chosen to model individual years separately as single influenza outbreaks, a practice that may be found in the important studies of Baroyan et al. (1971) [19] in the USSR, Spicer (1979) [20] in the UK and recently Chowell et al. (2008; 2010) for the US, France, Australia and Brazil [14], [21], Cintron-Arias et al. (2009), who modeled US epidemics [22], and van Noort et al. (2011), who used the same approach of modeling influenza time series as single consecutive outbreaks in the Netherlands, Belgium and Portugal [23]. Our analysis goes further than these studies in that it uses a specially formulated statistical approach for estimating key epidemiological parameters.
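To make the compartmental bookkeeping described above concrete, the following is a minimal sketch of a discrete-time SIR recursion. It is not the age-of-infection model of [16]: as a loudly simplifying assumption, every infected individual is infectious for exactly one time step, and the recursion is deterministic. The function name `simulate_sir` and all parameter names are hypothetical, and the example values simply reuse the averages quoted in the text.

```python
def simulate_sir(r0, s0_fraction, population, initial_infected, n_steps):
    """Deterministic discrete-time SIR, one time step per generation.

    Simplifying assumption (NOT the age-of-infection model of [16]):
    each infected individual is infectious for exactly one step.
    Returns the list of new cases per step, starting with the seed cases.
    """
    susceptible = s0_fraction * population
    infected = float(initial_infected)
    incidence = [infected]
    for _ in range(n_steps):
        # Each case infects on average R0 * (S / N) others in the next step;
        # new cases can never exceed the remaining susceptible pool.
        new_cases = min(susceptible, r0 * infected * susceptible / population)
        susceptible -= new_cases
        infected = new_cases
        incidence.append(new_cases)
    return incidence


if __name__ == "__main__":
    # Illustrative inputs only: the average R0 and S0 quoted in the text.
    curve = simulate_sir(r0=4.9, s0_fraction=0.292, population=1_000_000,
                         initial_infected=10, n_steps=60)
    peak_step = max(range(len(curve)), key=curve.__getitem__)
    print(f"R_e = {4.9 * 0.292:.2f}, peak at step {peak_step}, "
          f"attack rate = {sum(curve) / 1_000_000:.1%}")
```

In this sketch the outbreak grows only while the product of `r0` and the current susceptible fraction exceeds one, which is the role played by *R*_{e} in the text.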

The basic reproductive number *R*_{0} is an important and widely employed index for quantifying individual epidemic growth rates [24]. *R*_{0} is defined as the average number of people infected by a typical individual over the disease infectivity period in a totally susceptible population. In general, it is extremely difficult to estimate *R*_{0} because the initial population is rarely totally susceptible. Most studies of influenza therefore only attempt to estimate the “effective *R*_{0}”, or *R*_{e}, which is a composite index: *R*_{e} = *R*_{0}•*S*_{0}, where *S*_{0} is the proportion of the population who are susceptible at the beginning of an outbreak. The effective reproduction number *R*_{e} should be interpreted as the average number of people an infected person infects during the course of their illness in a population, a fraction *S*_{0} (*S*_{0}<1) of which is susceptible. But *R*_{e} tells us little about *R*_{0} since *S*_{0} (an important variable in its own right) is difficult to estimate directly [25]. There are many methods documented in the literature for estimating *R*_{e}, and nearly all of them calculate the rate of exponential growth of the infected population in the first phase of an outbreak [26]–[28]. Recently several methods have been developed for estimating both *R*_{0} and *S*_{0}, which are usually more complex and in most cases require fitting an SIR-type model to the data [22], [23], [29]–[31]. Certainly there would be a major advantage in having the ability to separate out these two parameters because they have very different biological meanings. In the approach used here, we take advantage of the fact that for the Israeli dataset, the entire epidemic curve is available for analysis and not just the initial phase. We are thus able to fit the age-of-infection SIR model to the full epidemic curve to obtain estimates of both *R*_{0} and *S*_{0} for each epidemic outbreak. In Table 1 the estimates of *R*_{e} obtained using the entire curve (see methods) are given together with estimates of *R*_{e} calculated using the classical method [32], which estimates the exponential growth of the number of infected people. These two estimates of *R*_{e} are significantly correlated (r = 0.91, p = 0.0007).

Our methodology gives estimates of *R*_{0} and *S*_{0} under the assumption that all influenza cases in the population are reported and thus that the ILI time series are a product of a surveillance system with a perfect, or 100%, reporting rate. However, such a situation is never the case in practice. If we assume that the surveillance captures a proportion *r* (0<*r*<1) of true influenza cases, then we refer to our estimates in the presence of partial reporting as $\hat{R}_0$ and $\hat{S}_0$, while the desired values under perfect reporting are referred to as *R*_{0} and *S*_{0}. They are related as follows [25]:

$$R_0 = r\,\hat{R}_0, \qquad S_0 = \frac{\hat{S}_0}{r} \tag{1}$$

Since the true reporting rates (*r*) of surveillance systems are rarely quantified accurately, they become an important limiting factor when trying to estimate these key epidemiological parameters. However, for well-managed surveillance systems, it can be assumed that *r* is reasonably constant over extended periods, in which case it would be possible to capture the relative values and trends of these parameters as they change in time. Unfortunately there have been no studies of the reporting rate stability of the Maccabi data. We have compared the data to another independent Israeli surveillance system and found almost identical trends over the same period, indicating that the reporting rate has changed little. Nevertheless, as we note in the Data section, there was a change in the ILI coding in 2002, which might possibly have changed the reporting rate in that year, presumably to a small degree.

In our data, it was found that on average 1.5% of all Maccabi members were infected with influenza annually (see table 1 for full list of attack rates). However, in the US, average overall attack rates are estimated to be 10–20% [33], [34], and in France some 12–15% [35]. This, together with discussions with the Israeli Ministry of Health, motivated our setting of the reporting rate to 10% (r = 0.1), yielding an adjusted attack rate in Israel of 15%, consistent with that reported in the literature for other countries [8], [14]. In this case, the average estimate of the true *R*_{0} in Israel using equation 1 should be *R*_{0} ∼4.9, which is close to the rough calculation of Katriel and Stone (2010), who estimated *R*_{0} ∼3.75 [25].

We note that the parameter *R*_{e} has the interesting property that it is entirely independent of the reporting rate since:

$$R_e = R_0\,S_0 = \left(r\,\hat{R}_0\right)\frac{\hat{S}_0}{r} = \hat{R}_0\,\hat{S}_0 \tag{2}$$

where $\hat{R}_0$ and $\hat{S}_0$ denote the estimates obtained under partial reporting.

Thus estimations of *R*_{e} remain unaffected by spatial differences or temporal changes in reporting rates.
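The effect of the reporting rate can be checked numerically with a short sketch. It assumes the rescaling *R*_{0} = *r*·R̂_{0} and *S*_{0} = Ŝ_{0}/*r*, which is the form consistent with *R*_{e} being independent of *r*. The function name `correct_for_reporting` is hypothetical, and the fitted inputs (49.0 and 0.0292) are illustrative numbers chosen so that the corrected values match the averages quoted in the text; they are not taken from Table 1.

```python
import math


def correct_for_reporting(r0_hat, s0_hat, reporting_rate):
    """Rescale estimates fitted to reported cases to their true values.

    Assumed relations: R0 = r * R0_hat and S0 = S0_hat / r.
    """
    if not 0 < reporting_rate <= 1:
        raise ValueError("reporting_rate must be in (0, 1]")
    r0_true = reporting_rate * r0_hat
    s0_true = s0_hat / reporting_rate
    return r0_true, s0_true


if __name__ == "__main__":
    r = 0.1  # reporting rate adopted in the text
    # Hypothetical fitted values under partial reporting (illustration only).
    r0_hat, s0_hat = 49.0, 0.0292
    r0, s0 = correct_for_reporting(r0_hat, s0_hat, r)
    print(f"R0 = {r0:.2f}, S0 = {s0:.1%}")
    # The product is unchanged by the correction, as in equation 2.
    print(math.isclose(r0 * s0, r0_hat * s0_hat))
    # The same rescaling turns the observed 1.5% attack rate into 15%.
    print(f"adjusted attack rate = {0.015 / r:.0%}")
```

Because *r* cancels in the product, any error in the assumed reporting rate shifts *R*_{0} and *S*_{0} in opposite directions while leaving *R*_{e} untouched.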

## Results

### Temporal Features

The time course of each of the outbreaks in the aggregated Israeli time series was fitted using the discrete-time SIR model (see Methods). In general the model accurately reproduced the time course of the Influenza A epidemics as demonstrated by the simulation run shown in figure 3A based on data for the year 2007–8. Figure 3C provides a more general picture by displaying model fits for each epidemic in all eleven years as compared to the observed data, based on a cumulative plot, similar to a Q-Q plot (cf [30] and methods). The cumulative incidences of the observed data are plotted on the x-axis, while the cumulative incidence produced by the SIR model is plotted on the y axis. The solid straight diagonal lines are reference lines which connect all points (*I*_{t}, *I*_{t}) of the observed data. The lower points in each series represent the early stage of the epidemic and the top points represent the late stage of the epidemic. Perfect fits between model and data would result in all points falling on the diagonal reference lines. This is to some degree achieved in the fits of the Israeli Influenza A data. The fits of the Influenza B seasons, however, as seen also in figure 3B, are of a lower quality.
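The cumulative comparison described above can be sketched in a few lines. This is an illustrative reading of the plot, assuming the observed and model incidence series are sampled at the same time points. The function names are hypothetical, the toy series are invented for the example, and the deviation summary is an added convenience rather than a statistic reported in the paper.

```python
from itertools import accumulate


def cumulative_fit_points(observed, modelled):
    """Pair cumulative observed (x) with cumulative model incidence (y)."""
    if len(observed) != len(modelled):
        raise ValueError("series must have equal length")
    return list(zip(accumulate(observed), accumulate(modelled)))


def max_relative_deviation(points):
    """Largest |model - observed| as a fraction of the final observed total."""
    total_observed = points[-1][0]
    return max(abs(m - o) for o, m in points) / total_observed


if __name__ == "__main__":
    # Toy incidence series, invented for illustration only.
    observed = [2, 5, 9, 5, 2]
    modelled = [3, 5, 8, 5, 2]
    points = cumulative_fit_points(observed, modelled)
    for obs_cum, mod_cum in points:
        print(obs_cum, mod_cum)
    print(f"max deviation from the diagonal: {max_relative_deviation(points):.3f}")
```

Points that lie on the diagonal (equal x and y) correspond to a perfect fit, so the distance from the diagonal gives a quick visual or numerical check of fit quality.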

Top (A): model fits to the 2007–2008 season (main) and the 2002–2003 season (inset (B)). Bottom(C): model fits to all eleven seasons 1998–1999 to 2008–2009 where observed cumulative infectives is plotted as a function of model SIR cumulative infectives (dots). Diagonal lines indicate a perfect fit of the model to the data; Day 1 in each season corresponds to June 1^{st}. Time length of each epidemic varies from season to season due to differences in the start and end date of the best-fitting model.

Also included in figure 1 is a plot of the well-studied influenza (ILI) dataset collected in France (see figure 1B and data section for details) which will be used for the purposes of a comparative study with Israel. We found that the French epidemic data (figure 1B) could also be modeled with good accuracy by the SIR fitting procedure. However, for both the Israeli and French datasets, we were unable to fit influenza B years with the same level of accuracy, since the epidemic curves corresponding to influenza B outbreaks were more asymmetric than the standard SIR model could accommodate (figure 3B). Intriguingly, when correlated against year, the estimates for *R*_{0} in the Israeli aggregated data showed a significant (p = 0.006, *R*^{2} = 0.69) long-term increase over the eleven years (figure 4A).

The relationship between season (time) and *R*_{0} (A), season and *S*_{0} (B), *R*_{e} and maximum number of ILI cases at the peak of the outbreak (C). In these analyses we used only A seasons (n = 9).

The estimates for *R*_{0} (based on r = 0.1) varied between a minimum of *R*_{0} = 2.95 in 2000–2001 and a maximum of *R*_{0} = 8.16 during the 2006–2007 outbreak, with an average of *R*_{0} = 4.9 (see table 1 for full details). We note that it is unlikely that the increase in *R*_{0} is due to an increase in reporting rates over this period. Had there been an increase in reporting rate, one would expect a corresponding increase in attack rate over the time period, but this does not appear to occur. In addition, the analysis was repeated after excluding the first years (where the coding system was different) and the trend remained (see figure 4). Interestingly, Spicer (1979) [20] also noted an increase in transmissibility (equivalent to an increase in *R*_{0}) with time after a new strain of H2N2 appeared: “transmissibility was low in the early stages of the introduction of the new Asian (H2N2) virus subtype in 1957 and the Hong Kong (H3N2) virus subtype in 1969–1970 and high for some years after. This is biologically plausible if the virus is adapting to new conditions of spread.”

While *R*_{0} exhibited a long-term increase over the years, the fraction of susceptibles *S*_{0} showed a significant decrease with time (p = 0.0006, r = −0.91) as shown in Figure 4B. The maximum fraction of susceptibles was found to be *S*_{0} = 40.4% during 2000–2001 while the minimum was *S*_{0} = 19.6%, obtained in 2006–2007; the average was *S*_{0} = 29.2%.

The average *R*_{e} for influenza A in Israel was found to be *R*_{e} = 1.32; it was lowest, with *R*_{e} = 1.17, in 2000–2001 (characterized by a dominant H1N1 virus), and highest, with *R*_{e} = 1.6, in 2006–2007. The estimates of *R*_{e} for Israel are very similar to estimates for seasonal influenza in other parts of the world, namely Australia, the US and France (the average in all 3 countries was *R*_{e} = 1.3) [8], [14], but higher than the recent estimates of Chowell et al. (2010) for Brazil, the USA and France (*R*_{e} = 1.06, *R*_{e} = 1.14 and *R*_{e} = 1.14 respectively) [21]. Using our model to estimate *R*_{e} in France during the same time period gave an average estimate of *R*_{e} = 1.36.

Over the eleven years of this study, both *R*_{0} and *S*_{0} vary considerably (e.g., *R*_{0} varied between 2.95–8.16 while *S*_{0} varied between 19.6–40.4%). It is puzzling that the observed decrease in the susceptible fraction of the population over time is balanced out by the increase in the reproductive number *R*_{0}, so that the product *R*_{e} = *S*_{0}•*R*_{0} changes to a relatively limited degree and is always slightly larger than unity (figure 4 A, B). The statistical analysis also showed a significant and high correlation r = 0.95 (p≪0.0001, N = 9, all influenza A seasons) between the magnitude of *R*_{e} and the number of ILI cases in the smoothed peak of the epidemics (see figure 4C) in the aggregated national data.

### Spatial aspects of influenza in Israel

One of the most striking features regarding the dynamics of influenza in Israel is the strong spatial synchronization. Figures 1 and 2 show very clearly that the time series of major cities in the country are highly correlated. In order to quantify how the ILI varies spatially across the country, we focus on the two main aspects of the ILI data: the magnitude of the outbreaks in the different cities, and the timing of the outbreaks, which also varies spatially.

### The variation of R_{e} across Israel

For the purposes of examining spatial differences in the magnitude of influenza outbreaks across Israel we make use of *R*_{e} as an index for epidemic intensity [14]. Given that *R*_{e} is a measure that is independent of reporting rate, it is useful for conducting comparisons, especially since there are indications that the reporting rates of various cities can be quite different (see data section).

An interesting outcome of our analysis is the finding that there are no statistically significant differences in *R*_{e} between the different cities **within a year** (ANOVA test, p = 0.46, F = 0.97). In contrast, *R*_{e} varies (both nationally and among the Israeli cities) between the different years in a manner that is highly statistically significant (ANOVA test, p = 1.65×10^{−13}, F = 17.7). Our results are consistent with previous studies, for example [13], which concludes for the US, France and Australia that “while the average inter-pandemic *R*_{p} seems rather invariant across geographical locations at around 1.3 there is substantial year to year variability around this average”.

### Quantifying the spatial synchrony in Israel

As figure 1 shows, the ILI cases from the two major Israeli cities, Tel Aviv and Jerusalem, appear to be tightly synchronized with a correlation of r = 0.91 (p≪0.001). Both cities are also highly correlated to the spatially aggregated Israeli ILI data having respective correlation coefficients r = 0.98 (p≪0.001) and r = 0.92 (p≪0.001). It is thus not surprising that the attack rates (i.e., the total number of cases per year) of Tel Aviv and Jerusalem are also correlated to the attack rates in Israel with a correlation coefficient of r = 0.92, (p≪0.001) and r = 0.91 (p≪0.001) respectively.

To explore synchrony further we examined whether the timing of the influenza outbreaks observed in Tel Aviv and Jerusalem are more synchronized than expected by chance. Two different tests were devised:

**Phase Analysis:** The Tel Aviv and Jerusalem time series were superimposed and the peak date of each epidemic in each time series was identified. In the phase analysis [36] we let Δ_{i} represent the time difference between the peak of the outbreak in Tel Aviv and that in Jerusalem for the i'th year, and calculated the index *S*_{obs} (equation 3) for the observed data. In step two, the annual outbreaks of Jerusalem were randomly reshuffled over the eleven years [37]. The reshuffling required breaking up the Jerusalem time series into eleven separate years (or outbreaks) and then randomly reordering their sequence of occurrence. The index *S*_{shuffle} was then calculated. This was repeated N = 100,000 times to obtain the distribution of *S*_{shuffle}. The index *S*_{shuffle} was found to be larger than the observed value *S*_{obs} in 99,985 of the N = 100,000 reshufflings. Therefore, Tel Aviv and Jerusalem are significantly synchronized (p<0.00014) in terms of the timing of their peaks. An analysis of the nine influenza A seasons (i.e., excluding from the analysis years dominated by influenza B) gave very similar results, with p<0.00024.

**Correlation Analysis:** Similar to the above test, the correlation *r*_{obs} between the Tel Aviv and Jerusalem time series of infectives was measured for both the observed and reshuffled time series. Again, the reshuffling involved breaking up the Jerusalem time series into eleven separate years (or outbreaks) and then randomly reordering their sequence of occurrence. We generated N = 100,000 randomized Jerusalem time series and calculated their correlations *r*_{shuffle} with the Tel Aviv time series. This procedure gives the probability distribution of *r*_{shuffle}. We found that the correlation between Tel Aviv and the observed Jerusalem time series was higher than the correlation calculated from the randomized Jerusalem time series in all 100,000 cases. This occurred both when all 11 seasons were analyzed and when influenza B seasons were excluded from the analysis.

The high correlation observed between the Tel Aviv and Jerusalem time series is nonrandom and provides strong support for the notion that the cities are synchronized over and above the background synchrony of the irrepressible annual winter outbreaks.
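The reshuffling logic of the phase analysis can be sketched as follows. The exact form of the index *S* is not recoverable from the text here, so this sketch assumes, purely for illustration, that *S* is the sum of the absolute peak-date differences |Δ_{i}| over the seasons. The function names and the peak dates (days since June 1) are hypothetical and invented for the example.

```python
import random


def phase_index(peaks_a, peaks_b):
    """Sum of absolute peak-date differences (ASSUMED form of the index S)."""
    return sum(abs(a - b) for a, b in zip(peaks_a, peaks_b))


def reshuffle_p_value(peaks_a, peaks_b, n_shuffles=10_000, seed=0):
    """Fraction of season reshufflings at least as synchronous as observed."""
    rng = random.Random(seed)
    s_obs = phase_index(peaks_a, peaks_b)
    shuffled = list(peaks_b)
    at_least_as_synchronous = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # randomly reorder the seasons of city B
        if phase_index(peaks_a, shuffled) <= s_obs:
            at_least_as_synchronous += 1
    return at_least_as_synchronous / n_shuffles


if __name__ == "__main__":
    # Invented peak days for eleven seasons in two cities (illustration only).
    city_a = [200, 230, 215, 250, 190, 240, 205, 225, 260, 210, 235]
    city_b = [203, 228, 217, 252, 188, 243, 204, 229, 258, 212, 233]
    print("S_obs =", phase_index(city_a, city_b))
    print("p =", reshuffle_p_value(city_a, city_b))
```

Shuffling whole seasons keeps each city's own distribution of peak dates intact, so a small p-value reflects year-by-year alignment rather than the shared winter seasonality. The correlation test follows the same pattern with the correlation coefficient in place of the index, counting reshufflings whose correlation is at least as high as the observed one.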

### Variability of the Spatial Synchrony

We found that different years have different characteristic strengths of spatial synchrony. As a reference frame, figure 2A plots a superimposition of all 11 outbreaks occurring over the 11 seasons in the aggregated national data, and gives an indication of the (relative lack of) temporal synchrony *between* years. This should be compared with figures 2B and C, which are plots of the time series for the seven largest cities for the years 1998–9 (figure 2B) and 2006–7 (figure 2C). The former is an example of a year with relatively low spatial synchrony while the latter is the year with the maximum synchrony among the Israeli cities. Comparing the figures it is easy to see that the synchrony within a year (figure 2B and C) is far stronger than the synchrony between years (figure 2A).

### Epidemic Synchronization between Israel and France

To gain further insights into the synchrony dynamics of influenza in Israel, we studied its relationship to a distant European country - France. Figure 1B displays the aggregated ILI time-series of both countries. Visually one observes in figure 1B that the level of synchrony between France and Israel is surprisingly high (with the small exceptions of the 2004–2005 season and the 2006–2007 season where the outbreak in Israel occurs a few weeks before France). The correlation coefficient between the time series was correspondingly high with r = 0.71. The synchrony was enhanced through the appearance of influenza B which was the dominant virus (i.e., the small peaks in 2002–3 and 2005–6) occurring simultaneously in the same years in both countries (see also [38]).

In order to quantify the temporal synchrony between the two countries we again performed the phase and correlation analyses for the timing of the outbreaks. Both tests found Israel and France to be significantly synchronized, in the timing of influenza outbreaks (p = 0.014 and 0.008, phase and correlation tests respectively). Results were still significant when B seasons were omitted (p = 0.037, p = 0.045 phase and correlation tests respectively).

In addition we tested whether the intensity of the outbreaks in the two countries was correlated. The values of both peak heights and of *R*_{e} were calculated and correlated for all the 9 outbreaks between 1999–2009. A significant correlation between Israel and France was found both in peak heights (r = 0.6; p = 0.05) and in the values of *R*_{e} (r = 0.75; p = 0.02). The statistical tests indicate that for Israel and France i) there are very minor differences in the timing of the epidemics (phase and correlation analyses) and ii) large/small outbreaks tend to occur in the same years in both countries. Nevertheless, and as expected, the correlation between the Israeli cities is higher than the correlation between the two countries.

## Discussion

The high quality of the Israeli ILI data has enabled us to explore the spatio-temporal dynamics of influenza in Israel over eleven years. The fact that the simple SIR model can be fitted successfully to data of seasonal influenza A in both the national (aggregated data) level as well as in the scale of single large cities is notable in light of the complex epidemiology of influenza [7]. This is consistent with previous work from the Soviet Union, United Kingdom, France, United States Australia, the Netherlands, Belgium, Portugal and Brazil [14], [20]–[23], [39], [40]. One of the main advantages of using the modeling approach proposed here is the ability to estimate separately the two components of *R _{e}* namely

*R*and

_{0}*S*, which limited many previous studies of influenza [24], [34]. We found that over the study period the value of

*R*_{0} increased in time (figure 4A). Interestingly, this increase was “balanced” by a decrease in *S*_{0}. One possible explanation for the observed increase in *R*_{0} is the evolutionary adaptation of the virus to become a more efficient infector [41]. For example, the new strain of pandemic influenza, the H1N1 swine flu virus, had relatively low transmission during the swine flu pandemic [16], [42]–. However, pandemic influenza is potentially far more dangerous because the immunity of the population to the pandemic virus is “expected to be” lower than to the circulating seasonal influenza strains (i.e., a high *S*_{0}). It is believed that in the future, the H1N1 virus (which is now considered a seasonal strain) will adapt to become a better infector, as has happened with previous pandemic and seasonal strains [45]. As mentioned above, in parallel to the increase in *R*_{0}, the population susceptibility is expected to decrease due to an increase in the population immunity as more of the population is exposed to the new virus strain. The exact mechanisms which lead to the observed negative correlation between *R*_{0} and *S*_{0} found here need to be further studied in order to better understand the dynamics of influenza. We note again that the estimate of *R*_{e} = *S*_{0}•*R*_{0} remains independent of reporting rate. Thus, even though there is strong under-reporting in our data, the estimations of *R*_{e} remain unaffected.

In Israel the value of *R*_{e} is strongly correlated to the magnitude of the peak height of influenza outbreaks (figure 4C). Knowledge of *R*_{e} is essential for understanding and controlling the spread of an infectious disease [24], [45]. For instance, the proportion of the population which needs to be vaccinated in order to reach herd-immunity is a function of *R*_{e}. Recent studies have shown that a reliable estimation of *R*_{e} for seasonal influenza can be obtained within a period of 4 weeks after the initiation of the disease [14]. Therefore, it is possible to estimate *R*_{e} in the first few weeks of the season and use this information to predict the upcoming epidemic.
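The dependence of the herd-immunity proportion on the reproductive number can be illustrated with the textbook threshold 1 − 1/*R*. The short Python sketch below assumes homogeneous mixing and a fully protective vaccine; the function name and the example values are hypothetical and are not taken from the analysis reported here.

```python
def vaccination_threshold(r_e: float) -> float:
    """Fraction of the currently susceptible population that would need to be
    protected for the reproductive number to fall to 1.

    Uses the textbook relation 1 - 1/R (an illustrative assumption:
    homogeneous mixing, fully protective vaccine).
    """
    if r_e <= 1.0:
        return 0.0  # transmission already declines without vaccination
    return 1.0 - 1.0 / r_e


if __name__ == "__main__":
    # Hypothetical values of R_e, chosen only to show the shape of the relation.
    for r_e in (1.1, 1.25, 1.5, 2.0):
        print(f"R_e = {r_e:.2f} -> threshold = {vaccination_threshold(r_e):.3f}")
```

Because the relation is steep near *R*_{e} = 1, small differences in the estimated *R*_{e} translate into noticeably different vaccination targets, which is one reason an early and reliable estimate is useful.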

*R*_{e} was also positively correlated to the strength of the spatial synchronization in Israel. We found that in cases where the disease is more “infectious” (i.e., higher *R*_{e}), the outbreaks in different cities tend to occur with smaller time lags (see figure 2B, C). It may be hypothesized that a higher *R*_{e} implies a more forceful infectivity, which increases the synchronization and reduces the variability in the timing and magnitude of the peaks between the different cities (see figure 2). An interesting future direction would be to test whether this hypothesis can explain the observed correlation between *R*_{e} and synchronization by simulating explicit spatial models.

Modeling studies have shown the sensitivity of the severity of influenza outbreaks to small demographic and environmental changes [46], which implies that small differences in environmental factors can cause large differences in the size of outbreaks between different locations. Nevertheless, we see a notable resemblance between the time series within a year across large geographical distances (e.g., Israel and France).

Using laboratory-confirmed influenza surveillance data, Finkelman et al (2007) reported large scale co-occurrence of influenza type A and B, and interhemispheric synchrony (i.e., the dominant strain within a season is the same for most of the hemisphere) [38]. Another example of high synchrony between Israel and France can be seen in Figure 1B, where in the outbreak of 2003–4 there is an early epidemic in both countries. Interestingly, the 2003–4 season was dominated by a new influenza strain (A/Fujian) which peaked early in many different countries in the northern hemisphere (probably because the population immunity to the new strain was low and therefore *S*_{0} was high, leading to an early outbreak) [50]. Another factor leading to the high spatio-temporal synchrony within Israel is the short travel distances between cities in a small country. Interestingly, spatio-temporal synchrony is high even between Israel and France, despite the much larger geographic scale. It is possible that here too, a single dominant influenza strain prevails in both countries in each season and air-travel between the countries aids temporal synchrony.

A second perplexing observation between the Israeli cities can be seen from examining the estimates of *R*_{e} for two very different cities such as Bnei Brak and Tel Aviv. Unlike Tel Aviv, Bnei Brak has large ultra-orthodox religious communities that are characterized by large families with many children. The two cities differ in many important demographic aspects (e.g., household size, age structure and population connectivity) that are thought to influence the dynamics of influenza [47]–[49]. It is expected that significant variation in demographic factors would lead to observed differences in the value of a key parameter such as *R*_{e}. During our study period there were no statistically significant differences in the value of *R*_{e} between different cities (within a year), and even countries. The fact that the size of *R*_{e} is rather “constant” between the Israeli cities is in line with the findings of Baroyan et al (1977) [50], Spicer (1979) [20] and Chowell (2010) [21] which were obtained on the much larger geographical scale of the USSR and Brazil. Spicer (1979) remarks: “The most striking and unexpected feature of the model is that the parameter on which the spread of an epidemic within a city depends is the same for every city in any one epidemic within the USSR” [20]. Nevertheless, in recent years extremely complicated models were developed to model pandemic influenza [47]–[49], [51]. The models include many biological, demographic and sociological complexities which are believed to be important in capturing the dynamics of pandemic influenza. For instance, Merler et al 2011 [51] had to incorporate information on intra-European mobility and the different socio-demographic structure of the different European countries in order to reproduce the observed spatial pattern of the West to East spread of the 2009 pandemic in Europe. In contrast, demographic structure did not appear to have an impact on the spread of influenza in Israel.

Additional data on the spatial spread of influenza combined with statistical analysis is required to better understand how different population demographics affect *R*_{e} and the propagation of the disease within different communities.

## Methods

### Data

Our dataset consisted of all Influenza-Like Illness (ILI) cases in Israel diagnosed daily by Maccabi Health Services doctors between January 1^{st} 1998 and May 31^{st} 2009. Diagnosis codes included in this database are ICD9 code 487.1 (influenza) and internal Maccabi codes for influenza, influenza-like disease and swine influenza. The last year of data, in which the swine flu pandemic occurred, was excluded. ICD9 codes were used exclusively in 1998–2002, when there was a transition to internal diagnosis codes. The ILI data were corrected for repeat visits. The criterion chosen to filter out repeat visits from the data was recommended by the Israeli Center for Disease Control, and defines a visit as a repeat visit if it comes within 28 days of a previous visit with ILI symptoms. The data exhibit a strong weekly cycle due to the absence of weekend reporting. Hence, the data were smoothed using a 7-day moving average kernel [52] (red line in Figure 1A). The dataset includes 7 seasons in which the dominant strain was influenza A H3N2, two seasons in which the dominant strain was influenza B, one season in which the dominant strain was influenza A H1N1, and one season in which no strain was dominant. Historical data of dominant influenza strains in Israel is available at the World Health Organization's FluNet website at http://apps.who.int/globalatlas/dataQuery/default.asp.

Maccabi is the second-largest Health Maintenance Organization (HMO) in Israel and insures about 23% of Israel's population. During the period analyzed the number of Maccabi members varied between 1.37 and 1.86 million people. Several works based on this dataset have already been published [30], [53], [54].

The French ILI dataset is taken from the Sentinel network, a network of ∼1200 General Practitioners (GPs) in France who, since 1984, regularly collect data about diagnoses of 12 diseases and report them via the internet. ILI is defined in the Sentinel network as a sudden fever above 39°C, myalgia and cough/runny nose. The data received from the GPs are then processed to estimate the number of ILI cases per 100,000 residents in each region, by using population data and the fraction of GPs taking part in the surveillance out of the total number of GPs. Since the frequency of reporting is irregular and is left for each doctor to decide, the data presented on the Sentinel network website are weekly aggregate incidences [55].
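The two preprocessing steps applied to the Israeli data (removal of repeat visits and 7-day smoothing) can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names and the record layout `(patient_id, date)` are hypothetical, "previous visit" is assumed to mean the patient's most recent ILI visit whether or not it was itself kept, and the moving average is assumed to be centered with a window truncated at the ends of the series.

```python
from datetime import date


def drop_repeat_visits(visits, window_days=28):
    """Keep only visits that do not fall within `window_days` of the same
    patient's previous ILI visit (assumption: the previous visit is the most
    recent one, whether or not it was itself kept)."""
    last_seen = {}
    kept = []
    for patient_id, day in sorted(visits, key=lambda v: (v[1], v[0])):
        previous = last_seen.get(patient_id)
        if previous is None or (day - previous).days > window_days:
            kept.append((patient_id, day))
        last_seen[patient_id] = day
    return kept


def moving_average(counts, window=7):
    """Centered moving average; the window is truncated at the series edges."""
    half = window // 2
    smoothed = []
    for t in range(len(counts)):
        lo = max(0, t - half)
        hi = min(len(counts), t + half + 1)
        smoothed.append(sum(counts[lo:hi]) / (hi - lo))
    return smoothed


if __name__ == "__main__":
    # Hypothetical records: patient "a" returns after 9 days (a repeat visit).
    visits = [("a", date(2005, 1, 1)), ("a", date(2005, 1, 10)),
              ("b", date(2005, 1, 5)), ("a", date(2005, 3, 1))]
    print(drop_repeat_visits(visits))

    # Hypothetical daily counts with no weekend reporting.
    daily = [10, 10, 10, 10, 10, 0, 0] * 3
    print([round(x, 2) for x in moving_average(daily)])
```

A window of exactly seven days always spans one of each weekday, which is why it removes the weekly reporting cycle without otherwise distorting the seasonal signal.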

### Model

We used an SIR discrete-time age-of-infection model as described in [56]. The total population is denoted by *N*. The number of susceptibles at the end of day *t* is denoted by *S(t)* while the number of people who become infected on day *t* is denoted by *i(t)*. It is important to emphasize that *i(t)* here counts only the newly-infected individuals on day *t*. Note the key relationship:

$$S(t) = S(t-1) - i(t) \tag{4}$$

It is assumed that each individual has, on average, *β* contacts with other random individuals per day. A person who becomes infective retains a certain (and non-constant) degree of infectivity for *d* days. The number of days since a person's infection is termed its age of infection. Therefore, the number of infectives whose age of infection is *τ (1≤τ≤d)* on day *t* is *i(t−τ)*. When a susceptible meets an infective person whose age-of-infection is *τ (1≤τ≤d)*, the susceptible becomes infective with probability *P*_{τ}. The vector *P* = (*P*_{1}; …; *P*_{d}) thus defines the infectivity profile, and is a key parameter of the model [16]. The values for the vector *P* were obtained from the comprehensive review paper about influenza viral shedding by Carrat et al 2008 [57]. The values are

*P* = (0.073, 0.181, 0.222, 0.185, 0.137, 0.09, 0.056, 0.032, 0.016, 0.008).
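Two properties of this infectivity profile can be checked directly: its entries sum to one, and its mean age of infection at transmission is a little under four days. The snippet below is only an arithmetic check on the listed values; the variable names are illustrative.

```python
# Infectivity profile P_1..P_10 as listed in the text.
P = [0.073, 0.181, 0.222, 0.185, 0.137, 0.09, 0.056, 0.032, 0.016, 0.008]

total = sum(P)
# Mean age of infection (in days) at which transmission occurs, weighting day tau by P_tau.
mean_age = sum(tau * p for tau, p in enumerate(P, start=1)) / total

print(f"sum of P = {total:.3f}")
print(f"mean age of infection at transmission = {mean_age:.2f} days")
```

Since the profile sums to one, an infective with *β* contacts per day would, under the model as described, cause on average about *β* infections over its infectious period in a fully susceptible population.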

The probability that any single susceptible becomes infected during day *t* is given by equation (5). In a deterministic variation of the model, the daily number of infectives, for large *N*, is given by equation (6). We use the deterministic model to simulate the time-series of figure 3.
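A deterministic age-of-infection recursion of this kind can be sketched in a few lines of Python. The sketch assumes that the daily infection probability of a susceptible takes the form 1 − exp(−(*β*/*N*) Σ *P*_{τ} *i(t−τ)*), that the expected number of new cases is *S(t−1)* times this probability, and that *S(t)* = *S(t−1)* − *i(t)*. This functional form, the function name and the parameter values are assumptions made for illustration and may differ from equations (5) and (6).

```python
import math

# Infectivity profile P_1..P_10 as listed in the text.
P = [0.073, 0.181, 0.222, 0.185, 0.137, 0.09, 0.056, 0.032, 0.016, 0.008]


def simulate_deterministic(beta, s0, n, i0, days):
    """Return (daily incidence for days 1..days, susceptibles left at the end).

    i0 infectives are introduced on day 0 and are not subtracted from s0.
    """
    d = len(P)
    history = [0.0] * (d - 1) + [float(i0)]  # i(t) for t = -(d-1), ..., 0
    susceptibles = float(s0)
    incidence = []
    for _ in range(days):
        # history[-tau] holds i(t - tau), the infectives of age tau
        pressure = sum(P[tau - 1] * history[-tau] for tau in range(1, d + 1))
        p_infect = -math.expm1(-beta * pressure / n)  # assumed form, 1 - exp(-x)
        new_cases = susceptibles * p_infect
        susceptibles -= new_cases
        history.append(new_cases)
        incidence.append(new_cases)
    return incidence, susceptibles


if __name__ == "__main__":
    # Hypothetical parameters giving an effective reproductive number of 1.35.
    incidence, remaining = simulate_deterministic(
        beta=4.5, s0=300_000, n=1_000_000, i0=10, days=400)
    peak_day = incidence.index(max(incidence)) + 1
    print(f"peak on day {peak_day}, attack rate among susceptibles "
          f"{sum(incidence) / 300_000:.2f}")
```

With these assumptions the epidemic grows only when *β*·*S*_{0}/*N* exceeds one, so the same *β* can produce either a full outbreak or a rapid fade-out depending on the initial number of susceptibles.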

The above model also has a stochastic formulation (Katriel et al. 2011) whose log-likelihood, equation (7), can be derived (Katriel, manuscript). It is possible to estimate the parameters *S*_{0} and *R*_{0} by numerically maximizing this log-likelihood expression. The maximization should be carried out for *β* > 0 and over integers *S*_{0} bounded below by the total number of cases, as the number of susceptibles at the beginning cannot be smaller than that total.
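The estimation step can be illustrated with a stand-in likelihood. The sketch below assumes a chain-binomial formulation, in which the new cases on day *t* are Binomial(*S(t−1)*, *p(t)*) with *p(t)* = 1 − exp(−(*β*/*N*) Σ *P*_{τ} *i(t−τ)*), and maximizes it by a plain grid search. Both the likelihood and the search strategy are assumptions made for illustration and should not be read as equation (7) or as the optimizer used in the study; all function names and parameter values are hypothetical.

```python
import math

P = [0.073, 0.181, 0.222, 0.185, 0.137, 0.09, 0.056, 0.032, 0.016, 0.008]


def infection_probability(past_cases, beta, n):
    """Assumed daily infection probability of one susceptible, 1 - exp(-x)."""
    pressure = sum(p * past_cases[-tau]
                   for tau, p in enumerate(P, start=1) if tau <= len(past_cases))
    return -math.expm1(-beta * pressure / n)


def log_binom_pmf(k, m, p):
    """Log of the Binomial(m, p) probability of k successes."""
    if k < 0 or k > m:
        return -math.inf
    if p <= 0.0:
        return 0.0 if k == 0 else -math.inf
    if p >= 1.0:
        return 0.0 if k == m else -math.inf
    return (math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
            + k * math.log(p) + (m - k) * math.log1p(-p))


def log_likelihood(cases, beta, s0, n):
    """Chain-binomial log-likelihood; cases[0] is treated as a fixed seed."""
    susceptibles, total = s0, 0.0
    for t in range(1, len(cases)):
        p = infection_probability(cases[:t], beta, n)
        total += log_binom_pmf(cases[t], susceptibles, p)
        if total == -math.inf:
            break
        susceptibles -= cases[t]
    return total


def grid_search(cases, n, betas, s0_values):
    """Evaluate the log-likelihood on a grid; return the best (beta, S0) and the grid."""
    floor = sum(cases[1:])  # S0 cannot be smaller than the total number of cases
    surface = {(b, s): log_likelihood(cases, b, s, n)
               for b in betas for s in s0_values if s >= floor}
    return max(surface, key=surface.get), surface


def expected_epidemic(beta, s0, n, seed, days):
    """Synthetic data: the rounded expected epidemic under the same assumed model."""
    cases, susceptibles = [seed], s0
    for _ in range(days):
        new = round(susceptibles * infection_probability(cases, beta, n))
        cases.append(new)
        susceptibles -= new
    return cases


if __name__ == "__main__":
    data = expected_epidemic(beta=4.5, s0=300_000, n=1_000_000, seed=50, days=200)
    best, _ = grid_search(data, 1_000_000,
                          betas=[3.5, 4.0, 4.5, 5.0, 5.5],
                          s0_values=[200_000, 250_000, 300_000, 350_000, 400_000])
    print("grid maximum at (beta, S0) =", best)
```

In practice a finer grid or a numerical optimizer would replace the coarse grid used here; the grid form is kept because it also yields the likelihood surface needed for contour plots.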

It is important to demonstrate the statistical identifiability of the parameters *S*_{0} and *R*_{0} from the data, using the likelihood function (equation 7) [58]. To achieve this, we generated contour plots of the function for each season. Figure 5 displays this plot for season 9. As can be observed, the likelihood attains a unique maximum at a point which forms our maximum likelihood estimates, and the region in which the likelihood is close to the maximal values is small enough to provide rather narrow 95% confidence intervals for the parameters. The corresponding plots for other seasons are qualitatively similar.

The colored region includes the sets of parameters giving the maximum likelihood and log-likelihoods which are up to 3 units below the maximum. The upper and lower limits of this region were used as a 95% confidence interval for the *S*_{0} and *R*_{0} values [58].
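One reading of the 3-unit cutoff is the likelihood-ratio argument for a joint region of two parameters: twice the drop in log-likelihood is compared with a chi-square distribution with 2 degrees of freedom, whose 95% quantile is about 5.99, giving a drop of about 3.0. The sketch below computes that cutoff and extracts interval limits from a likelihood surface stored on a grid. The toy quadratic surface and the function names are hypothetical and are used only to make the example self-contained.

```python
import math


def joint_cutoff(level=0.95):
    """Log-likelihood drop bounding a joint region for two parameters.

    The chi-square survival function with 2 degrees of freedom is exp(-x/2),
    so half of its `level` quantile equals -log(1 - level).
    """
    return -math.log(1.0 - level)


def likelihood_region_bounds(surface, drop=3.0):
    """surface maps (r0, s0) -> log-likelihood.

    Returns ((r0_low, r0_high), (s0_low, s0_high)) over all grid points whose
    log-likelihood is within `drop` units of the maximum.
    """
    best = max(surface.values())
    region = [params for params, ll in surface.items() if ll >= best - drop]
    r0_values = [r0 for r0, _ in region]
    s0_values = [s0 for _, s0 in region]
    return (min(r0_values), max(r0_values)), (min(s0_values), max(s0_values))


if __name__ == "__main__":
    # Toy surface (hypothetical): a quadratic log-likelihood peaked at (4.5, 300000).
    surface = {}
    for k in range(21):
        for j in range(21):
            r0 = 4.0 + 0.05 * k
            s0 = 250_000 + 5_000 * j
            surface[(r0, s0)] = (-0.5 * ((r0 - 4.5) / 0.1) ** 2
                                 - 0.5 * ((s0 - 300_000) / 10_000) ** 2)
    print(f"cutoff for a joint 95% region: {joint_cutoff():.3f}")
    print(likelihood_region_bounds(surface))
```

Taking the extreme values of each parameter over the whole region, as done here, gives limits that are somewhat wider than one-parameter profile intervals, which would use a drop of about 1.92 instead.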

In another approach, taken to obtain bootstrap confidence intervals for the effective reproduction number (*R*_{e}), the stochastic version of the model (equations 4–6) was simulated 10,000 times using the exact same parameters (i.e., *β* and *S*_{0} were estimated from the real data using equation 7). For each of the simulated epidemics *R*_{e} was re-estimated and the 250 lowest and 250 highest estimates were removed to give the 95% bootstrap confidence intervals.
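The trimming step of this bootstrap can be written compactly. In the sketch below the list of re-estimated values is replaced by hypothetical random numbers, since producing real replicates would require simulating and refitting the stochastic model; the function name is illustrative.

```python
import random


def bootstrap_interval(estimates, level=0.95):
    """Percentile interval obtained by discarding equal numbers of the lowest
    and highest estimates (250 from each tail for 10,000 replicates at 95%)."""
    ordered = sorted(estimates)
    trim = round(len(ordered) * (1.0 - level) / 2.0)
    kept = ordered[trim:len(ordered) - trim]
    return kept[0], kept[-1]


if __name__ == "__main__":
    random.seed(1)
    # Hypothetical stand-in for 10,000 re-estimated R_e values; in the procedure
    # described above each value would come from refitting a simulated epidemic.
    replicates = [random.gauss(1.3, 0.02) for _ in range(10_000)]
    low, high = bootstrap_interval(replicates)
    print(f"95% bootstrap interval: ({low:.3f}, {high:.3f})")
```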

The values of *R*_{e} were also obtained using the classic method of measuring the rate of exponential growth at the initiation of the outbreak, as in [32]. The estimates from both methods were highly correlated (R = 0.91, p = 0.0007) (see table 1).
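A growth-rate estimate of this type can be sketched as follows. The example fits a straight line to the logarithm of early incidence to obtain the growth rate *r*, and converts it to a reproductive number with the relation *R* = 1 / Σ *P*_{τ} exp(−*rτ*), which holds when *P* is read as the discrete generation-interval distribution. This conversion is a commonly used one and is an assumption here; the formula applied in [32] may differ. Function names and the synthetic data are hypothetical.

```python
import math

# Infectivity profile from the text, read here as a generation-interval distribution.
P = [0.073, 0.181, 0.222, 0.185, 0.137, 0.09, 0.056, 0.032, 0.016, 0.008]


def growth_rate(cases):
    """Least-squares slope of log(cases) against the day index (cases must be > 0)."""
    n = len(cases)
    xs = list(range(n))
    ys = [math.log(c) for c in cases]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def reproduction_number(r, profile=P):
    """R = 1 / sum_tau w_tau * exp(-r * tau), with w the normalized profile."""
    total = sum(profile)
    return 1.0 / sum((p / total) * math.exp(-r * tau)
                     for tau, p in enumerate(profile, start=1))


if __name__ == "__main__":
    # Hypothetical early-phase incidence growing at 8% per day.
    early = [20 * math.exp(0.08 * t) for t in range(21)]
    r = growth_rate(early)
    print(f"r = {r:.3f} per day, R = {reproduction_number(r):.2f}")
```

Only the initial, approximately exponential part of the outbreak should be passed to `growth_rate`; including days near the peak would bias *r* downwards.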

## Acknowledgments

We would like to thank the ICDC (Israel Center for Disease Control) and its director Prof. Tamar Shohat, Maccabi Health Services and Dr. Varda Shalev. The helpful comments of Maciej Boni and two anonymous referees are acknowledged with gratitude.

## Author Contributions

Analyzed the data: AH OB GK RY UR LS. Contributed reagents/materials/analysis tools: AH OB GK RY LS. Wrote the paper: AH OB GK UR LS.

## References

- 1. Thompson WW, Shay DK, Weintraub E, Brammer L, Cox N, et al. (2003) Mortality associated with influenza and respiratory syncytial virus in the United States. JAMA: the journal of the American Medical Association 289: 179–186.
- 2. Truscott J, Fraser C, Hinsley W, Cauchemez S, Donnelly C, et al. (2009) Quantifying the transmissibility of human influenza and its seasonal variation in temperate regions. PLoS currents 1: RRN1125.
- 3. Hoen AG, Buckeridge DL, Charland KML, Mandl KD, Quach C, et al. (2011) Effect of expanded US recommendations for seasonal influenza vaccination: comparison of two pediatric emergency departments in the United States and Canada. CMAJ: Canadian Medical Association journal = journal de l'Association medicale canadienne 183: E1025–32 doi:10.1503/cmaj.110241.
- 4. Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, et al. (2009) Detecting influenza epidemics using search engine query data. Nature 457: 1012–1014 doi:10.1038/nature07634.
- 5. Lambert LC, Fauci AS (2010) Influenza vaccines for the future. The New England journal of medicine 363: 2036–2044 doi:10.1056/NEJMra1002842.
- 6. Weycker D, Edelsberg J, Halloran ME, Longini IM, Nizam A, et al. (2005) Population-wide benefits of routine vaccination of children against influenza. Vaccine 23: 1284–1293 doi:10.1016/j.vaccine.2004.08.044.
- 7. Earn DJD, Dushoff J, Levin SA (2002) Ecology and evolution of the flu. Trends in Ecology & Evolution 17: 334–340 doi:10.1016/S0169-5347(02)02502-8.
- 8. Truscott J, Fraser C, Cauchemez S, Meeyai A, Hinsley W, et al. (2011) Essential epidemiological mechanisms underpinning the transmission dynamics of seasonal influenza. Journal of The Royal Society Interface doi:10.1098/rsif.2011.0309.
- 9. Gröndahl B, Puppe W, Hoppe A, Kühne I, Weigl JA, et al. (1999) Rapid identification of nine microorganisms causing acute respiratory tract infections by single-tube multiplex reverse transcription-PCR: feasibility study. Journal of clinical microbiology 37: 1–7.
- 10. Cliff AD, Haggett P (1993) Statistical modelling of measles and influenza outbreaks. Statistical methods in medical research 2: 43–73.
- 11. Simonsen L, Clarke MJ, Williamson GD, Stroup DF, Arden NH, et al. (1997) The impact of influenza epidemics on mortality: introducing a severity index. American journal of public health 87: 1944–1950.
- 12. Viboud C, Bjørnstad ON, Smith DL, Simonsen L, Miller M, et al. (2006) Synchrony, waves, and spatial hierarchies in the spread of influenza. Science (New York, NY) 312: 447–451 doi:10.1126/science.1125237.
- 13. Viboud C, Boëlle P-Y, Pakdaman K, Carrat F, Valleron A-J, et al. (2004) Influenza epidemics in the United States, France, and Australia, 1972–1997. Emerging infectious diseases 10: 32–39.
- 14. Chowell G, Miller M, Viboud C (2008) Seasonal influenza in the United States, France, and Australia: transmission and prospects for control. Epidemiology and infection 136: 852–864 doi:10.1017/S0950268807009144.
- 15. Bonabeau E, Toubiana L, Flahault A (1998) The geographical spread of influenza. Proceedings Biological sciences/The Royal Society 265: 2421–2425 doi:10.1098/rspb.1998.0593.
- 16. Katriel G, Yaari R, Huppert A, Roll U, Stone L (2011) Modelling the initial phase of an epidemic using incidence and infection network data: 2009 H1N1 pandemic in Israel as a case study. Journal of the Royal Society, Interface/the Royal Society 8: 856–867 doi:10.1098/rsif.2010.0515.
- 17. Andreasen V, Lin J, Levin SA (1997) The dynamics of cocirculating influenza strains conferring partial cross-immunity. Journal of mathematical biology 35: 825–842.
- 18. Lin J, Andreasen V, Levin SA (1999) Dynamics of influenza A drift: the linear three-strain model. Mathematical biosciences 162: 33–51.
- 19. Baroyan OV, Rvachev LA, Basilevsky UV, Ermakov VV, Frank KD, et al. (1971) Computer Modelling of Influenza Epidemics for the Whole Country (USSR). Advances in Applied Probability 3: 224–226.
- 20. Spicer CC, Lawrence CJ (1979) The mathematical modelling of influenza epidemics. British medical bulletin 35: 23.
- 21. Chowell G, Viboud C, Simonsen L, Miller M, Alonso WJ (2010) The reproduction number of seasonal influenza epidemics in Brazil, 1996–2006. Proceedings Biological sciences/The Royal Society 277: 1857–1866 doi:10.1098/rspb.2009.1897.
- 22. Cintrón-Arias A, Castillo-Chávez C, Bettencourt L, Lloyd A, Banks HT (2009) The estimation of the effective reproductive number from disease outbreak data. Mathematical Biosciences and Engineering 6: 261–282 doi:10.3934/mbe.2009.6.261.
- 23. van Noort SP, Aguas R, Ballesteros S, Gomes MGM (2011) The role of weather on the relation between influenza and influenza-like illness. Journal of theoretical biology 298: 131–137 doi:10.1016/j.jtbi.2011.12.020.
- 24. Anderson RM, May RM (1992) Infectious Diseases of Humans: Dynamics and Control. Oxford University Press, USA. 757 p.
- 25. Katriel G, Stone L (2010) Pandemic Dynamics and the Breakdown of Herd Immunity. PLoS ONE 5: 4 doi:10.1371/journal.pone.0009565.
- 26. Dietz K (1993) The estimation of the basic reproduction number for infectious diseases. Statistical Methods in Medical Research 2: 23–41.
- 27. Heesterbeek JAP (2002) A brief history of R0 and a recipe for its calculation. Acta biotheoretica 50: 189–204.
- 28. Chowell G, Nishiura H (2008) Quantifying the transmission potential of pandemic influenza. Physics of Life Reviews 5: 50–77 doi:10.1016/j.plrev.2007.12.001.
- 29. Ferrari MJ, Bjørnstad ON, Dobson AP (2005) Estimation and inference of R0 of an infectious pathogen by a removal method. Mathematical biosciences 198: 14–26 doi:10.1016/j.mbs.2005.08.002.
- 30. Barnea O, Yaari R, Katriel G, Stone L (2011) Modelling seasonal influenza in Israel. Mathematical biosciences and engineering: MBE 8: 561–573.
- 31. Coelho FC, Codeço CT, Gomes MGM (2011) A bayesian framework for parameter estimation in dynamical models. PloS one 6: e19616 doi:10.1371/journal.pone.0019616.
- 32. Favier C, Degallier N, Rosa-Freitas MG, Boulanger JP, Costa Lima JR, et al. (2006) Early determination of the reproductive number for vector-borne diseases: the case of dengue in Brazil. Tropical medicine & international health: TM & IH 11: 332–340 doi:10.1111/j.1365-3156.2006.01560.x.
- 33. Cox NJ, Subbarao K (2000) Global epidemiology of influenza: past and present. Annual review of medicine 51: 407–421 doi:10.1146/annurev.med.51.1.407.
- 34. Glezen WP, Couch RB, MacLean RA, Payne A, Baird JN, et al. (1978) Interpandemic influenza in the Houston area, 1974–76. The New England journal of medicine 298: 587–592 doi:10.1056/NEJM197803162981103.
- 35. Finkenstädt BF, Morton A, Rand D (2005) Modelling antigenic drift in weekly flu incidence. Statistics in medicine 24: 3447–3461 doi:10.1002/sim.2196.
- 36. Blasius B, Huppert A, Stone L (1999) Complex dynamics and phase synchronization in spatially extended ecological systems. Nature 399: 354–359 doi:10.1038/20676.
- 37. Stone L (1992) Coloured Noise or Low-Dimensional Chaos? Proceedings of the Royal Society B: Biological Sciences 250: 77–81.
- 38. Finkelman BS, Viboud C, Koelle K, Ferrari MJ, Bharti N, et al. (2007) Global patterns in seasonal activity of influenza A/H3N2, A/H1N1, and B from 1997 to 2005: viral coexistence and latitudinal gradients. PloS one 2: e1296 doi:10.1371/journal.pone.0001296.
- 39. Bailey NTJ (1975) The mathematical theory of infectious diseases and its applications. 2nd edition. Griffin. 413 p.
- 40. Spicer CC, Lawrence CJ (1984) Epidemic influenza in Greater London. Journal of Hygiene 93: 105–112.
- 41. Ferguson NM, Galvani AP, Bush RM (2003) Ecological and immunological determinants of influenza evolution. Nature 422 doi:10.1038/nature01491.1.
- 42. Roll U, Yaari R, Katriel G, Barnea O, Stone L, et al. (2011) Onset of a pandemic: characterizing the initial phase of the swine flu (H1N1) epidemic in Israel. BMC infectious diseases 11: 92 doi:10.1186/1471-2334-11-92.
- 43. Cauchemez S, Bhattarai A, Marchbanks TL, Fagan RP, Ostroff S, et al. (2011) Role of social networks in shaping disease transmission during a community outbreak of 2009 H1N1 pandemic influenza. Proceedings of the National Academy of Sciences of the United States of America 108: 2825–2830 doi:10.1073/pnas.1008895108.
- 44. Nishiura H, Chowell G, Safan M, Castillo-Chavez C (2010) Pros and cons of estimating the reproduction number from early epidemic growth rate of influenza A (H1N1) 2009. Theoretical biology & medical modelling 7: 1–13 doi:10.1186/1742-4682-7-1.
- 45. Fraser C, Riley S, Anderson RM, Ferguson NM (2004) Factors that make an infectious disease outbreak controllable. Proceedings of the National Academy of Sciences of the United States of America 101: 6146–6151 doi:10.1073/pnas.0307506101.
- 46. Dushoff J, Plotkin JB, Levin SA, Earn DJD (2004) Dynamical resonance can account for seasonality of influenza epidemics. Proceedings of the National Academy of Sciences of the United States of America 101: 16915–16916 doi:10.1073/pnas.0407293101.
- 47. Ferguson NM, Cummings DT, Fraser C, Cajka JC, Cooley PC, et al. (2006) Strategies for mitigating an influenza pandemic. Nature 442: 448–452 doi:10.1038/nature04795.
- 48. Germann TC, Kadau K, Longini IM, Macken C (2006) Mitigation strategies for pandemic influenza in the United States. Proceedings of the National Academy of Sciences of the United States of America 103: 5935–5940 doi:10.1073/pnas.0601266103.
- 49. Longini IM, Nizam A, Xu S, Ungchusak K, Hanshaoworakul W, et al. (2005) Containing pandemic influenza at the source. Science 309: 1083–1087 doi:10.1126/science.1115717.
- 50. Baroyan OV, Rvachev LA, Ivannikov YG (1977) Modelling and prediction of influenza epidemics in the USSR. (In Russian).
- 51. Merler S, Ajelli M, Pugliese A, Ferguson NM (2011) Determinants of the Spatiotemporal Dynamics of the 2009 H1N1 Pandemic in Europe: Implications for Real-Time Modelling. PLoS Computational Biology 7: e1002205 doi:10.1371/journal.pcbi.1002205.
- 52. Hastie T, Tibshirani R, Friedman JH (2009) The Elements of Statistical Learning. Springer. 745 p.
- 53. Edlund S, Bromberg M, Chodick G, Douglas J, Ford D, et al. (2011) A spatiotemporal model for influenza. electronic Journal of Health Informatics 6: 1–7.
- 54. Heymann AD, Hoch I, Valinsky L, Kokia E, Steinberg DM (2009) School closure may be effective in reducing transmission of respiratory viruses in the community. Epidemiology and infection 137: 1369–1376 doi:10.1017/S0950268809002556.
- 55. Cauchemez S, Valleron A-J, Boëlle P-Y, Flahault A, Ferguson NM (2008) Estimating the impact of school closure on influenza transmission from Sentinel data. Nature 452: 750–754 doi:10.1038/nature06732.
- 56. Brauer F (2005) Age of infection in epidemiology models. Electronic Journal of Differential Equations (Conf) 12: 29–37.
- 57. Carrat F, Vergu E, Ferguson NM, Lemaitre M, Cauchemez S, et al. (2008) Time lines of infection and disease in human influenza: a review of volunteer challenge studies. American journal of epidemiology 167: 775–785 doi:10.1093/aje/kwm375.
- 58. Bolker BM (2008) Ecological Models and Data in R. Princeton University Press, Princeton, New Jersey, USA. 408 p.