Modeling and Statistical Analysis of the Spatio-Temporal Patterns of Seasonal Influenza in Israel

Background Seasonal influenza outbreaks are a serious burden for public health worldwide and cause morbidity to millions of people each year. In the temperate zone influenza is predominantly seasonal, with epidemics occurring every winter, but the severity of the outbreaks vary substantially between years. In this study we used a highly detailed database, which gave us both temporal and spatial information of influenza dynamics in Israel in the years 1998–2009. We use a discrete-time stochastic epidemic SIR model to find estimates and credible confidence intervals of key epidemiological parameters. Findings Despite the biological complexity of the disease we found that a simple SIR-type model can be fitted successfully to the seasonal influenza data. This was true at both the national levels and at the scale of single cities.The effective reproductive number Re varies between the different years both nationally and among Israeli cities. However, we did not find differences in Re between different Israeli cities within a year. R e was positively correlated to the strength of the spatial synchronization in Israel. For those years in which the disease was more “infectious”, then outbreaks in different cities tended to occur with smaller time lags. Our spatial analysis demonstrates that both the timing and the strength of the outbreak within a year are highly synchronized between the Israeli cities. We extend the spatial analysis to demonstrate the existence of high synchrony between Israeli and French influenza outbreaks. Conclusions The data analysis combined with mathematical modeling provided a better understanding of the spatio-temporal and synchronization dynamics of influenza in Israel and between Israel and France. Altogether, we show that despite major differences in demography and weather conditions intra-annual influenza epidemics are tightly synchronized in both their timing and magnitude, while they may vary greatly between years. The predominance of a similar main strain of influenza, combined with population mixing serve to enhance local and global influenza synchronization within an influenza season.


Introduction
Seasonal influenza outbreaks are a serious burden for public health worldwide. Seasonal influenza is mainly a self-limiting disease, and in most patients results in only moderate illness without need for medical treatment. Nevertheless, it is estimated to cause morbidity to millions of people each year. In addition, influenza poses a major risk to chronic patients of all ages especially the elderly, to whom it causes more severe morbidity and is associated with a higher death rate [1][2][3]. The global mortality from the disease is estimated at 250,000 to 500,000 cases annually [4,5]. Furthermore, the economic burden of seasonal influenza is estimated to be 11 billion US dollars a year in the US alone [6]. This includes morbidity, mortality, hospitalizations and absenteeism from work and school. As early as 1952 the WHO established the Global Influenza Surveillance Network. However there are major problems regarding the reliability of influenza data [7,8]. The major difficulty is that the disease has no clear-cut clinical signs and can be easily confused with other respiratory illnesses having similar symptoms [9]. In addition, influenza illness is often slight or moderate and a significant number of infected individuals are asymptomatic, so that many individuals with the disease do not seek any medical care. In order to better estimate the burden of influenza, different sources of influenza data have been used to study the disease, such as Influenza-Like Illness (ILI) diagnoses, virus isolations, death records from pneumonia and influenza, physician visits and web search queries [2,4,[10][11][12].
Influenza is predominantly seasonal, with epidemics occurring every winter, but the severity of the epidemic and the exact timing of the outbreak vary substantially between years (see figure 1). Influenza has also been studied with respect to its geographic spread at different scales. Viboud et al. (2006) have studied the spread of influenza across the United States and found that there is higher synchrony between more populous states [12]. In other studies Viboud et al. (2004) and  analyzed the synchrony on a global scale comparing outbreaks in the US, France and Australia [13,14] and found high synchrony between the US and France and no synchrony between the Northern and Southern hemispheres (even after adjusting for the two hemispheres being out of phase). In the smaller spatial scale of France, Bonabeau et al. (1998) demonstrated how the disease rapidly spreads across the country, and concluded that when modeling the initial spread of influenza it is sufficient to assume global homogeneous mixing to capture the dynamics. Geographic heterogeneities and density dependence affect the dynamics of the disease around local outbreak peaks [15].
Israel provides a unique opportunity to study the spatial spread of influenza at a small scale since the country is only approximately 22,000 km 2 . We use a highly-detailed influenza database, both temporally and spatially (see data section) collected in Israel for over 11 years. The data composed of ILI diagnoses in a subset of some 23.8% of the Israeli population. We initially examine the ILI data on a national level (i.e. aggregated data) to understand the year to year variability of influenza epidemics both in their timing and magnitude. In the second part of the paper we examine spatial aspects of influenza in Israel with emphasis on the spatial synchrony. The following patterns are of particular interest: i) Differences in ILI dynamics and key epidemiological parameters between the major cities in Israel (in time and space) ii) Synchrony between different Israeli cities within a year iii) Comparing the ''local'' Israeli synchrony with the intercontinental synchrony of influenza outbreaks tested on the time series of Israel and France. Figure 1 displays the time series of influenza outbreaks (ILI cases) over eleven years when aggregated spatially over all of Israel. There is considerable variation between years in both the peak height of the outbreak, the total attack rate, and the times at which the epidemics reach their maxima during the winter months (December to March (figure 2A). The high quality of the entire dataset both in terms of temporal and spatial resolution (see data section), is a key feature that motivates the following modeling analysis. The data were analyzed and modeled both: i) at the national level, using the total number of ILI cases aggregated over the entire country (figure 1), and ii) in the seven largest cities of the database (in terms of population size). These cities provide a picture of the spatial variation of influenza across the country.

Modeling Approach
A discrete time age-of-infection SIR epidemic model, as formulated in Katriel et al, (2011) [16], is used to fit the Israeli ILI data and estimate epidemiological parameters in order to gain a better understanding of the influenza dynamics (see Methods for model description). The SIR framework assumes that individuals within a population can be divided into three categories or compartments: Susceptibles, Infected or Recovered. Disease transmission within the population is modeled by tracking the changes in numbers of individuals within these compartments. Although SIR-type models have become almost the gold standard for modeling bacterial and viral respiratory infectious diseases, it is to some degree an oversimplified model for influenza due to the virus's ability to rapidly mutate giving rise to its characteristic antigenic drift [17,18]. Earn et al (2002) argue that ''it seems impossible to avoid a much greater degree of model complexity [than the SIR]. The primary obstacle to simple compartmental modeling of flu is antigenic drift.'' [7] In order to bypass the difficulties of modeling continuous evolutionary changes in influenza we have chosen to model individual years separately as single influenza outbreaks, a practice that may be found in the important studies of Baroyan et al (1971) [19] in the USSR, Spicer (1979) [20] in the UK and recently by  for the US, France, Australia and Brazil [14,21], Cintron-Arias et al (2009) who modeled US epidemics [22] and van Noort et al (2011) who used the same approach to modeling influenza time series as single consecutive outbreaks in the Netherlands, Belgium and Portugal [23]. Our analysis goes further than these studies in the manner that it uses a specially formulated statistical approach for estimating key epidemiological parameters.
The basic reproductive number R 0 is an important and widely employed index for quantifying individual epidemic growth rates [24]. R 0 is defined as the average number of people infected by a typical individual over the disease infectivity period in a totally susceptible population. In general, it is extremely difficult to estimate R 0 because the initial population is rarely totally susceptible. Most studies of influenza therefore only attempt to estimate the ''effective R 0 '', or R e , which is a composite index: R e = R 0 N S 0 , where S 0 is the proportion of the population who are susceptible at the beginning of an outbreak. The effective reproduction number R e should be interpreted as the average number of people an infected person infects during the course of their illness in a population, a fraction S 0 (S 0 ,1) of which is susceptible. But R e tells us little about R 0 since S 0 (an important variable in its own right), is difficult to estimate directly [25]. There are many methods documented in the literature for estimating R e and nearly all of them calculate the rate of exponential growth of the infected population in the first phase of an outbreak [26][27][28].
Recently there have been several methods developed for estimating both R 0 and S 0 which are usually more complex and in most cases require fitting an SIR type model to the data [22,23,[29][30][31]. Certainly there would be a major advantage in having the ability to separate out these two parameters because they have very different biological meanings. In the approach used here, we take advantage of the fact that for the Israeli dataset, the entire epidemic curve is available for analysis and not just the initial phase. We are thus able to fit the age-of-infection SIR model to the full epidemic curve to obtain estimates of both R 0 and S 0 for each epidemic outbreak. In Table 1 the estimates of R e obtained using the entire curve (see methods) are given together with estimates of R e calculated using the classical method [32] which estimate the exponential growth of the number of infected people. These two estimates of R e are significantly correlated (r = 0.91, p = 0.0007).
Our methodology gives estimates of R 0 and S 0 under the assumption that all influenza cases in the population are reported and thus that the ILI time series are a product of a surveillance system with a perfect, or 100%, reporting rate. However, such a situation is never the case in practice. If we assume that the reporting rate of the surveillance captures the proportion r (0,r,1) of true influenza cases, then we refer to our estimates in the presence of partial reporting as R 0 and S 0 while the desired values under perfect reporting are referred to as R 0 and S 0 . They are related as follows [25]: Since the true reporting rates (r) of surveillance systems are rarely quantified accurately, they become an important limiting factor when trying to estimate these key epidemiological parameters. However, for well managed surveillance systems, it can be assumed that r is reasonably constant over extended times, in which case it would be possible to capture the relative values and trends of these parameters as they change in time. Unfortunately there have been no studies of the reporting rate stability of the Maccabi data. We have compared the data to another independent Israeli surveillance and found almost identical trends over the same period indicating the reporting rate has changed little. Nevertheless, as we note in the Data section, there was a change in the ILI coding in 2002, which might possibly have changed the reporting rate in that year, presumably to a small degree.
In our data, it was found that on average 1.5% of all Maccabi members were infected with influenza annually (see table 1 for full list of attack rates). However, in the US, average overall attack rates are estimated to be 10-20% [33,34], and France some 12-15% [35]. This, together with discussions with the Israeli Ministry of Health, motivated our setting of the reporting rate to 10% (r = 0.1), yielding an attack rate in Israel (adjusted to 15%) consistent with that reported in the literature for other countries [8,14]. In this case, the average estimates of the true R 0 in Israel using equation 1 should be R 0 ,4.9 which is close to the rough calculation of Katriel and Stone (2010) who estimated R 0 ,3.75 [25].
We note that the parameter R e has the interesting property that it is entirely independent of the reporting rate since: Thus estimations of R e remain unaffected by spatial differences or temporal changes in reporting rates.

Temporal Features
The time course of each of the outbreaks in the aggregated Israeli time series was fitted using the discrete-time SIR model (see Methods). In general the model accurately reproduced the time course of the Influenza A epidemics as demonstrated by the simulation run shown in figure 3A based on data for the year 2007-8. Figure 3C provides a more general picture by displaying model fits for each epidemic in all eleven years as compared to the observed data, based on a cumulative plot, similar to a Q-Q plot (cf [30] and methods). The cumulative incidences of the observed data are plotted on the x-axis, while the cumulative incidence produced by the SIR model is plotted on the y axis. The solid straight diagonal lines are reference lines which connect all points (I t , I t ) of the observed data. The lower points in each series represent the early stage of the epidemic and the top points represent the late stage of the epidemic. Perfect fits between model and data would result in all points falling on the diagonal reference lines. This is to some degree achieved in the fits of the Israeli Influenza A data. The fits of the Influenza B seasons, however, as seen also in figure 3B, are of a lower quality.
Also included in figure 1 is a plot of the well-studied influenza (ILI) dataset collected in France (see figure 1B and data section for details) which will be used for the purposes of a comparative study with Israel. We found that the French epidemic data (figure 1B) could also be modeled with good accuracy by the SIR fitting procedure. However, for both the Israeli and French datasets, we were unable to fit influenza B years with the same level of accuracy, since the epidemic curves corresponding to influenza B outbreaks were more asymmetric than the standard SIR model could accommodate for (figure 3B). Intriguingly, when correlated against its year the estimates for R R 0 in the Israeli aggregated data showed a significant (p = 0.006, R 2 = 0.69) long-term increase over the eleven years (figure 4A).
The estimates for R 0 (based on r = 0.1) varied between the lowest R 0 = 2.95 in 2000-2001 and a maximum of R 0 = 8.16 during the 2006-2007 outbreak, with an average of R 0 = 4.9 (see table 1 for full details). We note that it is unlikely that the increase in R 0 is due to an increase in reporting rates over this period. Had there been an increase in reporting rate, one would expect a corresponding increase in attack rate over the time-period, but this does not appear to occur. In addition the analysis was repeated after excluding the first years (where the coding system was different) and the trend remained (see figure 4). Interestingly, Spicer (1979) [20] also noted an increase in transmissibility (equivalent to an increase in R 0 ) with time after a new strain of H2N2 appeared: ''transmissibility was low in the early stages of the introduction of the new Asian (H2N2) virus subtype in 1957 and the Hong Kong (H3N2) virus subtype in 1969-1970 and high for some years after. This is biologically plausible if the virus is adapting to new conditions of spread.'' While R 0 exhibited a long-term increase over the years, the fraction of susceptibles S 0 showed a significant decrease with time and France R e = 1.06, R e = 1.14 and R e = 1.14 respectively [21]. While using our model to estimate R e in France during the same time period gave an average estimate of R e = 1.36.
Over the eleven years of this study, both R 0 and S 0 vary considerably (e.g., R 0 varied between 2.95-8.16 while S 0 varied between 19.6-40.4%).It is puzzling why the observed decrease in the susceptible fraction of the population over time is balanced out by the increase in the reproductive number R 0 so that the product R e = S 0 NR 0 changes to a relatively limited degree and is always slightly larger than unity (figure 4 A, B). The statistical analysis also showed a significant and high correlation r = 0.95 (p%0.0001, N = 9, all influenza A seasons), between the magnitude of R e and the number of ILI cases in the smoothed peak of the epidemics (see figure 4C) in the aggregated national data.

Spatial aspects of influenza in Israel
One of the most striking features regarding the dynamics of influenza in Israel is the strong spatial synchronization. Figures 1  and 2, show very clearly that the time series of major cities in the country are highly correlated. In order to quantify how the ILI varies spatially across the country, we focus on the two main aspects of the ILI data: the magnitude of the outbreaks in the different cities and the timing or temporal differences in the occurrence of the outbreaks which also varies spatially.

The variation of R e across Israel
For the purposes of examining spatial differences in the magnitude of influenza outbreaks across Israel we make use of R e as an index for epidemic intensity [14]. Given that R e is a measure that is independent of reporting rate, it is useful for conducting comparisons especially since there are indications that the reporting rates of various cities can be quite different (see data section). An interesting outcome of our analysis is the finding that there are no statistically significant differences in R e between the different cities within a year (ANOVA test, p = 0.46, F = 0.97). In contrast R e varies (both nationally and among the Israeli cities) between the different years in a manner that is highly statistically significant (ANOVA test, p = 1.65610 213 , F = 17.7). Our results are consistent with previous studies, as for example [13] who conclude for the US, France and Australia that ''while the average inter-pandemic R p seems rather invariant across geographical locations at around 1.3 there is substantial year to year variability around this average''.  Quantifying the spatial synchrony in Israel As figure 1 shows, the ILI cases from the two major Israeli cities, Tel Aviv and Jerusalem, appear to be tightly synchronized with a correlation of r = 0.91 (p%0.001). Both cities are also highly correlated to the spatially aggregated Israeli ILI data having respective correlation coefficients r = 0.98 (p%0.001) and r = 0.92 (p%0.001). It is thus not surprising that the attack rates (i.e., the total number of cases per year) of Tel Aviv and Jerusalem are also correlated to the attack rates in Israel with a correlation coefficient of r = 0.92, (p%0.001) and r = 0.91 (p%0.001) respectively.
To explore synchrony further we examined whether the timing of the influenza outbreaks observed in Tel Aviv and Jerusalem are more synchronized than expected by chance. Two different tests were devised: i) Phase Analysis: The Tel Aviv and Jerusalem time series were superimposed and the peak date of each epidemic in each time series was identified. In the phase analysis [36] we let D i represent the time difference between the peak of the outbreak in Tel Aviv and that in Jerusalem for the i'th year, and calculated for the observed data. In step two, the annual outbreaks of Jerusalem were randomly reshuffled over the eleven years [37]. The reshuffling required breaking up the Jerusalem time series into eleven separate years (or outbreaks) and then randomly reordering their sequence of occurrence. (D i ) 2 was then calculated. This was repeated N = 100,000 times to obtain the distribution of S shuffle . The index S shuffle was found to be larger than the observed value S obs in 99,985 of the N = 100,000 reshufflings. Therefore, Tel Aviv and Jerusalem are significantly synchronized (p,0.00014) in terms of the timing of their peaks. An analysis of the nine influenza A seasons (i.e., excluding from the analysis years dominated by influenza B) gave very similar results, with p,0.00024. ii) Correlation Analysis: Similar to the above test, the correlation r obs between the Tel Aviv and Jerusalem time series of infectives was measured for both the observed and reshuffled time series. Again, the reshuffling involved breaking up the Jerusalem time series into eleven separate years (or outbreaks) and then randomly reordering their sequence of occurrence. We generated N = 100,000 randomized Jerusalem time series and calculated their correlations r shuffle with the Tel Aviv time series. This procedure gives the probability distribution of r shuffle . We found that the correlation between Tel Aviv and the observed Jerusalem time series was higher than the correlation calculated from the randomized Jerusalem time series in all 100,000 cases. This occurred both when all 11 seasons were analyzed and when influenza B seasons were excluded from the analysis. The high correlation observed between Tel Aviv and Jerusalem time series is nonrandom and provides strong support for the notion that the cities are synchronized over and above the background synchrony of the irrepressible annual winter outbreaks.

Variability of the Spatial Synchrony
We found that different years have different characteristic strengths of spatial synchrony. As a reference frame, figure 2A plots a superimposition of all 11 outbreaks occurring over the 11 seasons in the aggregated national data, and gives an indication of the (relative lack of) temporal synchrony between years. This should be compared with figures 2B and C, which are plots of the time series for the seven largest cities for the years 1998-9 (figure 2B) and 2006-7 ( figure 2C). The former is an example of a year with relatively low spatial synchrony while the latter is the year with the maximum synchrony among the Israeli cities. Comparing the figures it is easy to see that the synchrony within a year (figure 2B and C) is far stronger than the synchrony between years (figure 2A).

Epidemic Synchronization between Israel and France
To gain further insights into the synchrony dynamics of influenza in Israel, we studied its relationship to a distant European country -France. Figure 1B [38]).
In order to quantify the temporal synchrony between the two countries we again performed the phase and correlation analyses for the timing of the outbreaks. Both tests found Israel and France to be significantly synchronized, in the timing of influenza outbreaks (p = 0.014 and 0.008, phase and correlation tests respectively). Results were still significant when B seasons were omitted (p = 0.037, p = 0.045 phase and correlation tests respectively).
In addition we tested whether the intensity in the outbreaks between the two countries was correlated. The values of both peak heights and of R e were calculated and correlated for all the 9 outbreaks between1999-2009. A significant correlation between Israel and France was found in both peak heights (r = 0.6; p = 0.05) and between the values of R e (r = 0.75; p = 0.02). The statistical tests indicate that for both Israel and France i) there are very minor differences in the timing of the epidemics (phase and correlation analyses) and ii) large/small outbreaks tend to occur in the same years in both countries. Nevertheless, and as expected, the correlation between the Israeli cities is higher than the correlation between the two countries.

Discussion
The high quality of the Israeli ILI data has enabled us to explore the spatio-temporal dynamics of influenza in Israel over eleven years. The fact that the simple SIR model can be fitted successfully to data of seasonal influenza A in both the national (aggregated data) level as well as in the scale of single large cities is notable in light of the complex epidemiology of influenza [7]. This is consistent with previous work from the Soviet Union, United Kingdom, France, United States Australia, the Netherlands, Belgium, Portugal and Brazil [14,[20][21][22][23]39,40]. One of the main advantages of using the modeling approach proposed here is the ability to estimate separately the two components of R e namely R 0 and S 0 , which limited many previous studies of influenza [24,34]. We found that over the study period the value of R 0 increased in time (figure 4A). Interestingly this increase was ''balanced'' by a decrease in S 0 . One possible speculation that can explain the observed increase in R 0 could be the evolutionary adaption of the virus to become a more efficient infector [41]. For example, the new strain of pandemic influenza, the H1N1 swine flu virus, had relatively low transmission during the swine flu pandemic [16,[42][43][44]. However pandemic influenza is potentially far more dangerous because the immunity of the population to the pandemic virus is ''expected to be'' lower than the circulating seasonal influenza strains (i.e., a high S 0 ). It is believed that in the future, the H1N1 virus (which is now considered a seasonal strain) will adapt to become a better infector as has happened with previous pandemic and seasonal strains [45]. As mentioned above, in parallel to the increase in R 0 the population susceptibility is expected to decrease due to an increase in the population immunity as more of the population is exposed to the new virus strain. The exact mechanisms which lead to the observed negative correlation in R 0 and S 0 found here needs to be further studied in order to better understand the dynamics of influenza. We note again that the estimate of R e = S 0 NR 0 remains independent of reporting rate. Thus even though there is strong under-reporting in our data, the estimations of R e remain unaffected. In Israel the value of R e is strongly correlated to the magnitude of the peak height of influenza outbreaks ( figure 4C). Knowledge of R e is essential for understanding and controlling the spread of an infectious disease [24,45]. For instance the proportion of the population which needs to be vaccinated in order to reach herdimmunity is a function of R e . Recent studies have shown that a reliable estimation of R e for seasonal influenza can be obtained within a period of 4 weeks after the initiation of the disease [14]. Therefore, it is possible to estimate R e , in the first few weeks of the season and use this information to predict the upcoming epidemic.
R e was also positively correlated to the strength of the spatial synchronization in Israel. We found that in cases where the disease is more ''infectious'' (i.e., higher R e ) then the outbreaks in different cities tend to occur with smaller time lags (see figure 2B, C). It may be hypothesized that the higher R e implies a more forceful infectivity which increases the synchronization and reduces the variability in the timing and magnitude of the peaks between the different cities (see figure 2). It would be an interesting future direction of study to examine the capacity of this hypothesis to explain the observed correlation between R e and synchronization by studying simulations of explicit spatial models.
Modeling studies have shown the sensitivity of the severity of influenza outbreaks to small demographic and environmental changes [46], which implies that small difference in environmental factors can cause large differences in the size of outbreaks between different locations. Nevertheless, we see notable resemblance between the time series within a year across large geographical distances (e.g., Israel and France).
Using laboratory-confirmed influenza surveillance data, Finkelman et al (2007) reported large scale co-occurrence of influenza type A and B, and interhemispheric synchrony (i.e., the dominant strain within a season is the same for most of the hemisphere) [38]. Another example of high synchrony between Israel and France can be seen in Figure 1B, where in the outbreak of 2003-4 there is an early epidemic in both countries. Interestingly the 2003-4 season was dominated by a new influenza strain (A/Fujian) which peaked early in many different countries (probably due to the fact that the population immunity to the new strain was low and therefore S 0 was high, leading to an early outbreak) in the northern hemisphere [50]. Another factor leading to the high spatiotemporal synchrony within Israel is due to short travel distances between cities in a small country. Interestingly, spatio-temporal synchrony is high even between Israel and France, despite the much larger geographic scale. It is possible that here to, a single dominant influenza strain prevails in both countries in each season and air-travel between the countries aids temporal synchrony.
A second perplexing observation between the Israeli cities can be seen from examining the estimates of R e for two very different cities such as Bnei Brak and Tel Aviv. As opposed to Tel-Aviv Bnei Brak has large ultra-orthodox religious communities that are characterized by large families with many children. The two cities differ in many important demographic aspects (e.g. household size, age structure and population connectivity) that are thought to influence the dynamics of influenza [47][48][49]. It is expected that significant variation in demographic factors would lead to observed differences in the value of a key parameter such as R e . During our study period there were no statistical differences between the value of R e between different cities (within a year), and even countries. The fact that the size of R e is rather ''constant'' between the Israeli cities is in line with the findings of Baroyan et al (1977) [50], Spicer (1979) [20] and   [21] which were obtained on the much larger geographical scale of the USSR and Brazil. Spicer (1979) remarks: ''The most striking and unexpected feature of the model is that the parameter on which the spread of an epidemic within a city depends is the same for every city in any one epidemic within the USSR'' [20]. Nevertheless, in recent years extremely complicated models were developed to model pandemic influenza [47][48][49]51]. The models include many biological demographic and sociological complexities which are believed to be important in capturing the dynamics of pandemic influenza. For instance, Merler et al 2011 [51] had to incorporate information on intra-European mobility and the different socio-demographic structure of the different European countries in order to reproduce the observed spatial pattern of the West to East spread of the 2009 pandemic in Europe. In contrast, demographic structure did not appear to have impact on the spread of influenza in Israel.
Additional data on the spatial spread of influenza combined with statistical analysis is required to better understand how different population demographics effect R e and the propagation of the disease within different communities.

Data
Our dataset consisted of all Influenza-Like Illness (ILI) cases in Israel diagnosed daily by Maccabi Health Services doctors, between January 1 st 1998 and May 31 st 2009. Diagnosis codes included in this database are ICD9 code 487.1 (influenza) and internal Maccabi codes for influenza, influenza-like disease and swine influenza. The last year of data, in which the swine flu pandemic occurred, was excluded. ICD9 codes were used exclusively in 1998-2002, when there was a transition to internal diagnosis codes. The ILI data were corrected for repeat visits. The criterion chosen to filter out repeat visits from the data was recommended by the Israeli Center for Disease Control, and defines a visit as a repeat visit if it comes within 28 days of a pervious visit with ILI symptoms. The data exhibits a strong weekly cycle due to the absence of weekend reporting. Hence, the data was smoothed using a 7-day moving average kernel [52] red line in Figure 1A. The dataset includes 7 seasons in which the dominant strain was influenza A H3N2, two seasons in which the dominant strain was influenza B, one season in which the dominant strain was influenza A H1N1, and one season in which no strain was dominant. Historical data of dominant influenza strains in Israel is available at the World Health Organization's FluNet website at http://apps.who.int/globalatlas/dataQuery/ default.asp.Maccabi is the second-largest Health Maintenance Organization (HMO) in Israel and insures about 23% of Israel's population. During the period analyzed the number of Maccabi members varied between 1.37 and 1.86 million people. Several works based on this dataset have already been published [30,53,54]. The French ILI dataset is taken from the Sentinel network -a network of ,1200 General Practitioners (GPs) in France who, since 1984, regularly collect data about diagnoses of 12 diseases and report it via the internet. ILI is defined in the Sentinel network as sudden temperatures greater than 39uC, myalgia and cough/running nose. The data received from the GPs are then processed to estimate the number of ILI cases per 100,000 residents in each region, by using population data and the fraction of GPs taking part in the surveillance out of the total number of GPs. Since the frequency of reporting is irregular and is left for each doctor to decide, the data presented in the Sentinel network website are weekly aggregate incidences [55].

Model
We used an SIR discrete-time age-of-infection model as described in [56]. The total population is denoted by N. The number of susceptibles at the end of day t is denoted by S(t) while the number of people who become infected on day t is denoted by i(t). It is important to emphasize that i(t) here counts only the newly-infected individuals on day t. Note the key relationship: It is assumed that each individual has, on average, b contacts with other random individuals per day. A person who becomes infective retains a certain (and non-constant) degree of infectivity for d days. The number of days since a person's infection is termed its age of infection. Therefore, the number of infectives whose age of infection is t (1#t#d) on day t is i(t2t). When a susceptible meets an infective person whose age-of-infection is t (1#t#d), the susceptible becomes infective with probability P t . The vector P = (P 1 ; …; P d ) thus defines the infectivity profile, and is a key parameter of the model [16]. The values for the vector P were obtained from the comprehensive review paper about influenza viral shedding by Carrat et al 2008 [57]. The values are P = (0.073, 0.181, 0.222, 0.185, 0.137, 0.09, 0.056, 0.032, 0.016, 0.008).
The probability that any single susceptible becomes infected during day t is given by: In a deterministic variation of the model, the daily number of infectives is, for large N: We use the deterministic model to simulate the time-series of figure 3. The above model also has a stochastic formulation ) whose log-likelihood can be shown to be (Katriel, manuscript): It is possible to estimate the parameters S 0 and R 0 by numerically maximizing the above log-likelihood expression. The maximization should be carried out for b.0 and over integers S 0 in the range P T t~1 i(t)ƒ S 0 ƒ N (as the number of susceptibles at the beginning cannot be smaller than the total number of cases).
It is important to demonstrate the statistical identifiability of the parameters S 0 and R 0 from the data, using the likelihood function LL (equation 7) [58].To achieve this, we generated contour plots of the function LL for each season. Figure 5 displays this plot for season 9. As can be observed, the likelihood attains a unique maximum at a point which forms our maximum likelihood estimates, and the region in which the likelihood is close to the maximal values is small enough to provide rather narrow 95% confidence intervals for the parameters. The corresponding plots for other seasons are qualitatively similar.
In another approach that was taken to obtain bootstrap confidence intervals for the effective reproduction number (R e ), the stochastic version of the model (equations 4-6) was simulated 10,000 times using the exact same parameters (i.e., b and S 0 where estimated from the real data using (equation 7)). For each of the simulated epidemics R e was re-estimated and the 250 lowest and highest estimates were removed to give the 95% bootstrap confidence intervals.
The values of R e were also obtained using the classic method of measuring the rate of exponential growth at the initiation of the outbreak as in [32] The estimates from both methods were highly correlation (R = 0.91, p = 0.0007) (see table 1).