The authors have declared that no competing interests exist.
Conceived and designed the experiments: SC KP PYB. Performed the experiments: SC. Analyzed the data: SC. Contributed reagents/materials/analysis tools: SC. Wrote the paper: SC KP PYB.
Commuting data is increasingly used to describe population mobility in epidemic models. However, there is little evidence that the spatial spread of observed epidemics agrees with commuting. Here, using data from 25 epidemics of influenza-like illness (ILI) in France as seen by the Sentinelles network, we show that commuting volume is highly correlated with the spread of ILI. Next, we provide a systematic analysis of the spread of epidemics using commuting data in a mathematical model. We extract typical paths in the initial spread, related to the organization of the commuting network. These findings suggest that a geographic distribution of GPs across France different from the current one could be proposed. Finally, we show that differences in commuting according to age (school or work commuting) affect epidemic spread, and should be taken into account in realistic models.
The multi-scale network of social interactions
Except for a report on the correlation between influenza epidemic peak timing and inter-state commuting in the USA
Using these two databases, we first analyzed how commuting data relates to disease spread at a local level. We then examined the underlying mechanisms of propagation using an epidemic model derived from commuting networks. An indicator based on the similarity of epidemic courses in excess of random movements was developed. Finally, we investigated how age differences in commuting networks, i.e. to school or to work, led to changes in the spatial spread of diseases.
The Sentinelles network
Incidence per 100,000 inhabitants as monitored by the Sentinelles network during season 1985–1986. Maps are 2 weeks apart.
We used data collected in the 1999 census in France. All data were obtained at the LAU1 level, which we refer to as ‘district’ hereafter. There are 3704 districts in France. In each district, the population was split into 5 age classes: less than 3 years old; 3 to 10; 11 to 18; 18 to 65 and more than 65. These categories were retained to capture large changes in mixing groups due to schooling (3–10 and 11–18) and work (18–65). The frequency of each age class was obtained from census data in each district, as well as the percentage of population with a professional occupation. We also computed the average number of contacts of an individual of age
The commuting dataset, derived from census data, contains the movements of more than
We identified communities using the weighted ‘Louvain’ algorithm
The natural history of influenza infection was described as a four-stage SEIR process: individuals were first susceptible to the disease (stage S), then latent (infected but not yet infectious; stage E), infectious (stage I) and finally recovered and removed from transmission (stage R). We simulated transmission using the generation time distribution, i.e. the time from infection in a primary case to infection in a secondary case, as in Mills et al.
A discrete-time (time step 0.2 days) deterministic transmission model was implemented. We assumed that only professionally active individuals in age class 18–65 would commute to work, and that all children aged 3 to 18 attended and commuted to school. School-based commuting matrices were the same in age classes 3–10 and 11–18. No births or deaths were considered during the time of simulation, nor any change in place of residence or of destination.
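The time stepping described above can be illustrated with a minimal sketch for a single, well-mixed population. This is not the model used in the study: it ignores age classes, commuting and the generation time distribution, and replaces them with exponentially distributed latent and infectious periods. The function name and the parameters `beta`, `latent_days` and `infectious_days` are assumptions introduced for illustration only; the 0.2-day time step is taken from the text.

```python
import math


def simulate_seir(population, beta, latent_days, infectious_days,
                  initial_infected=1.0, dt=0.2, n_days=300):
    """Deterministic discrete-time SEIR in one well-mixed population.

    Hypothetical simplification: stage durations are exponential and
    beta is a single transmission rate per day.
    Returns the list of new infections per time step and the final
    (S, E, I, R) compartments.
    """
    s = population - initial_infected
    e, i, r = 0.0, initial_infected, 0.0
    leave_e = dt / latent_days        # fraction leaving E in one step
    leave_i = dt / infectious_days    # fraction leaving I in one step
    incidence = []
    for _ in range(int(round(n_days / dt))):
        force = beta * i / population               # force of infection per day
        new_e = s * (1.0 - math.exp(-force * dt))   # S -> E during this step
        new_i = e * leave_e                         # E -> I
        new_r = i * leave_i                         # I -> R
        s -= new_e
        e += new_e - new_i
        i += new_i - new_r
        r += new_r
        incidence.append(new_e)
    return incidence, (s, e, i, r)
```

Because no births or deaths are included, the four compartments always sum to the initial population, which gives a simple check on any implementation.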
At each time step, the number of incident cases
(1).
Household-based force of infection was computed using the age-specific average number of contacts in the household. More precisely, the force of infection was proportional to the density of infected contacts among household members as follows (2):
For school-based (X = S) and workplace-based (X = W) forces of infection, we used a similar approach, computing the expected density of infection among contacts as (3):
For community-based transmission, the force of infection was computed using the same principle as above by (4).
We calibrated transmission parameters
Moran's I statistic
Moran's I was computed for each week before and after epidemic peaks, and averaged, week-wise. The same procedure was repeated 1000 times using random permutations to calculate p-values. To test for the specific role of the commuting network as opposed to commuting distance only, we compared these indices with those obtained using random commuting networks, where the distribution of distance travelled was kept the same as in the original data, but commuting trips were chosen at random in any direction. We repeated the above calculation for 100 such random networks.
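As an illustration of the permutation procedure, the sketch below computes Moran's I for one week of incidence and a one-sided permutation p-value. It uses the standard definition of Moran's I with an arbitrary weight matrix; how weights were derived from commuting volumes in the study is not reproduced here, and the function names are hypothetical.

```python
import random


def morans_i(values, weights):
    """Moran's I for district values and a square weight matrix.

    weights[i][j] is the weight linking districts i and j (for example
    a commuting volume); diagonal entries are ignored.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    w_sum = 0.0
    num = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                w_sum += weights[i][j]
                num += weights[i][j] * dev[i] * dev[j]
    return (n / w_sum) * (num / denom)


def permutation_p_value(values, weights, n_perm=1000, seed=0):
    """One-sided p-value: share of random relabelings of districts with
    Moran's I at least as large as the observed value."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            hits += 1
    # Add-one correction counts the observed arrangement as one permutation.
    return observed, (hits + 1) / (n_perm + 1)
```

Applying `permutation_p_value` week by week and averaging the resulting indices across epidemics would mirror the week-wise procedure described above.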
We also used Mantel's test as described in
In all cases, permutation tests were used to calculate P-values.
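A minimal sketch of a Mantel permutation test is given below, assuming two symmetric matrices indexed by the same districts (for instance, a commuting-based weight matrix and a matrix of pairwise differences in incidence). The choice of matrices and the function names are assumptions for illustration; the exact variant used in the study is the one described in the cited reference.

```python
import math
import random


def _upper_triangle(matrix, order):
    """Flatten the upper triangle of a matrix after relabeling rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel_test(a, b, n_perm=999, seed=0):
    """Mantel correlation between matrices a and b with a one-sided
    permutation p-value (rows and columns of a are permuted jointly)."""
    n = len(a)
    identity = list(range(n))
    b_flat = _upper_triangle(b, identity)
    observed = _pearson(_upper_triangle(a, identity), b_flat)
    rng = random.Random(seed)
    order = identity[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        if _pearson(_upper_triangle(a, order), b_flat) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Permuting rows and columns together, rather than shuffling matrix entries independently, preserves the dependence among entries that share a district, which is what distinguishes the Mantel test from an ordinary correlation test.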
We used the overlap measure introduced in Colizza
For each pair of districts in France, we aimed to identify up to what date after first introduction epidemics grew more similarly than expected if commuting was at random. This is measured by criterion,
Lines correspond to overlap measures for a given pair of districts at different times after introduction of a single infected individual. For a particular pair (green line), we also present the overlap measure obtained using reshuffled networks for the same pair (red line). Criterion
To test the sensitivity of the model to the proportion of infections occurring in each context, we performed 100 simulations with a set of parameters, for which
A sensitivity analysis was also performed to test the impact of the hypothesis that asymptomatic adults had a reduced generation time, by simulating 100 outbreaks from a random initial case in which only children had a reduced generation time. As before, overlap was used to compare the simulations to the former ones.
The sensitivity of the results to the proportion of adults initially immunized was also tested, simulating 100 outbreaks initialized in randomly chosen districts with different rates of immunity (0, 10, 20, 30, 40, 50, 60 and 70%). Simulations were compared to outbreaks generated with an 80% rate of immunity for adults using overlap.
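The overlap comparisons used in these sensitivity analyses can be sketched as follows. The exact formula is not restated in the text, so this sketch assumes the Hellinger-affinity form commonly associated with the overlap measure of Colizza and colleagues: the product of a similarity between national prevalences and a similarity between the spatial distributions of infected individuals. The function names and the argument layout are hypothetical.

```python
import math


def hellinger_affinity(p, q):
    """Similarity between two discrete distributions (1 if identical,
    0 if their supports are disjoint)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def overlap(infected_1, infected_2, population):
    """Overlap between two epidemic snapshots at the same time.

    infected_1, infected_2: number of infected per district in each
    simulation; population: total population (assumed identical in both).
    """
    total_1 = sum(infected_1)
    total_2 = sum(infected_2)
    if total_1 == 0 or total_2 == 0:
        return 0.0
    prev_1 = total_1 / population
    prev_2 = total_2 / population
    # Agreement in national prevalence.
    national = hellinger_affinity((prev_1, 1.0 - prev_1), (prev_2, 1.0 - prev_2))
    # Agreement in how infected individuals are distributed across districts.
    spatial = hellinger_affinity([x / total_1 for x in infected_1],
                                 [x / total_2 for x in infected_2])
    return national * spatial
```

Under this assumed definition, two simulations with the same prevalence in every district have an overlap of 1, while simulations whose infected individuals are in entirely different districts have an overlap of 0.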
Workers from one district commuted on average to 133 other districts, and school-aged children to an average of 75 destinations (
(a,b) Total number of individuals leaving each district via work commuting (a) and school commuting (b). (c) Proportion of commuters and travelled distance in the school network (red) and the work network (green). (d,e) Clusters identified in the work (d) and school (e) commuting networks.
The diameter (i.e. the longest minimal path from one place to the other) of the commuting network was 3 for work and 4 for school.
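For readers unfamiliar with the definition, the diameter can be obtained by running a breadth-first search from every district and keeping the largest shortest-path length. The sketch below assumes an unweighted network stored as an adjacency dictionary and ignores pairs with no connecting path; whether commuting links were treated as directed in the study is not stated, so this is an illustration of the definition only.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Number of links on the shortest path from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def diameter(adjacency):
    """Longest shortest path over all pairs of nodes connected by a path."""
    longest = 0
    for source in adjacency:
        dist = shortest_path_lengths(adjacency, source)
        longest = max(longest, max(dist.values()))
    return longest
```

For example, a chain of four districts `{"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}` has a diameter of 3, since three links separate its two ends.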
The importance of short-distance commuting also showed in the communities found by clustering (
In the 26 epidemics observed in the Sentinelles network, the spatial autocorrelation computed with weights derived from school and work commuting was significantly greater than 0. In other words, incidence increased synchronously in strongly linked areas. Moran's I was significantly greater than 0 (
(a) Mean value of Moran's Index computed on the 26 epidemics from the Sentinelles network, and (b) on 100 simulated epidemics. In each case, the blue line uses weights based on work commuting and the red line weights based on school commuting. Gray areas correspond to the 95% expected values when no autocorrelation is present.
Likewise, Mantel's test performed with weight matrices derived from school and work commuting was positive (Mantel's correlation being equal to 0.069 for work commuting and 0.060 for school commuting), confirming the existence of a spatial autocorrelation linked to commuting movements (
Simulated epidemics started from different places were all similar in timing and incidence at the national level. Moran's I analysis exhibited the same behavior as in the observed epidemics (
As for observed epidemics, Mantel's test was found to be positive for simulated epidemics (Mantel correlation was equal to 0.106 with work commuting and 0.121 with school commuting).
Irrespective of the starting district, national incidence was very similar over the course of the epidemic. Despite this similarity at the national level, overlap changed depending on the pair of districts considered. Initial overlap was very variable using the observed commuting network, but always increased to 1 with time. Remarkably, the overlap in epidemics using reshuffled networks was also large, and quickly increased to 1 as well.
The excess in overlap, as measured by criterion
On the contrary,
We found that the correlation between
To get a picture of initial common paths of spread, we averaged the value of
For each district,
Based on the average
Commuting for work and school created two layers of mixing that could lead to differences in the spatial spread. Indeed, the distance traveled to work was larger, suggesting increased dissemination, but transmission in children is typically larger and could take precedence over transmission by adults. We therefore simulated the spread of epidemics in models where either commuters for school or work remained in their place of residence, with the same number of contacts.
Epidemics were started from 100 random districts with the 3 possibilities: commuting to work and school, only to school or only to work (
(a,b,c) ILI epidemic curves using all commuting networks (a), only work commuting (b) and only school commuting (c). Epidemics were started from 1000 randomly chosen districts. (d) Overlap between epidemics using work (blue curve) or school commuting (red curve).
Not unexpectedly, ignoring one commuting network led to epidemics that spread less rapidly. The peaks of epidemics simulated with school commuting were on average delayed by 2 weeks, although with large variability. For some simulations, the propagation was faster when only school commuting was present, but this was independent of the district of departure (correlation of delay with district population :
Finally, simulated outbreaks where all commuters followed the same commuting pattern, either school or work, were much in line with the results above. Overlap with original simulations was almost perfect when using only the school network but differed markedly from the start when using only work commuting (w
Overlap between epidemics simulated with the first model and epidemics propagating only by school (red) or work (blue) commuting (a), with epidemics for which asymptomatic adults do not have a reduced generation time (b), with epidemics simulated with different parameters of transmission (c). (d) Overlap between epidemics in which 80% of adults are susceptible with epidemics with different rates of susceptibility.
The overlap between simulations with different rates of contacts and the original simulations started in the same district was very large (
Similarly, the overlap between epidemics with a reduced generation time for symptomatic adults and without was very large (
The overlap between simulations with 80% of susceptible adults and other percentages of immunization decreased with the rate of susceptibility of adults (
Our analysis showed that commuting data determines the spread of influenza in modern populations, as evidenced by the large autocorrelation in observed ILI incidence in regions connected by commuting. Building on this observation, we provided an in-depth study of the consequences of mobility as described by commuting in the initial spread of epidemics, showing how to identify preferential paths in a densely connected territory. Last, we showed that age-specific heterogeneity in commuting leads to different patterns of spread, depending on the age category most involved in transmission.
The spatial structure of epidemics in France was manifest in the change in Moran's index over time. The index increased up to a maximum just before the national epidemic peak, and decreased afterwards. This spatial structure was hinted at by the non-random structure of spatial incidence pointed out by Bonabeau et al.
In our systematic exploration of the model dynamics, a three-stage scenario for the spread of epidemics emerged. The first stage followed introduction of an infected individual in the population. The lack of large
We used the raw commuting data from the census, instead of a smoothed version based on a gravity model
Seeding epidemics with only one case, as we did in the systematic analysis, is presumably not very realistic. Indeed, real epidemics may be seeded by repeated introductions from abroad over a few weeks. However, we selected this simple seeding pattern to study systematically the influence of the initial place of introduction, as it allowed a rather simple way to compare epidemic courses through their overlap. This type of seeding likely reduces noise and leads to increased spatial autocorrelation, as noted in
Thanks to the systematic search for locations having large similarity with others, we identified preferential paths for epidemic spread due to human mobility. Clustering districts according to the average