The authors have declared that no competing interests exist.
Conceived and designed the experiments: VC. Performed the experiments: MT PB. Analyzed the data: MT PB AD GKKK CMS. Contributed reagents/materials/analysis tools: MT PB AD GKKK CMS VB ZS MCG VC. Wrote the paper: MT PB AD GKKK CMS VB ZS MCG VC. Drafted the manuscript: MT PB VC.
Human mobility is a key component of large-scale spatial-transmission models of infectious diseases. Correctly modeling and quantifying human mobility is critical for improving epidemic control, but may be hindered by data incompleteness or unavailability. Here we explore the opportunity of using proxies for individual mobility to describe commuting flows and predict the diffusion of an influenza-like-illness epidemic. We consider three European countries and the corresponding commuting networks at different resolution scales, obtained from (i) official census surveys, (ii) proxy mobility data extracted from mobile phone call records, and (iii) the radiation model calibrated with census data. Metapopulation models defined on these countries and integrating the different mobility layers are compared in terms of epidemic observables. We show that commuting networks from mobile phone data capture the empirical commuting patterns well, accounting for more than 87% of the total fluxes. The distributions of commuting fluxes per link from mobile phones and census sources are similar and highly correlated; however, a systematic overestimation of commuting traffic in the mobile phone data is observed. As a result, epidemics simulated on the mobile phone commuting network spread faster than on the census commuting network, while preserving to a high degree the order of infection of newly affected locations. The calibration of the proxies affects the agreement of arrival times across models, and the observed topological and traffic discrepancies among mobility sources alter the resulting epidemic invasion patterns.
The spatial dissemination of a directly transmitted infectious disease in a population is driven by population movements from one region to another allowing mixing and importation. Public health policy and planning may thus be more accurate if reliable descriptions of population movements can be considered in the epidemic evaluations. Next to census data, generally available in developed countries, alternative solutions can be found to describe population movements where official data is missing. These include mobility models, such as the radiation model, and the analysis of mobile phone activity records providing individual geo-temporal information. Here we explore to what extent mobility proxies, such as mobile phone data or mobility models, can effectively be used in epidemic models for influenza-like-illnesses and how they compare to official census data. By focusing on three European countries, we find that phone data matches the commuting patterns reported by census well but tends to overestimate the number of commuters, leading to a faster diffusion of simulated epidemics. The order of infection of newly infected locations is however well preserved, whereas the pattern of epidemic invasion is captured with higher accuracy by the radiation model for centrally seeded epidemics and by phone proxy for peripherally seeded epidemics.
One of the biggest challenges that modelers have to face when aiming to understand and reproduce the spatial spread of an infectious disease epidemic is to accurately capture population movements between different locations or regions. In developed countries this task is generally facilitated by the existence of data or statistics at the national or regional level tracking individuals' movements and travels, by purpose, mode, and other indicators if available (see e.g. transport statistics in Europe
Depending on the infectious disease under study, different mobility processes may play a relevant role in the spatial propagation of the epidemic while others appear to be negligible, as determined by the typical timescales and mode of transmission of the disease, and the geographic scale of interest. For rapid directly transmitted infections, daily movements of individuals represent the main means of spatial transmission. At the worldwide scale, air travel appears to be the most relevant factor for dissemination, as observed during the SARS epidemic
To overcome issues in accessing commuting data when simulating spatial influenza spread, epidemic models have traditionally relied on mobility models to synthetically build patterns of movements at the desired scale
Next to mobility modeling approaches, alternative tools for understanding daily human movements have more recently flourished thanks to the availability of individual data obtained from different sources, namely mobile phone call records carrying temporal and spatial information on the position of the cell phone user at the level of tower signal cells
Despite the variety of modeling approaches and data sources, the impact of using different proxies for human commuting in epidemic models for rapidly disseminated infections is still poorly understood. Each approach or source of data clearly has its own intrinsic strengths and weaknesses, related to accuracy and availability of the dataset.
More specifically, mobility models require some assumptions or input data for calibration and fit to the real commuting behavior. The gravity model requires full knowledge of mobility data for its parameter fitting and can be extended to other regions where data is not available in case of empirical evidence pointing to “universal” commuting behavior at a given resolution scale, i.e. well described by the same set of parameter values
Recent studies have assessed the effects of using gravity models in mathematical epidemic models
The aim of this paper is therefore to assess the adequacy of two specific proxies – mobile phone data and the radiation model – to reproduce commuter movement data for the modeling of the spatial spread of influenza-like-illness (ILI) epidemics in a set of European countries. We first compare the commuting networks extracted from the official census surveys of three European countries (Portugal, Spain and France) to the corresponding proxy networks extracted from three high-resolution datasets tracking the daily movements of millions of mobile phone users in each country. More specifically, we examine through a detailed statistical analysis the ability of mobile phone data to match the empirical commuting patterns reported by census surveys at different geographic scales. We then examine whether the observed discrepancies between the datasets affect the results of epidemic simulations. To this aim, we compare the outcomes of stochastic SIR epidemics simulated on a metapopulation model for recurrent mobility that is based either on the mobile phone commuting networks or the radiation model commuting networks, with respect to the epidemics simulated by integrating the census data. We evaluate how the simulated epidemic behavior depends on the underlying mobility source and on the spatial resolution scale considered, by investigating the time to first infection in each location and the invasion epidemic paths from the seed.
The study relied on billing datasets that were previously recorded by a mobile provider as required by law and for billing purposes, and not for the purposes of this project. To safeguard personal privacy, individual phone numbers were anonymized by the operator before leaving storage facilities, in agreement with national regulations on data treatment and privacy issues, and they were identified with a security ID (hash code). The research was reviewed and approved by MIT's Institutional Review Board (IRB). As part of the IRB review, the authors who handled the data and the PI participated in ethics training sessions at the outset of the study.
The census commuting networks are extracted from three census surveys, one for each of the countries under study: Portugal, Spain and France. Each survey tracks the number of people who daily commute for work or study reasons between any two locations within the country. Locations are identified as political subdivisions of the country, usually corresponding to their lowest administrative level. Commuting flows directed to or coming from abroad are not considered in the analysis (see Section 3.1 in
Census surveys used in the present study are not homogeneous in terms of collection date and geographic resolution at which data is collected. They represent however the most accurate and reliable description of commuter movements that is available for the countries under study. Commuting data for Portugal is extracted from the database of the National Institute of Statistics
We thus consider: (i) the Portuguese
Commuting data for Spain is extracted from the database of the National Institute of Statistics
Commuting data for France is extracted from the database of the French National Institute of Statistics and Economic Studies
In the following we indicate with
Mobile phone commuting networks are extracted from three high-resolution datasets, based on the mobile phone billing information of a large sample of anonymized users in each country under study (2006 data for Portugal, 2007 for Spain and France), and already used in previous works
The data provides information about the time of usage of the mobile phone and the coordinates of the corresponding mobile phone tower handling the communication. The data allows us to identify the set of locations visited by each user (georeferenced in terms of tower cells) and to rank them according to the total number of calls placed by a user from each of them. Only users with more than 100 calls are included in the study, to enable the estimation of the individual's commuting mobility pattern. Since mobile phone trajectories include different sorts of daily movements, we need to extract only the commuter movements for the comparison with census data and disregard other types of displacement. Following previous work
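The extraction step described above can be illustrated with a minimal sketch. It assumes that the most visited tower marks the residence and the second most visited one the workplace (the two most visited locations are mentioned later in the text; their assignment to home and work is an assumption here), and that a hypothetical `tower_to_unit` mapping from towers to administrative units is available. Function and variable names are illustrative only.

```python
from collections import Counter, defaultdict

MIN_CALLS = 100  # the text keeps only users with more than 100 calls


def commuting_flows(call_records, tower_to_unit, min_calls=MIN_CALLS):
    """Return {(home_unit, work_unit): number of commuters}.

    call_records: iterable of (user_id, tower_id) pairs, one per call.
    tower_to_unit: dict tower_id -> administrative unit (hypothetical input).
    """
    calls_per_user = defaultdict(Counter)
    for user_id, tower_id in call_records:
        calls_per_user[user_id][tower_id] += 1

    flows = Counter()
    for tower_counts in calls_per_user.values():
        if sum(tower_counts.values()) <= min_calls:
            continue  # too few calls to estimate a commuting pattern
        ranked = tower_counts.most_common(2)
        if len(ranked) < 2:
            continue  # a single visited location gives no commuting link
        # Assumption: rank 1 is the residence, rank 2 is the workplace.
        home = tower_to_unit[ranked[0][0]]
        work = tower_to_unit[ranked[1][0]]
        if home != work:  # self-loops are not counted
            flows[(home, work)] += 1
    return dict(flows)
```

Ranking is done at the tower level and only then mapped to administrative units, so two towers in the same unit produce a self-loop that is discarded.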
Once defined on the same geography, the two datasets also need to refer to the same population. The census dataset represents the benchmark, as it comprises the entire population of a country (commuters and non-commuters at a given scale) and its mobility features, whereas the commuting data obtained from the mobile phone dataset is affected by the sampling bias corresponding to the operator's coverage and to the selection of the subset available for the analysis (it therefore represents only a fraction of the total population) and by the algorithm used to identify commuting-like movements. We explored the geographic coverage of the mobile phone dataset for the three countries (see the Analyses subsection for the corresponding methodology adopted). With no additional information on the subset of individuals included in the mobile phone datasets, we opt for a
More sophisticated choices can be made to account for the sampling biases in a more accurate way, as discussed in the
As a sensitivity analysis, and for further comparison with the radiation model (see following subsection), we also consider a
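The two normalizations can be sketched as follows, under stated assumptions. The basic normalization is assumed here to rescale each flow by the ratio of census population to tracked phone users in the origin unit. The refined normalization is assumed to rescale the flows leaving each origin so that their total matches the census number of outgoing commuters, in line with the later statement that it keeps the same total number of commuters per administrative region as in the census. Both readings, and all names below, are illustrative rather than the exact procedure used.

```python
from collections import defaultdict


def basic_normalization(phone_flows, census_pop, phone_pop):
    """Scale each flow i->j by census_pop[i] / phone_pop[i] (assumed form)."""
    return {
        (i, j): w * census_pop[i] / phone_pop[i]
        for (i, j), w in phone_flows.items()
    }


def refined_normalization(phone_flows, census_out_totals):
    """Scale flows leaving i so that they sum to the census total for i.

    census_out_totals: dict origin -> census outgoing commuters (assumed input).
    """
    phone_out = defaultdict(float)
    for (i, _), w in phone_flows.items():
        phone_out[i] += w
    return {
        (i, j): w * census_out_totals[i] / phone_out[i]
        for (i, j), w in phone_flows.items()
    }
```

Under the refined scheme the relative split of commuters among destinations still comes from the phone data; only the total per origin is taken from the census.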
In the following we indicate with
We create synthetic commuting networks using the
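For reference, the standard published form of the radiation model gives the flow from unit i to unit j as T_ij = T_i · m_i n_j / ((m_i + s_ij)(m_i + n_j + s_ij)), where m_i and n_j are the populations of origin and destination and s_ij is the population living closer to i than j does (excluding i and j). The sketch below implements that form, assuming that the total number of commuters leaving each unit, T_i, is taken from census data as the calibration input. The finite-size correction is omitted and equidistant destinations are handled sequentially, both simplifications of this sketch.

```python
import math


def radiation_flows(populations, coords, out_commuters):
    """Return {(i, j): expected commuters} from the radiation model.

    populations: dict unit -> resident population.
    coords: dict unit -> (x, y) centroid coordinates.
    out_commuters: dict unit -> total commuters leaving the unit
        (assumed to come from census data).
    """
    units = list(populations)
    flows = {}
    for i in units:
        m_i = populations[i]
        # Destinations ordered by increasing distance from the origin.
        others = sorted(
            (u for u in units if u != i),
            key=lambda u: math.dist(coords[i], coords[u]),
        )
        s_ij = 0.0  # population closer to i than the current destination
        for j in others:
            n_j = populations[j]
            p_ij = m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))
            flows[(i, j)] = out_commuters[i] * p_ij
            s_ij += n_j
    return flows
```

Because the model needs only populations, distances and the outgoing totals, it uses the same kind of input as a normalization that fixes the number of commuters per location.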
We use a metapopulation modeling approach
Human mobility is described in terms of recurrent daily movements between place of residence and workplace so that the infection dynamics can be separated into two components, each of them occurring at each location
Simulations are fully stochastic, individuals are considered as integer units and each process is modeled through binomial and multinomial extractions (more details on the simulation algorithm are reported in Section 4 in
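As an illustration of simulating integer individuals through binomial extractions, the following minimal sketch advances a single subpopulation by one time step of a stochastic SIR process. It is not the full metapopulation algorithm: the coupling between residence and workplace through commuting is left out, and the parameter names `beta` (transmission rate) and `mu` (recovery probability per step) are assumptions of this example.

```python
import math
import random


def binomial(n, p, rng):
    """Draw from Binomial(n, p) as a sum of n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def sir_step(s, i, r, beta, mu, rng):
    """Advance one subpopulation by one time step.

    Returns (s, i, r, new_infections); all values stay non-negative integers.
    """
    n = s + i + r
    # Per-susceptible infection probability from the force of infection.
    p_inf = 1.0 - math.exp(-beta * i / n) if n > 0 else 0.0
    new_inf = binomial(s, p_inf, rng)
    new_rec = binomial(i, mu, rng)
    return s - new_inf, i + new_inf - new_rec, r + new_rec, new_inf
```

A run can be started with, for example, `rng = random.Random(42)` and repeated calls to `sir_step`; the arrival time in a subpopulation would then be the first step at which its infected count becomes positive.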
Once a set of initial conditions is defined (mobility network,
For each country under study, we assess the coverage of the population in the mobile phone dataset by calculating the national average,
We compare the structural and flux properties of the commuting networks extracted from census surveys with those of the networks extracted from mobile phone records, to test the quality of mobile phone data as a proxy for commuting at the national level. We analyze the topology of the networks obtained from the two sources of data and extract the intersection and its associated travel fluxes. We perform different statistical tests (Spearman's rank correlation coefficient, Lin's concordance coefficient, and the Wilcoxon test) on the correlation between commuting flows connecting any pair of nodes in each dataset, and between the total numbers of commuters per node in each dataset. We also check for non-trivial correlations between the discrepancies found in the two datasets and nodes' populations and distances between connected vertices. The same analysis is run for all countries, at all resolution scales.
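The link-level part of this comparison can be sketched as follows: take the links present in both networks, compute the share of census flux they carry, and evaluate Spearman's rank correlation and Lin's concordance coefficient on log-transformed weights. The dictionary-based network representation and the function names are assumptions of this example; the Wilcoxon test is not included.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    k = 0
    while k < len(order):
        end = k
        while end + 1 < len(order) and values[order[end + 1]] == values[order[k]]:
            end += 1
        for idx in order[k:end + 1]:
            ranks[idx] = (k + end) / 2 + 1
        k = end + 1
    return ranks


def _moments(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, vx, vy, cov


def spearman(x, y):
    """Pearson correlation of the ranks."""
    _, _, vx, vy, cov = _moments(average_ranks(x), average_ranks(y))
    return cov / math.sqrt(vx * vy)


def lin_ccc(x, y):
    """Lin's concordance: penalizes shifts in mean and scale, not only scatter."""
    mx, my, vx, vy, cov = _moments(x, y)
    return 2 * cov / (vx + vy + (mx - my) ** 2)


def compare_networks(census, phone):
    """census, phone: dicts {(origin, destination): weight > 0}."""
    shared = sorted(set(census) & set(phone))
    log_c = [math.log(census[link]) for link in shared]
    log_p = [math.log(phone[link]) for link in shared]
    return {
        "shared_links": len(shared),
        "census_flux_coverage": sum(census[l] for l in shared) / sum(census.values()),
        "spearman": spearman(log_c, log_p),
        "lin": lin_ccc(log_c, log_p),
    }
```

The two coefficients answer different questions: a systematic overestimation of all weights by a constant factor leaves Spearman's coefficient at 1 but lowers Lin's coefficient, which is why both are reported.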
In all realizations and for each subpopulation, we keep track of the following epidemic observables. The temporal information about the epidemic spreading is encoded in the arrival time
The spatial diffusion of the disease is investigated through the epidemic invasion tree representing the most probable transmission route of the infection from one subpopulation to another during the history of the epidemic
Since the stochasticity of the seeding events can induce small weights variations in the invasion paths and thus different invasion tree topologies, for every scenario we build 50 invasion trees, each of them obtained from randomly selecting 400 stochastic realizations out of the total of 1,000 run for each scenario (this approach allows us to minimize the random fluctuations in the final invasion tree with a limited computational effort). We then compare the invasion trees describing the spatial spreading on different mobility networks through the Jaccard similarity index. Given a tree
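A minimal sketch of the tree comparison is given below. It assumes that each invasion tree is represented by its set of directed links (infecting subpopulation, infected subpopulation) and that the Jaccard index is the number of links common to both trees divided by the number of links present in at least one of them. The representation and helper names are illustrative.

```python
from itertools import product


def jaccard_index(tree_a, tree_b):
    """Jaccard similarity between two trees given as collections of links."""
    edges_a, edges_b = set(tree_a), set(tree_b)
    union = edges_a | edges_b
    if not union:
        return 1.0  # two empty trees are treated as identical
    return len(edges_a & edges_b) / len(union)


def mean_jaccard(trees_a, trees_b):
    """Average Jaccard index over all pairs of trees from two scenarios,
    e.g. the 50 trees built for each mobility network."""
    pairs = list(product(trees_a, trees_b))
    return sum(jaccard_index(a, b) for a, b in pairs) / len(pairs)
```

Since every tree over the same subpopulations has the same number of links, an index of 0.5 means that one third of the links in each tree differ.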
Incidence and prevalence curves are defined as the density of new secondary cases and the density of infected individuals at every time step, respectively. From the ensemble of 1,000 stochastic realizations, averages and reference ranges are then evaluated for every location, as well as the peak time of the epidemic.
The census commuting networks for Portugal include (i) 1,643,938 commuters traveling between the 278 municipalities through 25,634 weighted directed connections, and (ii) 469,089 commuters traveling between the 18 districts on a fully connected network. In Spain we consider the provinces' geographical scale only, as constrained by the information available in the census survey. The commuting network is formed by 47 nodes and 722 weighted directed edges, representing the daily travel flows of 537,331 commuters. The commuting networks for France are defined at the district scale (8,019,636 commuters moving along 38,077 weighted directed edges connecting 329 nodes), and at the department level (4,957,193 commuters for 7,994 weighted directed links among 96 nodes). For all countries, at all scales considered, all administrative units are included in the datasets (i.e. they have at least one incoming or outgoing commuting flux to another administrative unit in the country). A summary of the basic statistics of the networks extracted from census data is reported in
country | administrative level | # nodes (census) | # nodes (mobile phone) | # links (census) | # links (mobile phone) | # commuters (census) | # commuters (mobile phone, sample) | # commuters (mobile phone, basic normalization)
Portugal | municipalities | 278 | 278 | 25,634 | 24,846 | 1,643,398 | 452,113 | 5,255,010
Portugal | districts | 18 | 18 | 305 | 306 | 469,089 | 155,137 | 3,525,367
Spain | provinces | 47 | 47 | 722 | 2,146 | 537,331 | 460,211 | 5,181,570
France | districts | 329 | 329 | 38,077 | 60,816 | 8,019,636 | 1,676,103 | 18,750,497
France | departments | 96 | 96 | 7,994 | 8,930 | 4,957,193 | 1,087,856 | 12,198,666
Number of nodes, of links, and of commuters for each commuting network under study, without considering self-loops. Rows correspond to different countries and geographical subdivisions within a country. Columns indicate values from the census dataset and the mobile phone dataset. Commuters for the mobile phone dataset correspond to the values obtained directly from the samples, prior to the normalization procedure, and after the basic normalization procedure. Values obtained with the refined normalization are not reported as they are equal to those of the census dataset, by definition.
Commuting patterns from mobile phone records are extracted from a sample of 1,058,197 anonymous users in Portugal, 1,034,430 in Spain, and 5,695,974 in France. Records referred to 2,068 towers in Portugal, 9,788 towers in Spain, and 18,461 in France. Once mapped onto the administrative units, we find 452,113, 460,211 and 1,676,103 total commuters in the mobile data samples in Portugal, Spain, and France, respectively, corresponding to the lowest administrative hierarchy.
Population tracked by the operators' samples is distributed nationwide and approximately equal to 9% of the census population in Portugal and France, and 2% of the census population in Spain. By taking into account these scaling factors, cell phone population correlates well with the census population at the highest geographical resolution considered, with a Pearson correlation coefficient between the two quantities equal to
Map showing the ratio
Commuting networks obtained from census data and mobile phone activity data share the same number of nodes at all hierarchies considered in all countries, given that all administrative units were covered by both datasets; however, variations are observed in the number of commuting links (
We compare the probability density distributions of the travel fluxes
Top: probability density distributions of the weights (
Restricting our analysis to the topological intersection, a side-by-side weight comparison on each link shows a high correlation between the two datasets (Spearman's rank correlation coefficient >0.7 for the largest administrative units,
country | administrative level | link weights (Lin) | link weights (Spearman) | outgoing commuters per residence location (Lin) | outgoing commuters per residence location (Spearman) | incoming commuters per destination (Lin) | incoming commuters per destination (Spearman)
Portugal | municipalities | 0.42 | 0.64 | 0.41 | 0.93 | 0.41 | 0.88
Portugal | districts | 0.44 | 0.89 | 0.17 | 0.93 | 0.17 | 0.90
Spain | provinces | 0.45 | 0.73 | 0.15 | 0.72 | 0.15 | 0.54
France | districts | 0.53 | 0.67 | 0.56 | 0.93 | 0.53 | 0.95
France | departments | 0.49 | 0.81 | 0.47 | 0.93 | 0.41 | 0.94
Values of Lin's concordance coefficient after a log transformation of variables, and of Spearman's coefficient, measured between the mobile phone network and the census network for the weights (
The correlations found across the various indicators do not ensure the statistical equivalence of the two datasets (a Wilcoxon test for matched pairs would reject the null hypothesis of zero median differences between paired values of the same quantities).
We further analyze whether the observed discrepancies between the weights in the mobile phone networks and the census networks show any dependency on the variables that characterize the underlying spatial and social structure, namely the Euclidean distance between the connected nodes (calculated from the coordinates of the administrative unit's centroid), the population of the origin node and the population of the destination node (
Panels show the ratio between the weights of the mobile phone networks
If we refine the normalization of the mobile phone networks by taking into account the total number of commuters in each administrative unit, the agreement with the census dataset improves in the side-by-side weight comparison on every link (see Section 3 in
We examine whether the observed non-negligible discrepancies in the commuting fluxes of the two datasets are also significant from an epidemic modeling perspective, altering substantially the outcome of disease spreading scenarios. We compare scenarios obtained from stochastic metapopulation models equally defined and initialized, except for the mobility data they integrate (see Methods). In addition to the census commuting network and the mobile phone commuting network, we also consider the synthetic commuting network generated with the radiation model.
Epidemics starting from different seeds in the three countries, and characterized by different values of the basic reproductive number, yield large variations of the Jaccard index value
Comparing the epidemic behavior on the census network and two proxy networks, mobile phone (red symbols) and radiation model (blue symbols), in Portugal (top panels), Spain (middle) and France (bottom).
If the seed is instead located in a peripheral node, values of the Jaccard similarity index always fall below 0.4 in the three countries, and decrease with larger values of the transmissibility.
Mobile phone data performs similarly to the radiation model once the corresponding epidemic models are seeded in a central location, except for the case of Lisbon, and performs better or similar when they are seeded in a peripheral location. If the epidemic starts from a mid-size populated region, the relative performance of the radiation model against mobile phone data in the epidemic outcomes depends on
To test for the role of overestimation of flows, we also performed the same analysis by considering the refined normalization of the mobile phone commuting data that keeps the same total number of commuters per administrative region as in the census dataset and explicitly discounts overestimation biases. The refined normalization allows the mobile phone data to better reproduce the invasion paths obtained from census commuting flows for central and medium-type locations for all
Comparing the epidemic behavior of mobile phone proxy vs. census, when basic and refined normalization are considered. Only the case of France is shown.
When focusing on the time of arrival in a given location, we find a systematic difference between models based on proxy networks and the benchmark model integrating census data. Mobile phone data, overestimating the census commuting fluxes if a basic normalization is considered, leads to a positive difference
By discounting
Epidemic peak times are also affected by the different distributions of commuting flows in the two networks (see Section 2.2 in
On coarser spatial scales (Portuguese districts, French departments), we obtain a higher similarity between simulated results with proxies vs. census (see Section 2.1 in
Next to traditional census sources or transportation statistics, several novel approaches to quantifying human movements have become recently available that increase our understanding of mobility patterns
To systematically test this hypothesis exploiting the full resolution of both the proxy data and the official census data for commuting, we have compared these two datasets in three European countries and performed a rigorous assessment of the adequacy of proxy commuting patterns – extracted from mobile phone data or synthetically modeled – to reproduce the spatiotemporal spread of an emerging ILI infection.
Mobility data from mobile phones is able to capture well the fluxes of the commuting patterns of the countries under study, reproducing the large fluctuations in the travel flows observed in the census networks. In all countries the intersection between the two networks includes the vast majority of the commuting flows and the correlation measured on links' traffic and nodes' total fluxes of incoming or outgoing commuters is high (though not statistically equivalent). This suggests that mobile phone data can be used as a surrogate tracking the commuting patterns of a given country, identifying the relative importance of its mobility connections in terms of flows' magnitude, with a resolution that is equivalent to the one adopted by official census surveys or higher. This is a particularly relevant result for data-poor situations, where census data may not be available and official statistics may not be enough to correctly inform a mobility model.
Discrepancies are however found, especially in the overestimation of commuting flows per link and in the larger variations observed for weaker flows and longer distances, that appear to be responsible for the differences observed in the simulated epidemics.
Epidemics run on mobile phone commuting networks reproduce well the invasion pattern simulated on the census commuting when the seed is located in a central location and
The opposite situation is instead found when seeds are located in peripheral nodes, reporting low values of the Jaccard index. The analysis of the commuting networks has indeed shown that larger discrepancies exist for small weights. Once considered in the framework of an epidemic propagation, such discrepancies are expected to lead to strong differences in the invasion already at the first generation of infected locations. If these locations directly infected by the seed strongly differ, their contribution to the decrease of the similarity of the invasion paths will become increasingly stronger for further generations: different nodes are infected and likely different neighbors of those nodes will be affected by the disease, so that deviations cumulate at each successive step of the invasion (
The full invasion trees for
Diseases with a higher transmission potential would enhance this behavior, as with a large value of
A clear bias, which is observed consistently across all countries and for all resolution scales considered, is the faster rate of spread of the simulation based on the mobile phone commuting network with respect to the census one. This is clearly induced by the larger commuting flows obtained following the extraction of commuting patterns from mobile phone data using a basic normalization. The effect is stronger for
Time of arrival of the infection in a given location is better matched by the epidemic model built on the radiation model, though with large fluctuations for small values of
The ranking of nodes according to time to first infection also improves in the epidemic simulations based on the refined normalization with respect to the baseline one. The similarity in the invasion paths equals (or even improves on) the levels reached once the radiation model is considered. Similar results are therefore obtained from two different sources that nevertheless employ the same type and amount of input data (for calibration/normalization). Jaccard index values nonetheless reveal important differences in the way the epidemic propagates on proxies with respect to census, being
Effects of flows overestimation are visible in the analysis of the epidemic peaks too, but less prominent. The larger number of commuters that travel in the mobile phone networks tends to synchronize the epidemic peak between different subpopulations, leading to shorter overall timespan for all subpopulations to peak in the mobile phone case with respect to census. Differences between the datasets mostly range in a time interval of 2–3 weeks, a time resolution that still allows a meaningful comparison of epidemic results with the average reporting period of standard surveillance systems.
In the case of France and Portugal we have also studied multiple hierarchical levels of the administrative units, by aggregating both datasets. Overall, our analysis indicates that the epidemic behavior on the aggregated proxy networks matches the results obtained on census data better than at the higher resolution level. This is however obtained at the cost of studying the epidemic on a lower geographic resolution, which would then provide less information on the predicted time course of the epidemic and may compromise our ability to use models to extract valuable public health information for epidemic control
The overall picture we presented shows that proxies integrated into epidemic models can provide a fairly good estimate of the ranking of subpopulations in terms of time to first infection. A good agreement in the simulated arrival times is intrinsically related to the calibration and normalization of the proxies, and observed biases can be reduced by using additional information, such as the knowledge of the total number of commuters in each location. On the other hand, the most probable path of infection from one subpopulation to another appears to be affected by more substantial discrepancies between the different sources of data or synthetic flows that cannot be overcome through a simple normalization. To further improve predictions on the path of invasion, we would need to comprehensively understand the causes behind the differences observed in the data analysis. These are inevitably related to the methods used to account for the population sample considered in the mobile phone data and to define the commuting mobility per user.
First, in extracting the commuting behavior of each user from mobile phone data we necessarily have to make assumptions on the identification of home and work locations (in absence of metadata on the user). If we identify these two locations as the two most visited ones
Increasingly sophisticated approaches can also be envisioned, based on clustering methods applied to calling behavior
Second, our basic normalization may be too simplistic, thus inducing strong overestimation because the population sampled through the mobile phone data is not representative of the general population, being characterized by specific different features affecting the resulting mobility behavior. Biases may be induced by mobile phones ownership, with fluctuations strongly dependent on socio-economic status
Small-scale studies targeting specific populations (such as e.g. a city or a college town) with additional metadata accompanying the activity records may possibly shed more light in the identification of such biases.
In poorer countries these effects are expected to be of a larger magnitude, given that mobile phone users still represent a privileged minority of the population
The introduction of a refined normalization to account for the non-representative nature of the mobile phone sample fixes the total number of commuters equally in the two datasets and leads to an improvement of the comparison of the commuting fluxes on a link-by-link basis. Discrepancies on traffic flows along links are however still observed that are responsible for differences in the resulting epidemic observables, even though the overall systematic overestimation obtained with the basic normalization has been discounted. Increasingly sophisticated approaches can be developed that use iterative proportional fitting, fixing two marginal values that need to be assumed, i.e. the total numbers of incoming commuters and of outgoing commuters per location (or additional data, such as points of interest in the case of intra-city commuting)
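Iterative proportional fitting, mentioned above as a possible refinement, can be sketched as follows. Starting from a seed origin-destination matrix (for instance the phone-derived flows), rows and columns are rescaled in turn until the row sums match the assumed outgoing totals and the column sums match the assumed incoming totals. This is a generic illustration of the technique, not a procedure applied in this study; the function name and tolerance are assumptions.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Fit a non-negative matrix to given row and column totals.

    seed: list of lists (origin x destination); zero entries stay zero.
    row_targets: outgoing commuters per origin.
    col_targets: incoming commuters per destination (same grand total).
    """
    matrix = [row[:] for row in seed]
    for _ in range(max_iter):
        # Rescale every row to its target total.
        for i, row in enumerate(matrix):
            total = sum(row)
            if total > 0:
                matrix[i] = [v * row_targets[i] / total for v in row]
        # Rescale every column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in matrix)
            if total > 0:
                for row in matrix:
                    row[j] *= target / total
        # Columns are exact after the last pass, so only rows need checking.
        error = max(abs(sum(row) - t) for row, t in zip(matrix, row_targets))
        if error < tol:
            break
    return matrix
```

The fitted matrix keeps the zero pattern and the relative structure of the seed, so link-level discrepancies in the proxy data can persist even when both marginals agree with the census.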
Third, there may be inconsistencies in the definition of commuting for both datasets, or differences in the year of collection of each dataset. We have no information on users' age in the mobile phone dataset; therefore, movements for work or study are both tracked in users' trajectories. Commuting for study reasons is included in the Portuguese and French census data, whereas the Spanish census reports work-related flows only. The impact of not considering students' commuting in the Spanish case is however estimated to be rather low. Spanish data is indeed collected at a high administrative level (provinces), where students' commuting flows may be very weak given that they are usually more localized than those of workers. Data from France shows that 95% of students (aged <15) travel distances of less than 10 km
Discrepancies in the year of data collection for the two sources range from two years for Spain (2005 is the year of collection of census data, 2007 the year of collection of phone data) to five years for Portugal (2001, 2006). In the case of France, the two datasets belong to the same year (2007). To assess the possible changes in commuting flows with time, we analyzed French yearly data between 2006 and 2009, given their availability (see Table S1 in
Finally, the epidemic model considered adopts some approximations that we would like to discuss in the following. Even if countries under consideration belong to a contiguous area in continental Europe, numerical simulations for the epidemic spread were performed for each country in isolation. This choice is driven by the lack of mobile phone data for cross-border movements (given their national nature), and by the negligible fraction of commuting across countries with respect to national commuting (about 780,000 people in the EU, including EEA/EFTA, were cross-border commuters in the year 2006/2007
The modeling approach we proposed was fairly simple and did not consider additional substructure of the population, interventions, change of behavior or weekend vs. weekday movements. Our aim not being to reproduce historical epidemics, we chose to include only the basic ingredients that were the object of the analysis in order to achieve a clearer understanding and interpretation of results. Simulations were performed assuming a continuous series of working days, given the purpose of the study and the knowledge that the inclusion of weekend movements has little or no effect in the resulting epidemic profile
Our study was performed on three European countries, and we expect that our conclusions are applicable to other developed countries in the world characterized by similar cultural, social, and economic profiles.
Our approach for the extraction of commuting patterns from mobile phone data was based on minimal assumptions in order to facilitate its generalizability in other settings where data knowledge may be limited or completely absent. Further work is necessary to extend this work to the analysis of the adequacy of mobile phone data as proxy for human mobility in underdeveloped countries where cultural and socio-economic factors may affect differently the biases here exposed. We also note that diseases other than ILI may be of higher interest for these regions, and in that case the relevant mobility mode and epidemic model would need to be updated in the approach we presented.
For instance, the transmission of the disease under study may be strongly affected by seasonal forces, such as the variations in human density and contact rates due to agricultural cycles that drive the spatial spread of measles in Niger
We provide the mobile phone commuting dataset in .zip format containing the OD matrices extracted from the mobile phone data at the highest geographical resolution in Portugal, Spain and France. Specific information about the format of the data is inside the archive file.
(ZIP)
The file contains: additional information on data sources (Section 1). Additional results for lower geographic resolutions and epidemic peak times (Section 2). Sensitivity analysis on cross-border commuting, refined definitions of workplace and residence, refined normalization of phone data (Section 3). Additional details on the simulation algorithm (Section 4).
(PDF)
GKKK would like to thank the ISI Foundation in Turin for its hospitality during the time this work was undertaken.