Skip to main content
  • Loading metrics

Spatially resolved simulations of the spread of COVID-19 in three European countries

  • Andrea Parisi ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Writing – original draft

    Affiliation School of Life Sciences and Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), University of Warwick, Coventry, United Kingdom

  • Samuel P. C. Brand,

    Roles Methodology, Writing – original draft

    Affiliation School of Life Sciences and Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), University of Warwick, Coventry, United Kingdom

  • Joe Hilton,

    Roles Data curation, Methodology, Writing – original draft

    Affiliation School of Life Sciences and Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), University of Warwick, Coventry, United Kingdom

  • Rabia Aziza,

    Roles Methodology, Writing – original draft

    Affiliation School of Life Sciences and Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), University of Warwick, Coventry, United Kingdom

  • Matt J. Keeling,

    Roles Conceptualization, Data curation, Funding acquisition, Methodology, Writing – original draft

    Affiliation School of Life Sciences and Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), University of Warwick, Coventry, United Kingdom

  • D. James Nokes

    Roles Conceptualization, Funding acquisition, Methodology, Writing – original draft

    Affiliations School of Life Sciences and Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), University of Warwick, Coventry, United Kingdom, Kenya Medical Research Institute (KEMRI) - Wellcome Trust Research Programme, Kilifi, Kenya


We explore the spatial and temporal spread of the novel SARS-CoV-2 virus under containment measures in three European countries based on fits to data of the early outbreak. Using data from Spain and Italy, we estimate an age dependent infection fatality ratio for SARS-CoV-2, as well as risks of hospitalization and intensive care admission. We use them in a model that simulates the dynamics of the virus using an age structured, spatially detailed agent based approach, that explicitly incorporates governmental interventions and changes in mobility and contact patterns occurred during the COVID-19 outbreak in each country. Our simulations reproduce several of the features of its spatio-temporal spread in the three countries studied. They show that containment measures combined with high density are responsible for the containment of cases within densely populated areas, and that spread to less densely populated areas occurred during the late stages of the first wave. The capability to reproduce observed features of the spatio-temporal dynamics of SARS-CoV-2 makes this model a potential candidate for forecasting the dynamics of SARS-CoV-2 in other settings, and we recommend its application in low and lower-middle income countries which remain understudied.

Author summary

First detected in China in December 2019, the SARS-CoV-2 virus rapidly spread around the world causing a major pandemic and over 300,000 deaths by the end of May 2020. In response to the pandemic, many governments issued measures aimed at containing the spread of the virus and limiting the expected number of deaths. Our goal is to have a model capable of reproducing the observed spatial and temporal spread of the virus, based on the estimate of a limited number of local parameters. We present a detailed model capable of reproducing the spread of the disease in three selected countries. We use a description of the population subdivided in age groups, high resolution population maps, and household structures: we take into account the measures imposed by government and their consequences on the social and mobility patterns of the population. Based on data from Spain and Italy, we estimate hospitalisation and death risks, as well as the infection fatality ratio of the disease. The resulting model has a handful of parameters that need to be estimated and has thus the potential to be used in different contexts, in particular in middle and low-income countries.


The first cases of COVID-19, caused by the novel severe acute respiratory syndrome coronavirus (SARS-CoV-2), were reported in Wuhan, China in December 2019. Rapidly causing a local epidemic, it spread around the world within a few months causing a major pandemic which by the start of May 2020 had resulted in over 1,000,000 cases and over 300,000 deaths worldwide. Early data from China suggested levels of transmissibility and virulence comparable to those observed in the Spanish flu pandemic of 1918–19, which caused an estimated 20–50 million deaths worldwide [1]. Despite causing mild infections in the majority of cases, a small but considerable fraction of infected individuals require hospitalization and, in the most severe cases, intensive care with mechanical ventilation. Estimates of the case fatality ratio (CFR) show a dramatic variation across populations [2], likely reflecting the diverse approaches to case detection adopted by different countries. In the absence of an effective vaccine able to fend off the infection, many countries have adopted stringent non-pharmaceutical control measures designed to reduce transmission rates by limiting the frequency of person-to-person contact. These measures are designed to limit the total mortality due to COVID-19 and to confine the epidemic to a level that does not overwhelm national health systems.

As a novel infectious agent, many characteristics of SARS-CoV-2 are still poorly understood and are the object of close scrutiny by the research community worldwide. Early estimates of the basic reproductive number in Hubei province, China, varied between 2.2 [3] and 3.11 [4], with estimates in European countries ranging between 2.5 and 3.0 [57]. Measurements of the doubling time ranged between 2.9 and 7.4 days [3, 5]. The majority of infections are asymptomatic, but there is uncertainty over the proportion of asymptomatic infections, with reports ranging between 4.5% and 64% [811]. A study conducted in the municipality of Vo’ Euganeo, at the centre of the Italian outbreak, in which the whole population was tested irrespective of symptoms across two consecutive surveys found the mean proportion of asymptomatics to be between 40–45% [10]. An outbreak onboard the cruise ship Diamond Princess led to most passengers and staff tested for COVID-19, with 51% of laboratory confirmed cases being asymptomatic [12] and a subsequent modelling study estimating the asymptomatic proportion 74% (95% C.I. 70–78%) [13]. The proportion of cases which are asymptomatic is of interest because of their role in the transmission of the virus within the population. However, so far age stratified fractions of asymptomatic cases remain difficult to obtain. Evidence from several studies [10, 14] suggest that children are less likely to be affected by coronavirus, and data from most countries show that the rate of severe cases in young children is low compared to the mature population, although it is still unclear whether this reflects lower susceptibility or higher rates of asymptomatic infection among children. A detailed survey conducted by the Spanish Government [15] showed that only 1% of children below 1 year of age were infected by the SARS-CoV-2 virus in Spain; this percentage increased with age to 2.2% for 1–4 years old children, and to higher percentages for older individuals, up to almost 7% for individuals aged 70–74. While this could reflect a lower susceptibility to the virus among young individuals, decoupling this variation from age-structured mixing effects requires a detailed mathematical analysis. Additionally, the Spanish survey showed that the number of infected individuals is substantially larger than the number of detected cases, suggesting that large numbers of undetected asymptomatic or mildly symptomatic cases may have driven the outbreak.

An age dependent fraction of infected individuals suffers from severe symptoms that require hospitalization, and a fraction of these require mechanical ventilation in intensive care units. Estimates of hospitalization rates based on Chinese data as well as the infection fatality ratios were estimated in Verity et al. [16]. Based on these, a set of estimates for hospitalization and admissions in intensive care were used by Ferguson et al. [17]. Recent data coming from the first highly infected European countries (Italy and Spain) provide space for further estimates. By combining recent information on cases, hospitalizations, intensive care admissions, deaths with results from the Spanish survey, we are able to build updated age stratified estimates of the ascertainment rate, rate of hospital admissions, intensive care admissions and corrisponding death rates.

A number of interventions were progressively implemented by most countries worldwide to counter the wide spread of the virus. These included stay-at-home measures, shielding of most-at-risk population, closure of schools and higher level educational institutions, teleworking, cancellation of mass gatherings, closure of shops, including recreational and food establishments [18]. The underlying idea of these interventions is to break the chain of transmission of the virus by reducing contact between individuals. In most countries, the interventions adopted resulted in a substantial reduction of the disease morbidity and mortality. The adoption of these interventions has typically been driven by intense modelling work carried out by national teams as well as the wider international research community. Analysis of big data provided by some large companies [19, 20] has been an important component of the modelling effort.

In this work, we present a detailed model of the spatial spread of SARS-CoV-2 virus under non-pharmaceutical interventions that explicitly incorporates governmental interventions, changes in mobility and contact patterns as implied by measurements by big data operators [19] as well as demographic, internal human mobility and age mixing information. Our aim is to have a model that could be readily used on a range of different countries, including middle and low income countries: we thus apply the model to three European countries, Italy, Spain and the United Kingdom, to assess its capability of properly reproducing observed data, and to obtain estimates of key parameters that are, as much as possible, country independent. We use data from the first serological survey performed in Spain [15] as well as data on Intensive Care Units (ICU) admissions and deaths in Italy [21] to infer rates of detection of cases, severe and critical admissions to hospital, as well as death rates due to SARS-CoV-2. We then use the model to estimate, for each country, an unrestricted R0, which is the value that the basic reproductive number would achieve in the absence of interventions. We project the R0 of each country to a common reference country, thus removing the influence of country specific contact matrices. We also estimate a set of age-dependent susceptibilities independently for each country, resulting in a overall set of age stratified susceptibilities which agrees in all the countries under study. Output from our simulations is capable of reproducing the spatio-temporal spread of the disease in the studied countries, with some disagreement on the spatial aspect for Spain due to specific features of human mobility in that country.


The simulation program uses a gridded spatial description with a resolution of 5km. In each grid element, age stratified estimates of the resident populations were obtained from the WorldPop database [22]. Maps available with a 100m resolution were coarsened to 5km: this choice was mainly due to computational efficiency. Each grid cell is treated as a well mixed population, whereas transmission between distinct grid elements is driven by human mobility. The model uses an agent-based description in which individuals have a set of preferred locations and an associated frequency of visit which are assigned at the start of the simulation. Each day individuals move to one of their preferred locations and participate in the local transmission dynamics. The list of preferred locations and their frequency of visit and duration of stay are built in accordance to the fluxes between grid elements predicted by an appropriate model for human mobility.

Epidemiological model

In each location, the dynamics follows a modified age stratified SEIR model that includes further classes for hospitalization (severe cases) and cases in intensive care (critical cases), for a total of 9 age groups and 11 distinct epidemiological classes: a schematic description of the model is shown in Fig 1, while the full model is detailed in S1 Text. The model distinguishes between two kinds of infectives: detected cases Ij and undetected Uj. Detected cases may evolve into severe or critical cases, but otherwise both categories have the same infectiousness. All rates we derive may be expressed as rates per infected individual, regardless of their symptomaticity. We do distinguish between detected and undetected cases to provide corresponding estimates of prevalence.

Fig 1. Schematic and simplified description of the epidemiological model.

The full model is age-structured, and includes special handling of transmission in households as well as separating the transmission of work environments from other kind of transmission. S: susceptible; E: exposed; I: infective case; U: undetected case; H: hospitalized; C: critical; W: delay for hospitalised or critical; Q: quarantined at home; D: death; R: recovered; Z: removed asymptomatic. The following parameters are used: β the rate of transmission; 1/σ average latent period, 1/γ average time to recovery; 1/γ + 1/ρ average time to hospitalization; 1/μ average permanence in hospital; 1/ζ average stay in quarantine; z risk of becoming a detected case once exposed; q risk of being quarantined if detected case; h risk of being a severe case if hospitalized; dh death risk of severe cases; dc death risk of critical cases. Further details on the model are discussed in S1 Text.

Transitions between compartments are driven by stochastic events: a direct Gillespie algorithm is used to integrate the stochastic dynamics, with the exception of transmission within households which are handled separately. Infections are driven by contacts between susceptibles and infectives, either detected or undetected. We use age-mixing patterns estimated by Prem et al. [23]: these consist of distinct estimated age-mixing matrices for home contacts, work, school and other kind of contacts. We take advantage of this subdivision to implement different control strategies. Given a susceptible in class Si, the force of infection is given by: (1) where R0 represents a reproductive number and is the maximum eigenvalue of the contact matrix of a chosen reference country. This approach allows projecting estimates of the basic reproductive number onto a common reference country, thus discounting differences in contact patterns between the countries under study, and was used previously in Hilton and Keeling 2020 [24]. The choice of the reference country is completely arbitrary, thus we choose Italy as our reference country as it was the first to experience an outbreak in Europe. The age dependent susceptibilities σi are estimated from simulations. We consider work and non-work transmission separately, thus the type superscript may indicate one of the following two contact matrices: and . Due to this separation, we can implement stay-at-home interventions that affect one kind of transmission but not the other. The contribution from any of these matrices may be reduced or increased during the simulation. A full list of model parameters values and associated literature is available in S1 Text.


Home transmission is not included in the above, and is handled differently. Describing within-household transmission using home contact matrices in an age-stratified model is reasonable during early unconstrained transmission, but will overestimate transmission later in the dynamics because it ignores the finite nature of within-household contacts, which are removed from an infectious individual’s pool of potential secondary infections once they themselves are infected. To overcome this issue, we include a description of households based on the dynamic generation of synthetic households. When an individual is infected, the simulation program will dynamically generate a household by linking together unrelated individuals. Suppose an individual in age group j not previously included in any household is infected, a host household is generated on-the-fly. First, a household size is chosen from a Poisson distribution with mean equal to the country mean household size. Then, susceptible individuals from age groups i are added to this household, randomly chosen according to the discrete probability distribution . The choice of choosing susceptibles is driven by the fact that individuals from other classes would have already been involved in the process of forming synthetic households. Once a household is built, transmission from infectious to susceptible household members is attempted on a daily basis with a probability corresponding to 1 − exp(−βfhhΔt), where Δt is set to 1 day, , and fhh is an enhancing factor that takes into account increased transmission due to school closures and home working. Household members who quarantine or are hospitalized are removed from the transmission chain to free up computational resources, while the rest of the household structure is maintained for future infection events. If during the course of the simulation the search for members of new families runs out of susceptibles, the simulation program retries searching for new members up to a limited number of attempts. Beyond that value, we assume that remaining household members must be recovered individuals and the effective household size is reduced. The maximum number of attempts is limited to 20 for efficiency.

The use of a household structure alters the force of infection: a household will tend to contain the wider spread of the disease, especially if a quarantine or stay-at-home policy is enabled. Since we are breaking the local full mixing of the population, the epidemic curve will follow a less steep growth rate. Thus to make sure that the model reproduces the correct growth rate given the parameters estimated using well-mixed models, a multiplicative term to the force of infection is introduced that bridges between the two descriptions. The term, calculated using a semi-analytical approach and checked with simulations, is discussed in detail in S1 Text, where we also discuss a modified algorithm capable of producing more realistic household structures.

Human mobility

We estimate the movement of individuals within each country using the Radiation Model [25, 26] fitted on commuting data. Individuals have a set of preferred locations, with an associated frequency of visit, which is assigned at the start of the simulation based on the fluxes between grid elements estimated by the radiation model [27, 28]. Each day a destination is chosen for each individual among his set of preferred locations according to their frequency of visit, and the individual is moved to that preferred location where it participates to the dynamics of that cell for a duration corresponding to the average work day. The individual is then moved back to his original location.

Data on human mobility is available for all countries investigated, through their respective National Statistical Institutions, in the form of commuting matrices between national political subdivisions. This is typically in form of estimates of the number of individiduals who reside in one subdivision and work in another subdivision. For Italy, information on place of study (for 16+ years old) is also available. This information is thus work oriented, however it is worth noting that even in this case it is still an approximation to the true fluxes, since it misses mobility of individuals outside of the typical work shift. Despite these caveats these dataset may be used effectively for the description of the spatial spread of infectious diseases at large scales [29].

Our simulation model uses grid elements as basic areal units, thus we identified the grid cells corresponding to the different national subdivisions specified by the commuting matrices, and calculated the corresponding aggregated fluxes. These fluxes depend on the fraction of the population participating in the global commuting between regions which is a model parameter [27, 28]: the optimal value was identified by maximizing the common part of commuters—a measure of similarity of flow estimates based on the Sørensen index [3032]. Further details are available in S1 Text.

Ascertainment rate and risks of hospitalization and death

We derive the risk for a detected case to be hospitalized as a severe or critical case from age stratified data of Spanish surveillance [33]. We further derive age stratified death risks for severe and critical cases based on both Spanish and Italian data [21, 33]. Finally, we use the Spanish serological survey to estimate ascertainment rates (the percentage of detected cases). By appropriately combining risk percentages with the ascertainment rate, we derive the risk of death for an infected individual, or Infection Fatality Ratio (IFR). Table 1 shows the resulting rates, while all calculations required are detailed in S1 Archive.

Table 1. Estimates of ascertainment rate, risk of being severe or critical for detected cases, risk of death for critical or severe cases, and infection fatality ratio (95% confidence interval in brackets).

In Grasselli et al. [21], which we use to estimate death risks, individuals up to 39 years old are grouped into two groups corresponding to 0–19 and 20–39. No deaths were registered in the youngest group: rates for the 20–29 and 30–39 were obtained from the second group. The Infection Fatality Rate is obtained by appropriately combining the other five columns.


A number of interventions are implemented following approaches used by various governments to control the disease as well as information available from Google COVID-19 Community Mobility Reports [19]: here we detail the interventions available (contact tracing, social distancing, school closures, stay-at-home including work from home, travel restrictions), while details of the interventions implemented for each country are discussed in S1 Text. All interventions may be implemented immediately at the time the interventions are activated, or with a linear increase from the time of activation during a specified number of days.

Contact tracing.

A rudimentary contact tracing is implemented by recording contacts of each individual being infected and entering the latent class. As the individual progresses to removal, a fraction of his contacts are isolated and quarantined. There is very limited information available on the effective delay from detection to quarantine of contacts. Ideally this is supposed to start within 24 hours of interview with the case, but anectodal evidence during the early stages of the epidemic suggested that times were longer. We assumed in this context a permissive scenario, with a 3 days delay from detection to quarantine of contacts. All three countries suspended contact tracing early in the epicemic, thus trials with lower delays did not affect significantly the results.

Social distancing and school closures.

Social distancing consists in reducing physical contact between people with the aim of reducing transmission. We implement it as a generalized contact reduction of Kother. Closure of schools is implemented by reducing the transmission in schools, typically to zero. In this model schools are not explicitly described, thus reduction of transmission is implemented by multiplying the school contact matrix Kschool by a reduction factor.

Home working and stay-at-home.

We consider home confinement for both working and non-working population (the stay-at-home idea). Working individuals may be asked to stay at home and limit their main interaction with member of their household. Similarly, non-working individuals may be asked to remain home. Both confinement measures are implemented mechanistically: we assume that a fraction fw of working individuals will stay at home and not commute or interact for work, having work transmission reduced to zero. Similarly, individuals who are staying at home due to social distancing, will cease to interact socially and will only have contacts within their household.

Household transmission.

The model uses synthetic households as discussed previously. We assume that due to school closures, home working and stay-at-home, household transmission may increase: this is implemented by tuning the multiplicative factor fhh introduced earlier.

Travel restrictions.

Travel restriction may be implemented by reducing the probabilities pij underlying the radiation model (Ref. [25] and S1 Text). Note also that individuals who do not go to work or are quarantined do not move. The application of certain restrictions (for instance home working and stay-at-home) will indirectly contribute to travel reduction. Travel restriction may also be implemented by preventing individuals from moving between certain areas of the country, typically between administrative subdivisions. Travel restrictions are country specific and typically require specific implementations.

Fitting methods and epidemiological data

We use an approach based on recent work on Gaussian optimization detailed in Refs. [34] and [35], and adapted to our specific needs. The method is particularly advantageous as it reduces the computational time required for the fitting, making highly detailed modelling available to fast exploration of model features. We also use a Sequential Monte Carlo [36] algorithm for verification of the goodness of the fits or in alternative to Gaussian optimization. Both methods are detailed in S1 Text. We fit the basic reproductive number R0, the recovery rate γ, and the initial condition by setting an importation rate modulated by province (for Spain and Italy) and Local Authority (for the United Kingdom) and the time span during which importations occur. The four parameter and the assumptions on their priors are listed in S1 Text. Since rates of case confirmation in Italy, Spain and UK may differ, we fit on daily national death counts.

We fit age-dependent susceptibilities using an approximated approach based on the eigenvector corresponding to the maximum eigenvalue of the contact matrix. Specifically, we build a pseudo contact matrix which reproduces the age distribution of cases obtained from a simulations with unitary susceptibilities, and then we fit a set of susceptibilities that alters the age distribution to reproduce the one from data. The method is detailed in S1 Text.


The analysis of the data from Spain and Italy allows us to estimate ascertainment rate of cases, rates of severe and critical cases and rates of death, to finally determine the underlying infection fatality ratio (Table 1). The latter is the rate of fatality per infected individual, regardless of it being detected or not. A corresponding infection-fatality rate for reported cases can be estimated by dividing the IFR by the ascertainment rate. In S1 Archive we show that the IFR obtained can reproduce observed age distribution of deaths in European countries within reasonable accuracy. In addition to these rates, we also estimated age-dependent susceptibilities for each of the countries discussed: the results are consistent among countries, and allow us to derive average susceptibilities to the SARS-CoV-2 virus usable as reference in future works (Table 2). The results show that susceptibility to the SARS-CoV-2 virus is low for individuals up to 50 years of age, and then climbs rapidly with age.

Table 2. Estimated relative susceptibilities for Italy, Spain and the United Kingdom with respect to the 80+ age group.

The last column is an average of the previous three that can be used as reference in future works. Values in brackets represent the 95% confidence interval.

We estimted, for each country, the unrestricted R0 appearing in the force of infection (1), which would be the R0 of the disease measured in the reference country in the absence of any containment measure. This represents a reference estimate that is independent of the specific contact pattern of the country under study. We find that the three countries lead to similar estimates of R0 (Italy: 4.2, 95% CI [3.4–5.2]; UK 4.1, 95% CI [3.2–5.3]; Spain 4.6, 95% CI [3.9–5.8]), while the number of predicted cases is in line with what observed in data.

The results for the spatial distribution of cases show a remarkable similarity with data, provided the description of human mobility is adequate. In Italy (Figs 2 and 3), the density of cases remains high in the Northern part of the country, especially in the Lumbardy, Veneto and Emilia-Romagna regions which were the areas of Italy that registered the highest number of cases during the peak of the epidemic. In simulations, clusters of cases are also observed in Rome, Naples and other densely populated areas of the south, however the fraction of infected population of these areas remains low. The difference may reflect a fundamental difference in the geographic distribution of the population between the highest infected areas and the rest of the country: the affected areas in the north of Italy form a continuous densely inhabited hinterland, with population densities among the highest in Europe (Lumbardy alone counts one sixth of the population of Italy). This extended area of interconnected highly population densities, coupled with measures that curbed mobility within the country, may have favoured the local spread of the disease whilst containing it within the affected regions.

Fig 2.

(A) Country level dynamics of SARS-CoV-2 inferred by death incidence data for Italy. Fitted parameters are R0 = 4.2 (95% CI [3.4—5.2]), t0 = 17 (95% CI [1—46]) days, τγ = 5.7 (95% CI [1.7–7.9]), and log ω = −7.5 (95% CI [-10—-4.6]) days−1. (B) Age distribution of cases predicted by the model and comparison with data from Italy.

Fig 3.

(A) Comparison between predicted and observed spatial spread of SARS-CoV-2 cases for Italy. Base layer of maps available from (B) Frames from an average of 10 simulation for Italy, corresponding to 7-days cases at 7-days intervals ending on 1st March, 15th and 5th April: top frames show the average incidence, bottom frames show the incidence as a fraction of the population. Maps are based on a coarsening to 5km of the WorldPop maps available at

Results for the United Kingdom (Figs 4 and 5) also show a remarkable resemblance to the observed spatial cumulative incidence. The initial stages of the epidemic are dominant in the highly populated areas of England, and spread towards lower populated areas during the declining phase following restrictions. The United Kingdom was subject to several importations from abroad (evidence of this is also suggested by a recent report on genomic studies [37]) that led to a widespread distribution of cases. The United Kingdom presents a high population density in and around large centres in England, with lower population densities in Wales and Scotland; restriction in the United Kingdom were less severe than those imposed in Spain and Italy, and mobility level were higher than those observed in the other two countries, especially towards the end of the lockdown period: in simulations, this is reflected in the spread towards low populated areas during the later stages of the declining phase.

Fig 4.

(A) Country level dynamics of SARS-CoV-2 inferred by death incidence data for the United Kingdom. Fitted parameters are R0 = 4.1 (95% CI [3.2—5.3]), t0 = 31 (95% CI [6—54]) days, τγ = 3.9 (95% CI [0.3–7.5]), and log ω = −5.2 (95% CI [-7.6—-2.8]) days−1. (B) Age distribution of cases predicted by the model and comparison with data from the United Kingdom.

Fig 5.

(A) Comparison between predicted and observed spatial spread of SARS-CoV-2 cases for the United Kingdom. Base layer of maps produced from data sets available at and (B) Frames from an average of 10 simulation for the United Kingdom, corresponding to 7-days cases at 7-days intervals ending on 26th March, 23rd April and 28th May 2020: top frames show the average incidence, bottom frames show the incidence as a fraction of the population. Maps are based on a coarsening to 5km of the WorldPop maps available at

Simulation for Spain (Figs 6 and 7) fail to reproduce the correct spatial spread of the SARS-COV-2: this is likely a direct consequence of the inadequacy of the radiation model to correctly describe mobility within the country. Direct inspection of commuting data for Spain reveals that most commuting occurs between a few specific provinces (for instance between Madrid and a few of its neighbouring provinces), whilst the rest of the country is subject to a substantially lower level of transmission. This makes it hard to any model based purely on distance and population size, like the gravity model, the radiation model and their descendants, to provide a correct description of mobility that is valid countrywide. This discrepancy in the description of the spatial spread might be behind the discrepancy observed in the age distribution of cases in Fig 6B. In our simulations, the highest populated area of Madrid has the highest count of cases, however rural and less densely inhabited areas of the country are subject to diffused local transmission. The proportion of the elderly population in rural areas is higher than that in the area of Madrid, which explains why our simulations produce a higher proportion of cases in the elderly population and a relative smaller proportion in other age groups. Similarly to the United Kingdom, importation of initial cases was more widespread in the country with respect to Italy, with cases occurring in provinces with low density population. Mobility of individual greatly diminished in the middle of March but increased again during the following weeks: spatial spread on short scales can be observed during the declining phase of the epidemic.

Fig 6.

(A) Country level dynamics of SARS-CoV-2 inferred by death incidence data for Spain. Fitted parameters are R0 = 4.6 (95% CI [3.9—5.8]), t0 = 28 (95% CI [6—51]) days, τγ = 3.9 (95% CI [0.4–7.6]), and log ω = −7.4 (95% CI [-9.3—-5.7]) days−1. (B) Age distribution of cases predicted by the model and comparison with data from Spain.

Fig 7.

(A) Comparison between predicted and observed spatial spread of SARS-CoV-2 cases for Spain. Base layer of maps available at (B) Frames from an average of 10 simulation for Spain, corresponding to 7-days cases at 7-days intervals ending on 19th March, 2nd April and 23rd April: top frames show the average incidence, bottom frames show the incidence as a fraction of the population. Maps are based on a coarsening to 5km of the WorldPop maps available at


Our estimates of the IFR obtained from Spanish and Italian data differ from the early estimates of Verity et al. [16]: the rates found suggest a lower IFR for middle aged individuals (30–60) and a higher IFR for 80+ individual with respect to the Verity estimates. Our estimates take advantage of the Spanish serological survey, and thus do not require to estimate the fraction of asymptomatic infections that occur within a population—a fraction that still today is not well quantified. Nevertheless, our IFR reproduce reasonably well the age distribution of deaths registered in several countries. We estimated also an age-dependent susceptibility that shows a marked dependence on age. The estimated susceptibilities confirm that young individuals are less susceptible than older individuals to the virus, and they further suggest that this low susceptibility is shared by middle aged individuals (30–60 years old) that instead account for a large fraction of cases. The reason of this apparent contradiction is acutally due to the fact that in our modelling approach we distinguish between working and non-working individuals, with the first subject to more mixing due to differential measures imposed by government.

Previous studies have reported a range of susceptibilities as discussed in the review by Russell et al. [38]. The authors of the review report an odd of being an infected contact in young individuals less than 19 year old compared to adults of age greater than 20 years old of 0.56 (95% CI 0.37—0.85), pooled from all reported studies. Based on our susceptibilities, one can build the average susceptibility ΘY of age set Y. which is simply an average over the population sizes ΘY = ∑jY Nj σj/∑jY Nj, where Nj is the size of age group j. The ratio Θ≤19 y.o.≥20 y.o. is the relative odds ratio of being an infected contact between children and adults: focussing for simplicity on Spain, one gets a value of ≈ 0.43 which is not far from the value found by Russell et al. [38].

The model uses an agent based description of the country population that incorporates households, as well as interventions adopted by the various government, or changes in contact patterns measured by social data. Some of the parameters we use are based on previous estimates: for instance the age-mixing matrices are based on the work of Prem et al. [23] which are projections from socio-economical data. Reduction in contact patterns are based on estimates from social data which we interpret as a direct measure of such reduction. The simulation model includes importations from other countries and takes into account, to some extent, the reduction of air travel between countries: the spatial distribution of imported cases reflects the distribution of the initial cases observed in each country. Super spreading events are not explicitly included in this approach, but their effects are implicitly incorporated by the modulated importations in the initial stages of the simulation.

We adapt modern methods for Bayesian optimization that provide fast estimates of model parameters, permitting to simulate and estimate parameters for a country of roughly 60 million individuals using detailed agent based simulations in a relative short simulation time. Despite the simulations being parallelized, the version of this fitting algorithm is currently not parallelized: methods for parallelizing Gaussian processes do exist. The simplest option would be to draw multiple points from the current posterior variance and progressively update it as the evaluation of each point completes. The Sequential Monte Carlo algorithm for parameters inference is instead fully parallelized, being however computationally more expensive.

Our simulation model is capable of reproducing several features of the spread of COVID-19 in three distinct European countries: the spatial spread, the temporal dynamics of the epidemic, the age distribution of cases. We estimated an unrestricted and country independent basic reproductive number R0 (referenced to Italy) which is higher than previous estimates in European countries. The difference is likely the result of the complexity of the model that deviates from a fully mean field description of the spread of the disease. Our spatial description uses highly resolved spatial maps, and individuals have preferential contacts in the local neighbourhood, thus reducing the effective population that each individual can see. The introduction of movement control in all of these countries caused a further containment of human mobility and thus a further reduction of the effective population that each individual would see: this, combined to other aspects of our model like the fact that we force individuals to stay at home during lockdowns, make these deviation even more exacerbated. It is however possible to measure the effective reproduction number Re from the early growth of our simulations. This value is a direct measure of the basic reproductive number based on a well mixed description of the population and is thus better placed for a comparison with earlier estimates. If λ is the rate of growth during the early phase of the epidemic, then Re = (λ/γ + 1)(λ/σ + 1) [39]. For the three countries we get an estimate of Re (Italy: 3.0, 95% CI [1.1–4.9]; UK 2.6, 95% CI [0.6–4.6]; Spain 2.9, 95% CI [0.9–4.9]) remarkably in line with those reported in earlier studies [5, 6, 40].

The spatial spread of the disease is correctly described as long as mobility can be described by an underlying model for human mobility. The latter was modelled using the radiation model, preferred to the gravity model due to the need of aggregating fluxes between grid elements up to the level of counties. For Italy and the United Kingdom, the radiation model achieves a value for the Common Part of Commuters, a measure of similarity between predicted and real commuting fluxes, of about 0.65 which, considering the use of an aggregation process from base areal units, is surprisingly in line with the typical performance of both the radiation and gravity model. However, the limitations of the radiation model are evident in the description of mobility in Spain that only scores a value for the Common Part of Commuters of 0.21. Possibly, the use of more recent models for human mobility might mitigate such discrepancies [31].

The agreement of the R0 obtained from the independent fits performed in each country as well as the capability of the model of reproducing the observed data suggest its possible use in other contexts. One possibility could be to estimate the impact of future interventions in the countries studied in this work or other European countries, provided age-stratified information on cases and deaths from COVID-19 as well as information on the mobility of individuals are available for such countries. Another possibility could be its use in Low and Middle income countries: the key ingredients that would permit this are the availability of contact matrices from the work of Prem et al. [23] for almost all countries worldwide, the assumption that age-dependent susceptibility does not depend on anything other than age, and similarly the assumption that the IFR of COVID-19 derived from the Spanish data could be exported to other countries. Under these conditions the model could be used to explore the spread of COVID-19 in African and South American countries. Data on human mobility should also be available to permit studies of spatial spread, but since this is often not readily available in these contexts, with a bit of effort one could use spatially detailed time series of detected cases to infer the level of human travel by fitting an appropriate model for mobility.

Supporting information

S1 Text. Extended description of methods and data sources.


S1 Archive. Estimates of age dependent ascertainment rates, risk of severe, critical cases, death risks and infection fatality ratio from Spanish and Italian data.

Includes the R markdown for producing estimates and the required data files.



The authors wish to thank Erin Gorsich, Trystan Leng, Hector McKimm, Bridget Penman and Emma Southall for sharing their analysis of recent literature of COVID-19.


  1. 1. Patterson KD, Pyle GF. The geography and mortality of the 1918 influenza pandemic. Bulletin of the History of Medicine. 1991;65: 4–21. pmid:2021692
  2. 2. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. The Lancet Infectious Diseases. 2020;20: 533–534. pmid:32087114
  3. 3. Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, et al. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus–Infected Pneumonia. New England Journal of Medicine. 2020;382: 1199–1207. pmid:31995857
  4. 4. Read JM, Bridgen JR, Cummings DA, Ho A, Jewell CP. Novel coronavirus 2019-nCoV: early estimation of epidemiological parameters and epidemic predictions. medRxiv:2020.01.23.20018549 [Preprint]. 2020 [cited 2020 May 15].
  5. 5. Riccardo F, Ajelli M, Andrianou XD, Bella A, Manso MD, Fabiani M, et al. Epidemiological characteristics of COVID-19 cases and estimates of the reproductive numbers 1 month into the epidemic, Italy, 28 January to 31 March 2020. Eurosurveillance. 2020;25: 2000790. pmid:33303064
  6. 6. Guirao A. The Covid-19 outbreak in Spain. A simple dynamics model, some lessons, and a theoretical framework for control response. Infectious Disease Modelling. 2020;5: 652–669. pmid:32869008
  7. 7. Salje H, Kiem CT, Lefrancq N, Courtejoie N, Bosetti P, Paireau J, et al. Estimating the burden of SARS-CoV-2 in France. Science. 2020;369: 208–211. pmid:32404476
  8. 8. Porcheddu R, Serra C, Kelvin D, Kelvin N, Rubino S. Similarity in Case Fatality Rates (CFR) of COVID-19/SARS-COV-2 in Italy and China. The Journal of Infection in Developing Countries. 2020;14: 125–128. pmid:32146445
  9. 9. Fontanet A, Tondeur L, Madec Y, Grant R, Besombes C, Jolly N, et al. Cluster of COVID-19 in northern France: A retrospective closed cohort study. medRxiv:2020.04.18.20071134 [Preprint]. 2020 [cited 2020 May 15].
  10. 10. Lavezzo E, Franchin E, Ciavarella C, Cuomo-Dannenburg G, Barzon L, Del Vecchio C, et al. Suppression of a SARS-CoV-2 outbreak in the Italian municipality of Vo’. Nature. 2020;584: 425–429. pmid:32604404
  11. 11. Mizumoto K, Kagaya K, Zarebski A, Chowell G. Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Eurosurveillance. 2020;25: 2000180.
  12. 12. European Centre for Disease Prevention and Control (ECDC). Novel coronavirus disease 2019 (COVID-19) pandemic: increased transmission in the EU/EEA and the UK—sixth update—12 March 2020. Stockholm: ECDC; 2020. Available from:
  13. 13. Emery JC, Russell TW, Liu Y, Hellewell J, Pearson CA, CMMID COVID-19 Working Group, et al. The contribution of asymptomatic SARS-CoV-2 infections to transmission on the Diamond Princess cruise ship. eLife. 2020;9: e58699. pmid:32831176
  14. 14. Zhang J, Litvinova M, Liang Y, Wang Y, Wang W, Zhao S, et al. Changes in contact patterns shape the dynamics of the COVID-19 outbreak in China. Science. 2020; eabb8001. pmid:32350060
  15. 15. Ministerio de Sanidad, Consumo y Bienestar Social, Instituto de Salud Carlos III. Estudio ENE-COVID19: Primera Ronda del Estudio Nacional de Sero-Epidemiología de la infección por Sars-CoV-2 en España. Informe Preliminar 13 De Mayo De 2020. 2020 [Cited 2020 May 30]. Available from:
  16. 16. Verity R, Okell LC, Dorigatti I, Winskill P, Whittaker C, Imai N, et al. Estimates of the severity of coronavirus disease 2019: a model-based analysis. The Lancet Infectious Diseases. 2020;20: 669–677. pmid:32240634
  17. 17. Ferguson N, Laydon D, Nedjati-Gilani G, Imai N, Ainslie K, Baguelin M, et al. Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. Imperial College London. 2020 [cited 2020 April 7].
  18. 18. European Centre for Disease Prevention and Control (ECDC). Coronavirus disease 2019 (COVID-19) in the EU/EEA and the UK—tenth update—11 June 2020. Stockholm: ECDC; 2020. Available from:
  19. 19. Google COVID-19 Community Mobility Reports. 2020 [Cited 2020 Jul 02]. [Internet]. Available at:
  20. 20. Apple Mobility Trends Reports. 2020 [Cited 2020 Jul 02]. [Internet]. Available at:
  21. 21. Grasselli G, Zangrillo A, Zanella A, Antonelli M, Cabrini L, Castelli A, et al. Baseline Characteristics and Outcomes of 1591 Patients Infected With SARS-CoV-2 Admitted to ICUs of the Lombardy Region, Italy. JAMA. 2020;323: 1574–1581. pmid:32250385
  22. 22. School of Geography and Environmental Science, University of Southampton; Department of Geography and Geosciences, University of Louisville; Departement de Geographie, Universite de Namur) and Center for International Earth Science Information Network (CIESIN), Columbia University. WorldPop ( 2018. Global High Resolution Population Denominators Project—Funded by The Bill and Melinda Gates Foundation (OPP1134076).
  23. 23. Prem K, Cook AR, Jit M. Projecting social contact matrices in 152 countries using contact surveys and demographic data. PLOS Computational Biology. 2017;13: e1005697. pmid:28898249
  24. 24. Hilton J, Keeling MJ. Estimation of country-level basic reproductive ratios for novel Coronavirus (SARS-CoV-2/COVID-19) using synthetic contact matrices. PLOS Computational Biology. 2020;16: e1008031. pmid:32614817
  25. 25. Simini F, González MC, Maritan A, Barabási A-L. A universal model for mobility and migration patterns. Nature. 2012;484: 96–100. pmid:22367540
  26. 26. Masucci AP, Serras J, Johansson A, Batty M. Gravity versus radiation models: On the importance of scale and heterogeneity in commuting flows. Phys Rev E. 2013;88: 022812. pmid:24032888
  27. 27. Marguta R, Parisi A. Impact of human mobility on the periodicities and mechanisms underlying measles dynamics. Journal of The Royal Society Interface. 2015;12: 20141317. pmid:25673302
  28. 28. Marguta R, Parisi A. Periodicity, synchronization and persistence in pre-vaccination measles. Journal of The Royal Society Interface. 2016;13: 20160258. pmid:27278363
  29. 29. Tizzoni M, Bajardi P, Decuyper A, King GKK, Schneider CM, Blondel V, et al. On the Use of Human Mobility Proxies for Modeling Epidemics. PLOS Comput Biol. 2014;10: e1003716. pmid:25010676
  30. 30. Sørensen T. A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons. Biol Skr. 1948;5: 1–34.
  31. 31. Lenormand M, Huet S, Gargiulo F, Deffuant G. A Universal Model of Commuting Networks. PLOS ONE. 2012;7: e45985. pmid:23049691
  32. 32. Yang Y, Herrera C, Eagle N, González MC. Limits of Predictability in Commuting Flows in the Absence of Data for Calibration. Scientific Reports. 2014;4: 5662. pmid:25012599
  33. 33. Secreteria General de Sanidad, Direccíon General de Salud Publica, Calidad e Innovacion. Centro de Coordinación de Alertas y Emergencias Sanitarias. Actualización n° 108. Enfermedad por el coronavirus (COVID-19) 17.05.2020. 2020 [cited 2020 May 28]. Available from:
  34. 34. Gutmann MU, Corander J. Bayesian Optimization for Likelihood-Free Inference of Simulator-Based Statistical Models. Journal of Machine Learning Research. 2016;17 1–47.
  35. 35. Järvenpää M, Gutmann MU, Pleska A, Vehtari A, Marttinen P. Efficient Acquisition Rules for Model-Based Approximate Bayesian Computation. Bayesian Anal. 2019;14: 595–622.
  36. 36. Toni T, Welch D, Strelkowa N, Ipsen A, Stumpf MPH. Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. Journal of The Royal Society Interface. 2009;6: 187–202. pmid:19205079
  37. 37. du Plessis L, McCrone JT, Zarebski AE, Hill V, Ruis C, Gutierrez B, et al. Establishment and lineage dynamics of the SARS-CoV-2 epidemic in the UK. medRxiv:2020.10.23.20218446 [Preprint]. 2020 [cited 2021 March 24].
  38. 38. Viner RM, Mytton OT, Bonell C, Melendez-Torres GJ, Ward J, Hudson L, et al. Susceptibility to SARS-CoV-2 Infection Among Children and Adolescents Compared With Adults: A Systematic Review and Meta-analysis. JAMA Pediatr. 2021;175: 143. pmid:32975552
  39. 39. Wearing HJ, Rohani P, Keeling MJ. Appropriate Models for the Management of Infectious Diseases. PLOS Medicine. 2005;2: e174. pmid:16013892
  40. 40. Pellis L, Scarabel F, Stage HB, Overton CE, Chappell LHK, Lythgoe KA, et al. Challenges in control of COVID-19: short doubling times and long delay to effect of interventions. medRxiv:2020.04.12.20059972 [Preprint]. 2020 [cited 2021 May 2].