Using mobility data in the design of optimal lockdown strategies for the COVID-19 pandemic

A mathematical model for the COVID-19 pandemic spread, which integrates age-structured Susceptible-Exposed-Infected-Recovered-Deceased dynamics with real mobile phone data accounting for the population mobility, is presented. The dynamical model adjustment is performed via Approximate Bayesian Computation. Optimal lockdown and exit strategies are determined based on nonlinear model predictive control, constrained to public-health and socio-economic factors. Through an extensive computational validation of the methodology, it is shown that it is possible to compute robust exit strategies with realistic reduced mobility values to inform public policy making, and we exemplify the applicability of the methodology using datasets from England and France.


Introduction
The COVID-19 pandemic has put quantitative decision-making methods (and the lack thereof) in the spotlight.Designing informed non-pharmaceutical intervention strategies (NPIS) to mitigate the pandemic effects has been a controversial issue worldwide.In particular, the planning of effective lockdown policies and their posterior lifting based on real-time data still remains a largely open problem.A vast amount of research efforts has been dedicated to model the COVID-19 pandemic focusing on the various aspects of the system dynamics, such as estimating the value of the basic reproductive number [17,21], evaluating the effect of containment measures and travel restrictions [32,15,21,26,12], assessing the effect of age on the transmission and severity of the disease [13,17] and estimating the impact on Health Services [18].It is remarkably hard, if not impossible, to capture every aspect of this complex phenomenon in an integrated and computationally tractable mathematical model.With this in mind, our goal in the present work is to study the dynamics of COVID-19 spread by integrating a dynamical epidemiological model with mobile phone data from the population.Such an adjusted model yields an accurate account of the real displacement of the population between different locations (e.g.workplaces, schools, etc.) during the pandemic, and serves as the basis for determining lockdown and exit strategies which are optimised according to public health and socio-economic constraints.
Related Literature.Previous works have modelled the impact of NPIS using extensions of the classical Susceptible-Infected-Removed (SIR) model [19] with the inclusion of compartments corresponding to asymptomatic population which are either exposed to infections but not yet infectious or infected and infectious [28].Given the current knowledge of the COVID-19 disease, this constitutes a complete description of the possible states.Among the works based on the aforementioned state space representation, [26], one of the main inspirations behind our work, uses an age-structured model to quantify the effect of control measures imposed in Wuhan, China and concludes that there exists a large potential on the use of NPIS for mitigating the COVID-19 pandemic.Furthermore, the authors recommend a gradual relaxation strategy in comparison to an early lifting of the imposed lockdown measures to avoid possible second and third waves of the pandemic.Using a stochastic modification of a compartmental model without age structure, [21] reaches similar conclusions and quantifies the effectiveness of lockdown measures by estimating the reproduction number, which decreased from a median value of 2.35 before travel restrictions were imposed, to 1.05 one week after the implementation of travel restrictions.In the context of resorting to optimal control methods to determine a lockdown policy, [27] compares a switching on-off strategy with a two-stage release from quarantine (with part of the population released first, and the others later).The authors consider a threshold-based sanitary cost functional aiming at releasing the largest possible population without exceeding the availability of hospital beds, and their conclusion is to favour the second strategy.[11] consider multiple control levers, such as the number of tests (both virologic and anti-body tests) and the increase of ICU beds in addition to the reduction of social contacts, and uses a cost functional involving both economic and sanitary costs.The authors suggest an optimal lockdown policy which involves a quick and strong isolation, followed by a large increase in the number of tests.
Our contribution.While the qualitative modelling and control concepts of the aforementioned works are aligned with the epidemiological literature, we remark that most of these papers use parameter values collected from previous works or estimated using either different models or from clinical knowledge.Hence, they cannot be directly applied to populations with different spatio-temporal patterns.In contrast, the main goal of this paper is to propose a framework which calibrates the model using epidemiological and real-time mobility data from a specific population, measured by Google Mobility through Android devices, and computes an optimal lockdown strategy for that population at any point of the pandemic.To achieve this, we combine parameter estimation for an epidemiological model with a subsequent optimal control step.Secondly, our framework includes the computation of the optimal lockdown policy in a nonlinear model predictive control framework, leading to a robust feedback protocol which allows not only real-time adaptation of the release strategy but also a partial lockdown, as opposed to a switching on/off strategy which can be too restrictive.Methodological summary.Our epidemiological model simulates the transmission dynamics of COVID-19 spread in the English and French populations using anonymised data on the reduction of the population mobility collected through smartphones and released by Google.Further, exploiting Approximate Bayesian Computation (ABC), we calibrate this model using data on daily deaths and the number of people in hospitals with COVID-19 released from public health authorities (Public Health England (PHE) and the National Health Service (NHS) in England, and Santé publique France (SpF) in France) [23,24,34,35].In ABC, we assume a prior distribution for each parameter value and, given a dataset with an inherent observation noise, we obtain a non-parametric estimate of the joint probability distribution of the parameter values.This allows us to estimate parameters of our model for the specific cases of England and France, as opposed to inheriting parameters from epidemiological models from other countries.Some key attributes of our model are the inclusion of age-dependent transition probabilities between the different compartments which are also estimated from data, as well as age-dependent social distancing, and the use of Google mobility data to quantify the effect of social distancing measures in reality.Having calibrated our model, we design a lockdown strategy which is differentiated according to social contact categories including schools, work, and others.This allows us to assign a different economic penalty for each one of them.Borrowing a leaf from optimal control theory, we synthesize an optimised lockdown and exit strategy which minimizes the number of COVID-19-related casualties in the population, but also takes into account economic constraints.In order to perform this task, we quantify the relation between the decrease in social contacts with the reduction in the population mobility and optimise with respect to the latter, which is an effectively measurable quantity compared to an abstract decrease in social contacts.A methodological summary is depicted in Figure 1, illustrating the interaction among the different building blocks of our approach.It is important to note that the proposed methodology transcends the design of lockdown strategies for the COVID-19 pandemic, and can be applied for more general epidemiological models, different datasets, and a variety of control objectives.What is fundamental in our approach is the existence of a dynamical model, the assimilation of data for the optimal estimation of model parameters and uncertainties, and the optimization of an external input action to control the system towards a desired state.

Epidemiological model using Google mobility
Our epidemiological model is developed in order to exploit data on change of mobility (in our case, provided by Google) and information on the social contacts patterns at different types of locations, such as schools or workplaces (eg.estimated by the BBC Pandemic project [20] for the UK and the POLYMOD study [25] for 152 countries in the world).In this manuscript, we will focus on explaining our model and the methodology will be illustrated for both England and France; however, we stress that the methodology can be extended to consider any other country, and, in fact, other epidemiological models, where suitable data is available.In this Section, we first explain the fundamental dynamic properties considered in our model and then explain how these two data sources are used in our compartmental model.

Model dynamics
Assuming a well-mixed population (namely, each person has the same probability of interacting with any other person in the population), we consider a compartmental model [8], splitting the population in different compartments representing different states of the infection.Although this is extremely simplifying, compartmental models under this assumption are widely used to describe the dynamics of epidemics over large populations.Our model considers the following compartments: • Susceptible (S), meaning people who did not have any contact with the infection, • Exposed (E) to the infection, but not yet infectious, • Infected SubClinical (I SC , split in I SC1 and I SC2 ), not needing medical attention, • Infected Clinical (I C , split in I C1 and I C2 ), needing medical attention, • Recovered (R), which we assume are resistant to a new infection, • Deceased (D).
As strong evidence towards the age-dependent severity of COVID-19 has been observed in previous research works [17,13], we consider age-stratification of all of the states along 5 age groups: 0-19, 20-39, 40-59, 60-79, 80+, hence we will use the notation E i to denote the Exposed population in the i-th age group, and similarly for the other states.The model will assume that all the age groups are susceptible to the infection in the same way, but that the severity is strongly dependent on the age of the patient through age-dependent probabilities of necessity of hospitalization (ρ i ) and death if hospitalized (ρ i ) for the i-the age group.Figure 1: Flow diagram of the data-driven approach for the synthesis of optimal lockdown policies.The initial step consists of a policy maker defining a performance measure based on sanitary and economic objectives, and a modeller selecting a consistent generic epidemiological model.Then, public healthcare/mobility data is used in conjunction with Approximate Bayesian Computation to calibrate the dynamical model and determine the degree of uncertainty in the model parameters.This assists the formulation of an optimal control problem where the original sanitary + economic performance measure is optimized constrained to the calibrated epidemiological model.An optimal lockdown policy is then computed via global optimization techniques, and the final output is an optimal lockdown policy.The optimised lockdown is then applied and its real-time effects can be sensed through public data, which fed back into the learning+optimization framework for re-computation and update.
Another key assumption of our model is that when a patient is hospitalized and diagnosed, they are isolated and therefore not able to spread the infection.To reflect this scenario, we assume that from the exposed state and after some incubation period, all patients will become sub-clinical I SC , in which state they are infectious.Afterwards, some of them will recover (R) and others will need clinical help (I C ); we model this by splitting I SC into two categories: the ones recovering straightaway (I SC2 ) and the ones in need of clinical care (I SC1 ).The split happens with an age-dependent probability ρ i .After some time, people in I SC1 will go to hospital, therefore moving to the I C state; similarly as before, the latter state is split in two categories according to the final outcome: the ones in I C1 will decease (D) after some time, while the ones in I SC2 will recover (R).This split is again described by an age-dependent probability, which we denote as ρ i .A visualization of the dynamics is given in Fig 2 .Figure 2: Graphical representation of the model, for each age group.The green color represents a compartment that is observed independently for each age group, while blue represents a compartment whose sum across age groups is observed.

S E
Mathematically, this can be described using the following system of ODEs: where and C is the contact matrix representing the frequency of contacts between different age groups [20], where each element C ij represents the average daily number of contacts a person in age group i has with people in age group j.To simulate from the model, the ODEs (Equations ( 1)-( 8)) are integrated using a 4th-order Runge Kutta integrator, with a timestep dt = 0.1 days; the dynamics is started on the 1st of March.

Influence of mobility on the dynamics
The contact matrices are a crucial component defining our model dynamics, and were estimated in the POLYMOD study [25] for a large set of countries (among which England and France).Specifically for the contact matrices for England, the findings of the more recent BBC Pandemic project [20] were integrated using the procedure described in [20].Note that [20] provided contact matrices for the whole UK, but we assume the ones for England are well represented by those.Finally, the age groups considered in these studies (namely, 5 year bands) are finer than the ones we consider in the present work; we therefore aggregate the data to make contact matrices suit our needs.The (i, j)−th entry of this contact matrices at different locations (eg.home, workplace, school and other locations) represent the amount of daily contacts an individual in age group i has with individuals from age group j in different settings (see Fig 3).Before the lockdown, the total contact matrix is simply the sum of the contributions of these different locations: Figure 3: Contact matrices at different locations in the UK for the age groups used in the present study (0-19, 20-39, 40-59, 60-79, 80+); these are obtained by aggregating and combining the contact matrices for 5-year bands provided by [20] and [25].
However, the introduction of lockdown measures lead to considerable change to people's social activity and mobility; we model this by introducing a set of multipliers (for each age group and for each of the locations) which will represent the change in the number of social contacts: where α i , for ∈ {school, other, work}, represents the change of social contacts for age group i in category .These multipliers are a function of time and not easily accessible.Instead, it is rather easy to measure the reduction of people's mobility towards the different locations; we choose then to express the α's as a function of the mobility values provided by Google, as explained in the next paragraph.
Mobility data is collected by Google to reflect the reduction of the population mobility during lockdown for each country, by following the movements of Android phones; anonymised data is publicly available [3].Similar datasets can be retrieved directly from other sources, such as mobile phone companies [7].This dataset captures mobility towards the following locations: "residential", "workplaces", "parks", "retail and recreations", "transit stations" and "grocery and pharmacy", which we denote respectively as m residential , m work , m parks , m retail , m transit , m grocery ; the changes in mobility are reported with respect to the baseline values prior to introduction of lockdown measures.As can be seen in the left panel of Fig 4, a strong weekly periodicity is present in this data; we therefore use a Savitzky-Golay filter [29] to remove it.
We moreover combined the mobility values m parks , m retail , m transit , m grocery in order to obtain an aggregated value for the reduction of mobility towards "other locations": m other = 0.1 • m parks + 0.3 • m retail + 0.3 • m transit + 0.3 • m grocery .Even though the numerical values of the weights are arbitrary, this choice is motivated by our observation that the value of the different contributions of the mobility data is quite similar; we also attribute a smaller value to "parks" as people get less in strict contact with each other there with respect to "retail", "transit" or "grocery" locations.
As no data with regards to schools were provided, we fixed the value of m school to be 0.1 from the day schools and universities were closed except for children of essential workers (23rd of March [38] for England, 16th March for France [37]).The mobility data obtained after the aggregation and smoothing operations described above is presented in the right panel of   4: Raw and elaborated mobility data in the UK.In the raw mobility data, which is scaled with respect to a baseline value representing average mobility in the months prior to the pandemics, a strong weekly seasonality is present, which we mostly removed using a Savitzky-Golay filter.Moreover, we note that the mobility towards "residential" (which is not used in our analysis) locations is larger during the lockdown months than before, as people spend more time in their homes.Finally, note the very large increase in people's mobility towards parks around the end of the winter season.That contribution is however only one of the components in our aggregated m other mobility value, so the latter does not increase that abruptly.In order to connect the reduction in mobility to the reduction in the number of contacts, we proceed in the following way: first we assume that the number of residential contacts stays constant; in fact, we expect the behavior of people at home not to differ too much with respect to what it was prior to the introduction of lockdown measures; for our model, this amounts to fixing α home i (t) = 1, ∀t, ∀i.With regards to the remaining contributions to the total number of contacts, we expect data on mobility reduction to be representative of the subset of the population which mostly uses smartphones, which is likely to be younger than 60 years old.Moreover, a large part of the population older than 60 years old does not take part in work or school activities.Motivated by these arguments, we define the values of α school i , α other i and α work i separately for population below and above 60.Specifically, we assume the following for age groups below 60 years old (age group index 1,2,3): where we made the dependence on time explicit in order to highlight that α 123 ∈ [0, 1] is a time-independent scalar, which is an additional parameter in our model; this amounts to assuming that the reduction in the number of contacts is due to a combination of reduced mobility and increased awareness of people, for instance by maintaining social distancing.For people above 60 years old (age group index 4 and 5) we assume instead that the reduction in contacts stays constant since the introduction of lockdown measures and that such reduction is equally distributed across the different categories (as the contacts for work and school will be relatively few): where α 4 , α 5 ∈ [0, 1] are time-independent scalars, which are parameters in our model as well.This latter assumption is motivated by the fact that such part of the population is more susceptible to the disease, so that the official advice will be for them to be as isolated as possible throughout the epidemics.We initialise our implementation of the model dynamics on the 1st of March and we fix the contact matrix to be the standard one relative to the country until the introduction of lockdown measures (which we assume to be on the 18th March, i.e. two days after the UK government advised people to self-isolate [2] and one day after the French government banned all except essential journeys [36]); from that day onward, we use the contact matrix obtained from Eq. ( 9), by fixing α home = 1 and obtaining the values for the other α's by Eqs.(10) and (11).

Intitialization and model parameters
At the beginning of the dynamics, most of the population is in the S state, except for a small number of individuals which seed the infection.We therefore assume that some people were already infected on the 1st of March and we denote that number as N in ; this number is split across the different categories and age groups in the following way: • First, the total number of infected population is spread across the age groups with the following rates (from youngest to oldest): 0.1, 0.4, 0.35, 0.1, 0.05; these values come from the assumption that the disease was brought to the country from abroad, and we took that as an estimate of the age distribution of international travellers (for UK, a dataset describing age distribution of flight passengers is available and approximately equal to the provided values [1]).
• Then the number of infected individuals in each age group is split in the E, I SC1 and I SC2 compartments in the following way: The other compartments are initialized to 0, except for S, which is initialized to the total population in the corresponding age group obtained by the most recent country-specific census, from which the number of people seeding the infection at the start of the dynamics is subtracted.
Parameters which define the dynamics of our model and need to be calibrated are the following: • β: probability of a contact between an S and I SC individual resulting in the S individual catching the infection.
• κ = 1/d L : transition rate of an Exposed individual becoming Infected SubClinical, with d L the average number of days in the E state.
• γ C = 1/d C transition rate of going from I SC1 to I C , with d C the average number of days it takes to undergo this transition.
• γ R = 1/d R recovery rate from I SC2 , with d R the average number of days it takes to recover.
the average number of days it takes to recover.
• ν = 1/d D death rate from I C1 , with d D the average number of days before death occurs after entering the I C1 state.
• ρ i 's: age dependent probabilities of going to I C instead of directly recovering from the I SC state.
• ρ i 's: age dependent probabilities of death after being hospitalized.
• N in : total number of individuals who carried the infection at the start of the training period (1st of March).
• α 4 : constant value of reduction in social contacts for people in age group 4, after the beginning of the lockdown period.
• α 5 : constant value of reduction in social contacts for people in age group 5, after the beginning of the lockdown period.
• α 123 : coefficient of proportionality between reduction of social contacts and reduction of mobility for age groups 1,2,3.
These parameters will be estimated using Approximate Bayesian Computation (ABC), which provides a posterior distribution for them -the details of the ABC methodology and associated results can be found in Appendix A and Appendix B correspondingly.The results below are integrated over this posterior distribution, which allows us to design robust controls and quantify the underlying uncertainty in our results.

Optimised mobility values based on uncertainty
We now adopt the viewpoint of a policy maker whose task is to determine mobility restrictions on a population in order to slow down the spread of the COVID-19 epidemics, while still keeping the economic costs of lockdown as low as possible.We therefore formulate the problem in an optimal control setting.In this context, we will minimise a cost functional which includes penalties on the number of COVID-19 related deaths, hospital beds occupancy, and the economic cost of different types of lockdown.
The control variables in our problem are the reductions of the mobility values m school , m work , m other ("Mobility data").These are related to the coefficients in the contact matrices α school , α work , α other via the inferred values α 123 , α 4 and α 5 in ( 10) and ( 11), relating our control policy to measurable quantities.We recall here that the mobility values m , for ∈ {school, other, work} only control the change of contacts for age groups 1,2,3 (below 60 years old), while the change of contacts for age groups 4 and 5 (above 60) is instead represented by the parameters α 4 , α 5 , which we inferred from data.Our optimisation framework therefore assumes that the reduction in social contacts for age groups 4 and 5 stays fixed to the inferred value throughout the optimisation horizon, and we optimise only on the change of mobility referred to younger age groups.This is reasonable as the older age groups constitute a minor part of the workforce and are the extremely vulnerable to the disease.Therefore, we expect that the official advice for them will be to remain with stricter isolation rules than the rest of the (working) population.
We will take advantage of the ABC inferential framework both to quantify the uncertainty of the parameters of our model and to develop a lockdown strategy that is robust to uncertainties.To this end, we proceed as follows, where the details of each step will be given throughout this section.
1. Uncertainty quantification: Perform inference on the model parameters using data from the public health authorities and Google mobility to obtain a posterior distribution of the parameter values given the dataset.

2.
Posterior loss based cost functional: Define a cost functional that takes into account the economic cost of closing venues / reducing mobility to different locations and the sanitary cost of increased infection.
3. Nonlinear model predictive control: Optimize, over a fixed time frame, a lockdown strategy by minimizing the sanitary/economic cost functional, constrained to the inferred epidemiological dynamics.The optimisation is based on the integrated posterior distribution: this involves solving the epidemiological model forward using various sets of parameters values, sampled from the posterior distribution, and computing the expectation of the cost functional with respect to this distribution.This optimization step determines an optimal policy that is applied for a reduced amount of time, after which the model is updated and the optimal policy recomputed.
A diagram synthesizing this data-driven optimal control approach is presented in Figure 5.The applicability of this methodology goes beyond the design of control strategies for the COVID-19 pandemic, and can be applied to different dynamics and cost functionals.We remark again that even though this procedure is computationally costly, it pays off by offering a control strategy that is robust to a number of possible (and highly likely) scenarios, by taking into account the uncertainty on parameter estimates [6].
Figure 5: Flow diagram of the data-driven optimal control approach.Starting from a generic-type SEIRD model, we learn optimal model parameters based on mobility/healthcare datasets and Approximate Bayesian Computation.The output is a posterior distribution of model parameters, which is used to generate calibrated SEIRD dynamics and a cost functional accounting both for sanitary and economic costs of a lockdkown.These two ingredients determine the formulation of an optimal control problem, which is solved by means of a global optimization algorithm.The final output of our approach is an optimal lockdown policy which can be recalibrated as new data is fed into the system.

Uncertainty quantification
We use approximate Bayesian computation (ABC) [22] to calibrate the parameters of our model, by using the datasets reporting on the number of hospitalized and deceased patients released by the public health authorities: • The daily number of deaths in hospitals attributed to COVID-19 (per age group) • The daily number of hospitalized people with COVID-19 related diseases We calibrate our model on data from the 1st of March up to different ending times t obs (eg.31st of August).ABC is suitable for the considered task as it relies only on simulations from the model, and works by looking for values of parameters such that the integrated dynamics is close to the observed one.In this way, it provides the user with samples from an approximate posterior distribution of the parameters given observed data (π ABC (θ|x obs )).The better the match between the simulation and the observation, the better is the approximation of the true posterior π(θ|x obs ).However, this comes at higher computational cost, so that the level of approximation needs to be balanced with computational considerations.We discuss additional details regarding the use of ABC to calibrate our model in Section A.2 of Appendix A. Moreover, ABC allows us to fix a prior uncertainty on the values of the parameters (defined by a prior distribution π(θ)), and to quantify prediction uncertainty through the approximate posterior π ABC (θ|x obs ), which we will use in designing the optimal control task next.
We remark that the ABC algorithm which we employ (discussed in Section A.2 of Appendix A) provides us with a set of samples from the posterior distribution associated with importance weights.We can obtain independent and identically distributed (i.i.d.) samples from it by using bootstrap; namely, we choose (with replacement) from the set of ABC samples with probability proportional to the importance weight itself.The new set of samples generated in this way can be considered as i.i.d.from the posterior distribution of the parameters given data.

Posterior loss based cost functional
In this step we identify the cost functional we will optimize to determine an optimal lockdown strategy.We consider a cost functional which combines the economic cost of lockdown with the sanitary cost of lifting restrictions.We focus on the representation of these terms and on the inclusion of the posterior distribution of the model parameters in its modelling.Determining the relevance of sanitary versus economic costs is a task left to the policy maker.However, when a quantitative choice has been made, our methodology allows to test the effect of such a choice and to evaluate the stability of the optimal strategy.
We denote by t 0 the start of the optimisation interval, corresponding to the day we wish to start the new lockdown policy (in our first example, the 24th of May), and by T h the length of the interval in days, for which we want to apply the lockdown strategy, also known as optimisation horizon.To study the cost of lifting restrictions, we penalise the predicted number of deaths during the optimisation horizon [t 0 , t 0 + T h ].This is given by i where ∆D i (t) is the daily increase in the number of deceased in the age group i.Furthermore, we need to guarantee that the number of infected individuals who need hospitalisation, i , remains below the overall hospitals capacity H max .This could be included as a hard state constraint, here instead we penalise the event in which I C surpasses the total capacity by including a term of the form We point out that this term will not be activated if the levels of infected people who need hospitalisation remain well below H max , which here we take to be H max = 10000.For example, the number of people in hospital in England with COVID-19 related symptoms at the end of our first training window (23rd of May) was 7106, while at the height of the peak of the epidemic in England (on the 12th of April) this number was 17933.For France, numbers reach higher values; however, we keep the same value of H max = 10000, in order to penalize a large number of infected people in the same way across the two countries, and to have comparable values in the optimization strategy.As a final measure of the sanitary cost, we introduce a final time cost, where we penalise the basic reproduction number (R) at the end of the optimisation horizon R(t 0 +T h ) (details about R number for our model can be found in Section A.1 of Appendix A).This terminal penalty ensures that the control strategy does not simply output an optimal solution which switches off the reduction on mobility towards the end of the optimisation horizon.While such a solution is consistent with the optimal control design and is an interesting instance of the turnpike phenomenon [30], it is not suitable in our context.Finally, to account for the economic cost of lockdown, we penalise the mobility reduction by introducing a quadratic cost of the form 1 − m (t) 2 , for ∈ {school, other, work}.As mentioned before, the forward model is run for different sets of parameters drawn from the posterior distribution obtained via ABC inference.For this reason, the ∆D i (t) and I C i (t) variables appearing in the cost functional depend on the chosen value of the parameters of the model.We hide the explicit dependence for notational convenience.The cost is computed by taking an expectation over the posterior distribution for the parameters, since each parameter value will lead to a different realisation of the dynamics.Due to the nonlinearity of the system, this is clearly not the same as minimizing the objective using the posterior mean of the parameters only.In practice, we use 50 i.i.d.samples from the posterior distribution and approximate the expectation with an average over the trajectories obtained with those parameter values.Minimizing an expected cost in this way is much more computationally expensive than a cost computed on a point estimate of the parameters (for instance the posterior mean).However, it provides a way to take into account the uncertainty in the parameters while producing more robust results.
Collecting the different terms in our cost, we optimise min through the control signal where , for ∈ {school, other, work}, represents the relative cost of limiting the mobility to schools, workplaces and other locations with respect to the sanitary cost, and where the expectation is taken over the posterior distribution of the parameters of the model.The choice of the values for affects the optimal policy by attributing a larger economic or social cost of closing one of the categories with respect to the others.Determining adequate weights for these costs is the ultimate task of the policy maker.

Nonlinear Model Predictive Control
To close our optimal control formulation we add specifications to the controls we expect to obtain, restricting the space of admissible signals.As the values of α are between 0 and 1, a reasonable assumption is to expect the m to be in the interval [0, 1] as well, for ∈ {school, work, others}.However, due to the fact that we cannot impose a 100% closure of all settings, we set a lower bound for m to be the lowest value of each m observed during the lockdown period.This results in the constraints m work ∈ [0.31, 1], m school ∈ [0.1, 1], and As can be seen from equation ( 13), the controls m are time-dependent, and they are computed by minimising the cost functional (13) subject to the state constraints (1)-( 8).Ideally, the numerical realization of the optimal control strategy would be driven by the calculation of first-order optimality conditions and a reduced gradient approach to minimise J (m).However, the nonlinearities in the dynamics and in the terminal penalty, where the reproduction rate is expressed as an eigenvalue of a parameter-dependent matrix, make our problem highly non-convex.Moreover, the penalty Φ(I C ) is non-differentiable.For the purposes of this paper we will compute the optimal control by using generalized simulated annealing (or dual annealing) [33] as implemented in the scipy [31] Python library.The use of meta-heuristics for the solution of large-scale nonlinear optimal control problems has been assessed in [5,9].We embed the solution of the optimal control problem (13) in a nonlinear model predictive control (NMPC) framework [16].To this end, we select a prediction horizon T opt , and optimise the control variables m (t), t ∈ [t 0 , t 0 + T opt ] using the current state at t 0 as our initial condition.From the optimal control sequence we recover the optimal action for a single day, that is m (t), t ∈ [t 0 , t 0 + 1] and evolve the dynamics for the same amount of time, and repeat the optimisation procedure in the updated time frame [t 0 + 1, t 0 + 1 + T opt ].This process is then repeated until the complete optimisation interval [t 0 , t 0 + T h ] is covered.The NMPC methodology recovers a robust optimal control law in feedback form that can be adjusted to account for disturbances in the control loop.Therefore, instead of using the current state predicted by the model as initial condition, we can update this to be the current state of the population in England estimated from data, every time new data becomes available.This ensures that the control methodology accounts for noisy observations, or for unexpected variations in the data.

Results and Discussion
In this section, we apply our methodology to each of the datasets specified in the previous sections.In the first half of the section, as a proof of concept, we show the importance of the various terms included in the cost functional and their influence on the results, and this is done for the England dataset with parameter values calibrated with data between the 1st of March and the 23rd of May, with lockdown strategies applied for 90 or 120 days starting on the 24th of May.In the second half, we apply our methodology to two populations: England and France, with the models calibrated up to the 31st of August and lockdown strategies applied from the 1st of September.The posterior distributions of the parameters are available in Appendix B. This section is organised as follows: 1. Choice of an appropriate prediction horizon, T opt .
2. Influence of the relative weight of the economic and sanitary costs (i.e., how large to choose ).
3. Influence of the relative costs between reducing mobility to different locations (i.e., the relative weights of school , work and other ).
4. Dynamic update of the control strategy as we recalibrate the model.

Application of our methodology to England and France.
Steps 1-3 can inform a policy maker on their decision of how to weight each term in the cost functional, but we remark that this is ultimately their decision.In steps 4 and 5 we fix the parameters explored in 1-3 and test the methodology in specific cases.
Before presenting the optimal lockdown results, we exemplify our inference results for England with plots of the variables that are relevant for the control methodology -for the complete inference results, we refer the reader to Appendix B. In Fig. 6, we plot the predicted number of people in the I C category -the red line is the median prediction and the shaded area denotes a confidence interval with 99% credibility -compared to the true data in green in the left panel.The middle panel reports on the same results for the number of deceased people, and the right panel reports the predicted basic reproduction number, R(t).

Proof of concept of the methodology
We now proceed to present the results of our methodology as described above on data for England observed until 23rd May.Each of the figures below is either of two types: 1) Lockdown strategy and its effect on the number of hospitalized individuals, and 2) Influence of the lockdown strategies on the reproduction number R(t).In the first case, we plot the number of hospitalised people (full red line) for each control strategy -these results are associated with the red axis, on the left of the figure -and the value of the optimized mobility values m in the three blue lines, associated with the blue values on the right axis.The full, dashed, and dash-dotted lines correspond to work, school and other settings, respectively.

Median prediction Tobs
Dependence on T opt : A crucial issue in the NMPC framework is the selection of the prediction horizon T opt .For small prediction horizons, the optimal action tends to be instantaneous and loses its capability to foresee long-term consequences of the policy.On the other hand, a sufficiently long prediction horizon will enforce a stabilizing control law, but its numerical realization becomes increasingly complex.Therefore, at the core of the selection of a suitable prediction horizon there is a trade-off between short-sightedness, stabilization capabilities of the policy, and computability.This is exemplified in Fig. 7, where we illustrate the role that is played by the prediction horizon in the performance of the control loop for a fixed value of .It can be observed that a short-sighted policy, with a prediction horizon of 20 days (left panel), is less stringent in the mobility reduction, causing a larger number of hospitalized people in the long term and a large uncertainty in the end result.However, it is clear that the lockdown strategies are similar for both prediction horizons of T opt = 30 (middle panel) and T opt = 40 (right panel): here, we observe a smaller overall number of hospitalized people over time, while the uncertainty on the results remains bounded, with slightly smaller credibility intervals in the second case.For this reason, we will from now on use the prediction horizon of T opt = 30, which is long enough to avoid a short-sighted strategy, while being short enough to be computationally cheaper.We point out that this is correctly aligned with the COVID-19 time scale for transmission, and a reasonable time frame for a policy maker.
Influence of the relative weights of the control penalties ( s , w , o ): The choice of suitable control penalties , ∈ {school, work, other} is a sensitive issue in any optimal control problem, and we now proceed to study its effect on the resulting lockdown strategies.In this context, there are two important properties to explore: the relative weight between the sanitary and economic costs of our lockdown strategy, which is represented by how large the values of each of s , w , o are, and the relative importance between each of the s , w , o .We observe that the predicted number of daily deceased at the end of the training interval is between 200 and 400 (see Fig. 6).Having in mind that the control variables are constrained to [0, 1], we conclude that each should be in the order of 100, so that the control strategies are sensitive to both the sanitary and economic costs.
In our next example, we assume that the economic cost of opening each type of location is the same (i.e. s = w = o ) and analyse the effect of varying their relative weight to that of the sanitary cost.This is shown in Fig. 8, where we present the lockdown strategies and corresponding values of hospitalized people for = 100 (left) and = 200 (right).We observe that higher values of result in the strategies which open workplaces earlier, as keeping them closed is more expensive, but also result in higher uncertainty.Interestingly, however, we observe that both control strategies keep m other constant at a value of approximately 0.41, which is the minimum value allowed for this parameter.This behaviour is consistently reproduced for any of the values that we explored as we will see below; the only situation in which we did not observe this behaviour was by attributing an unreasonably high weight to the economic cost, in which case all the lockdown measures are lifted, leading to a large increase of the number of infected people and to a second wave of the epidemics.To better understand this phenomenon, we inspect the contact matrix relative to the "other locations" category (Fig 3).There, it can be observed that the setting in which there are more contacts between the part of the population in which our optimisation strategy acts, (under 60 year-olds),and the older population is the "other locations".As the latter are the most vulnerable to the disease, the optimisation is limiting the number of contacts they have by reducing m other .The last property we investigate is the effect of varying the values of for each category, which can produce a more economically viable solution.In the cases we will explore, we observe that opening workplaces is clearly an important part of the economy, and this might not be achievable without opening schools as well.For this reason, we will attribute the highest economic costs to closing workplaces, followed by schools and then others.We point out that this is ultimately a choice of the policy maker, and in principle any combination of values for each could be considered.We tested a variety of combinations of these cost weights, each providing us with a lockdown strategy.In this case, we compare the efficiency of each strategy in Figures 9 and 10.As can be seen in Fig. 9, all of the strategies propose to open workplaces and schools earlier or later (or not at all, in the case of schools) depending on the relative weights between sanitary and economic costs, and, as before, keep other locations at its minimum value.We show an additional comparison in Fig. 10, where we plot the reproduction number R(t) resulting from each strategy, compared with what it would be if the mobility values remained unchanged from their estimated values on the 29th of May.We note that the optimal strategies always keep the value of R smaller than 1 for most of the optimization horizon, but the confidence intervals allow for values larger than 1 at the end of the optimization window.Dynamic update of the model and optimal control strategy As new data becomes available, we can re-perform the model fit, and obtain a new posterior distribution on the parameters to use in order to find the optimal mobility values.An instance of this dynamic update can be seen in Fig 11, where we show the optimal mobility values and the corresponding basic reproduction number after the model parameters and the optimal solution are updated at four instances: the 11th of April, 26th of April, 11th of May and 23rd of May.Here we used ( s , w , o ) = (150, 300, 3) as from the results in Figs 9 and 10 it can be seen that such a choice leads to a lockdown strategy that increases the mobility towards both workplaces and schools while keeping the value of R(t) < 1.
Figure 11: Dynamically updated control strategy: we fit the model on data for England up to t obs = 11th of April and determine the optimal mobility strategy up to the next observation point t obs = 26th April.Data until the latter is used to repeat the procedure, in order to update the optimal control strategy exploiting newly available information.Here ( s , w , o ) = (150, 300, 3).In panel (a), we show the resulting optimal mobility values, with the corresponding values of R(t) and their credibility intervals shown in panel (b).We observe that the lockdown strategy determined with parameter values fitted on data up to the 11th of April is extremely restrictive, as the predicted dynamics on that date badly overestimates the number of deceased and hospitalized people (see Appendix B).However, we see that recalibrating the parameters and updating the lockdown strategy using newly available data is beneficial: using data until the 26th of April results in a better prediction of hospitalized and deceased numbers, and we see an increase of the mobility towards schools already at the end of April.We repeat the procedure on the 11th and 23rd of May, resulting on a proposed strategy allowing schools to be open even further at the first instance, and then remaining almost unchanged, with an increased mobility towards workplaces around mid-June.We also note that, with the exception of the time interval between the 11th and the 26th of April -where the high value of R(t) is due to the fact that for the current values of the parameters the epidemic is predicted to increase rapidly, independently of the lockdown measures -the predicted R(t) value always stays below 1.This dynamic update of the model allows us to rely less on long-term predictions from the model, which are more computationally expensive and become more biased the farther in the future the prediction is.Together with the knowledge on model-specific biases coming from fitting the model to different horizons in the past, this approach enables a policy maker to assess the validity of the proposed optimal mobility strategy and update it on a periodic basis as new data becomes available.

Optimal strategy for England and France on 1st of September
After exploring all the different properties of our optimal control strategy, we proceed to test it in two populations, England and France, in a more recent setting, which is closer to a second wave of the epidemics.We fit the model again with data from the PHE and the NHS for England, and from SpF for France, and using the Google mobility data in both cases.The inference was performed on both cases for data up to the 31st of August and the results are presented in Fig 12 below.
Optimal lockdown strategy for England starting in September 2020: We apply our optimal control strategy again for the population of England, for a more recent time interval.As mentioned above, the parameters were fitted up to the 31st of August, and we apply a lockdown strategy for T h = 120 days, starting on the 1st of September.The prediction horizon was kept at T opt = 30 days and we used two  Optimal lockdown strategy for France starting in September 2020: Our methodology is valid in general, and in particular it can be applied to different datasets.Our last example applies the strategy to the French population, where the epidemics has exhibited a different evolution than in the UK.As in the previous paragraph, we fitted the parameters with data up to the 31st of August, and apply the lockdown strategy for We observe that in this case the relative values of have a much stronger influence, with the lockdown strategy shown in the left panel having extremely better results than that on the middle panel.This is visible both on the number of hospitalized people and the resulting reproduction number, and illustrates the importance of the choice of these parameters and associated discussion with policy makers.We reiterate that our methodology for both England and France recommends to keep mobility of "others" as low as possible if not at the lowest possible label of full lockdown.Remembering that the others category refers to "retail", "transit" or "grocery" locations, our methodology is recommending that the we should keep a stringent lockdown until December, while opening up the work and school places in a socially-distanced manner.Deviations from that will likely result in a second-wave of infection spreading.We have proposed a general estimation/control methodology for the determination of optimal lockdown strategies in the context of the COVID-19 pandemic.Our approach is composed of the following elements: system dynamics described through an age-structured SEIRD model, the use of public data such as Google Mobility for estimating model parameters, and the design of adaptive lockdown strategies in the framework of nonlinear optimal control.The current work focused on a study of the COVID-19 pandemic in England and France, however the underlying methodology can be extended to other spatio-temporal locations.It can be applied to epidemiological models in general, assuming the availability of healthcare and mobility data for a suitable calibration of the dynamics.Our systematic approach provides a computational tool to assist the design of lockdown strategies which can be periodically rectified as the model is fed with incoming public data.Moreover, the proposed strategies are parsimonious in the sense that they encode both healthcare and socio-economic factors, and realistic as they are expressed as mobility reduction parameters which can be effectively measured, as opposed to switching on-off strategies.Arguably, our model does not capture the increase in the testing capacity.It is reasonable to assume that a larger proportion of people in the subclinical compartment get tested; ideally, these people will adhere to stringent social isolation regimes with the expectation to slow down the spread of the disease.However, how the latter affects the evolution of the epidemics when reduced mobility measures for all citizens are already in place remains unclear.As we continue to work on our approach, a natural way to improve the accuracy of our model is through a further downscaling of our dynamics, for instance by considering 418 principal local authorities (LA) in the UK, along with a commuting network of UK citizens between them constructed from the 2011 census data.Such a refined model, whose numerical treatment will necessarily require the use of high-performance computing resources, would allow the design of space-time adaptive lockdowns.Our progress along these lines will be documented on the companion website [4].

A Appendix: Additional mathematical details A.1 Computation of R(t)
We explain here how to compute the reproduction rate (R) for our model, which is considered as one of the main measures to quantify the spread of an epidemic and the efficiency of our lockdown strategies proposed in the Results and Discussion Section in the main body of the paper, and measures the average number of secondary infections an infected individual is capable of generating in a fully susceptible population.Our computation strategy follows [15]; first, we define the next generation matrix, which relates the numbers of newly infected individuals in the various categories in consecutive moments, before and after a contact (in our case, a time step) in the case where one single person interacts with a fully susceptible population.To do this, fix S i = N i and consider the Jacobian matrix of the infection subsystem -which is composed by states E i , I SC1 i , I SC2 i , I C1 i , I C2 i , i = 1, . . ., 5 (we note that the states R i and D i were removed, as they are "final" states: once an individual is in one of these states, they remain there).Using the same notation as in [15], this Jacobian has the following form , where we note that I is the 5 × 5 identity matrix, and each zero corresponds to a 5 × 5 zero matrix.We also slightly abused the notation in the other entries.For example, the matrices SC N , ρκI, etc., are 5 × 5 matrices whose entries are given by where δ ij is the Kronecker delta, and similarly for the other matrices.The matrix J 0 is decomposed into the transmission matrix, T , and the transition matrix, Σ, defined as follows: The next generation matrix is then defined as K L = −T Σ −1 .The reproduction number R is the spectral radius of this matrix, namely the absolute value of its largest eigenvalue, which represents the secondary number of infections a single infected individual is capable of generating in a fully susceptible population.
As T depends on the contact matrix C, the reproduction number will be a function of the Google mobility for different locations (m school (t), m work (t) and m other (t)).Hence, we will focus on studying the evolution of R(t) over time.

A.2 Details on Approximate Bayesian Computation
Approximate Bayesian computation (ABC) gives an approximation of the posterior distribution of the parameters, starting from a prior π(θ) and a (possible stochastic) simulator model M (θ), for which the likelihood p(x|θ) cannot be computed.Specifically, the true posterior is obtained via Bayes' theorem as π(θ|x obs ) = π(θ)p(x obs |θ) p(x obs ) .
In ABC, we approximate this expression by looking for the values of the parameters which best approximate the observations.The fundamental ABC rejection sampling scheme iterates the following steps: • Draw θ from the prior π(θ).
• Simulate a synthetic dataset x sim from the simulator-based model M (θ).
Here, the metric on the dataspace d(x sim , x obs ) measures the closeness between x sim and x obs .The accepted values of θ are thus sampled from a distribution π ABC (θ|x obs ) ∝ π(θ)p d,γ (x obs |θ), where p d,γ (x obs |θ) is an approximation to the intractable likelihood function p(x obs |θ): Here, K γ (d(x sim , x obs )) is a probability density function proportional to 1{d(x sim , x obs ) ≤ γ}, 1{•} being an indicator function which equals 1 when the condition in the brackets is true and 0 otherwise.This guarantees that, in principle, the above approximate likelihood converges to the true one when γ → 0. In this paper, we used the PMCABC algorithm [10] as implemented in the Python library ABCpy [14] allowing efficient parallelization using MPI, to perform parameter inference.This is an iterative algorithm considering a set of points {θ i } which are given a certain weight representing how much the sample x i generated by each of them is close to the observation.The algorithm proceeds by iteratively perturbing the parameters and performing simulations from the model, and reducing the threshold γ so that the approximation to the posterior distribution improves.At the end of the algorithm, a weighted set of parameter points which are samples from the approximate posterior π ABC (θ|x obs ) is returned.
For the sake of calibrating the model, we want to match the number of people in I C (summed over all age groups) and the daily deaths, by date of reporting, for each of the 5 considered age groups; the model is structured so that it returns those values as outputs.Therefore, we consider the set of variables x = ((∆D 1 (t), ∆D 2 (t), ∆D 3 (t), ∆D 4 (t), ∆D 5 (t)) T t=1 , (I C tot (t)) T t=18 ) where we denote by t the day since the start of the dynamics, by ∆D i (t) = D i (t) − D i (t − 1) the deaths occurring on day t in age group i, and we consider I C tot (t) = The corresponding observation is denoted as: x obs = ((∆D obs 1 (t), ∆D obs 2 (t), ∆D obs 3 (t), ∆D obs 4 (t), ∆D obs 5 (t)) T t=1 , (I C,obs tot (t)) T t=18 ).
As discussed in the Model Calibration Section in the main body of the paper, data on I C is available only from the 18 th of March, and therefore we discarded the first 17 days for the corresponding observations.We now define a distance for ABC that enables us to make use of the above data, by relying on a weighted sum of the pointwise Euclidean distances between the different elements of the trajectories.Specifically, let us denote the pointwise Euclidean distances between the different elements of simulated and observed data by: Using this, the final distance we consider is: where w D,i and w I are weights which we can fix according to considerations on the speed of convergence.It is clear that the above, being a combination of Euclidean distances, is a valid distance for x for each choice of the weight; the latter however are important for the ABC algorithm in practice, as they force the algorithm to constrain more or less on some of the distances.The weights we have found to work best for the problem at hand (as they gave faster convergence) are: w D = (1, 1, 1, 2, 2) and w I = 0.1.

B Appendix: Additional experimental results
B.1 Proof of concept: England data until 23rd May

B.1.1 Inferred parameters
We calibrate our model on data up to four different days representing different stages of the epidemic (11th of April on the peak of the epidemic and further towards the tail, up to the 23rd of May, as shown in Fig 15).The amount of available data over those four days affects how well our model was able to fit the data and predict the future.We report the posterior mean and the corresponding standard deviation for each model parameter and for each observation horizon t obs in Table .1.We notice that the estimated values differ slightly over the days, but they are quite similar on 11th and 23rd of May as in those days we do not observe any significant changes in the dynamics of the epidemics.As point estimates are not able to capture the correlations present between parameters, we report the inferred joint posterior distributions using data until 23rd May between pairs of parameters in Figures 16,17  Figure 16: Single and bivariate marginals (represented using contour plots) for the parameters describing the transition between states for England, obtained using data until 23rd May; note for instance the negative correlation between β and d R ; this is expected from the dynamics of the model (the longer a person stay in the infectious state, the more people can infect; therefore, a similar dynamics can be obtained by a lower value of β).
Table 1: Estimated posterior mean and standard deviation of model parameters for England, using the different horizons for fitting the model.We note the large standard deviation for the ρ and ρ parameters related to younger age groups, with respect to the older age groups.This is due to the fact that less information is available with regards to the severity of infection for those age groups, thus rendering the estimate harder.This also proves the ability of our technique to assign meaningful uncertainty ranges.We also point out that the estimated standard deviation is larger than the posterior mean for some of the parameters; this is due to the fact that the posterior distribution is not centered on the posterior mean but skewed.
Observation period

B.1.2 Evolution of the epidemics predicted by our model
As devising a successful lockdown strategy depends on the accuracy of the prediction of our model, we compare the predicted number (median prediction and 99 percentile credibility interval) of hospitalized people and daily deaths of our calibrated model with real data on the four observation horizons considered here.To do this, we integrate the dynamical model using i.i.d.posterior samples from the ABC posterior distribution on each of these days and we show our results in Fig 19.In this way we are able to highlight how providing additional information to the model changes its predictions for the future.Additionally, we provide the estimated basic reproduction number R (Section 2 of Supporting Information S1) from the dynamics obtained from the posterior sample points, and plot that with the relative credibility bands.We note that, with the exception of the model calibrated on data until the 11th of April, the estimated reproduction number has a value larger than 2 before the introduction of containment measures by the government, which then decreases below 1 a few days afterwards, and remaining below one until the end of the training period.We also note that towards the end of the training period, the credibility interval includes some values above 1, due to the fact that the values of people's mobility have increased in later weeks.Finally, plots comparing the daily number of deaths with real data stratified by age groups are reported in Sections B.1.3.
The results presented in Fig 19 highlight that it is hard to forecast precisely the evolution of the epidemic with a simple compartmental model as the one we consider.This can be seen from the fact that using different observation horizons to determine the parameters of the model leads to very different predictions of the evolution; this phenomenon is extremely evident in the first line of Fig 19 (i.e. using data until the 11th of April), where the predicted number of deaths and hospitalized people is much larger than what eventually turned out to be the case (we remark that the prediction is taking into account the measured mobility even throughout the prediction horizon).We also remark that our model systematically overestimates the number of deceased in the tail of the epidemics; this could probably be explained by the fact that it does not take into account the increased capacity of the health system to fight the disease.Single and bivariate marginals (represented using contour plots) for the parameters describing the transition between states for France, obtained using data until 31st August; note for instance the negative correlation between β and d R ; this is expected from the dynamics of the model (the longer a person stay in the infectious state, the more people can infect; therefore, a similar dynamics can be obtained by a lower value of β).
Table 2: Estimated posterior mean and standard deviation of model parameters for France.We point out that the estimated standard deviation is larger than the posterior mean for some of the parameters; this is due to the fact that the posterior distribution is not centered on the posterior mean but skewed.

Figure
Figure4: Raw and elaborated mobility data in the UK.In the raw mobility data, which is scaled with respect to a baseline value representing average mobility in the months prior to the pandemics, a strong weekly seasonality is present, which we mostly removed using a Savitzky-Golay filter.Moreover, we note that the mobility towards "residential" (which is not used in our analysis) locations is larger during the lockdown months than before, as people spend more time in their homes.Finally, note the very large increase in people's mobility towards parks around the end of the winter season.That contribution is however only one of the components in our aggregated m other mobility value, so the latter does not increase that abruptly.

Figure 6 :
Figure 6: Comparison of predictions of our model with the real number of hospitalized people with COVID-19 and total daily deaths (green) on 23rd May for England.The solid red line denotes the median prediction, filled spaces denote the 99% credible interval and the vertical dashed line denotes the observation horizon.The different columns represent number of people in hospital, I C (left), daily deceased (middle) and value of R(t) (right).

Figure 7 :
Figure7: Dependence on the prediction horizon T opt for determining and optimal control strategy for England.. Here, ( s , w , o ) = (100, 100, 100) and the lockdown strategies were applied for T h = 90 days starting on the 24th of May.

Figure 8 :
Figure 8: Different relative weights between the sanitary cost and economic cost of the lockdown measures with the lockdown strategies starting on the 24th of May and applied for T h = 120 days, for England.

Figure 9 :
Figure 9: Different economic cost weights produce different opening strategies and predicted hospitalized people, with lockdown strategies starting on the 24th of May and applied for T h = 120 days, for England.

Figure 10 :
Figure 10: Evolution of R(t) corresponding to different opening strategies starting on the 24th of May, learned using different economic cost weights for different values, with 99% credibility intervals; these plots are referred to England.

Figure 12 :
Figure 12: Comparison of predictions of our model with the real number of hospitalized people with COVID-19 and total daily deaths (green) on 31st August in England and France.The solid red line denotes the median prediction, filled spaces denote the 99% credible interval and the vertical dashed line denotes the observation horizon.The different rows represent different observation horizons, while the columns represent number of people in hospital (I C compartment, left column), daily deceased (middle column) and value of R(t) (right column).

Figure 13 :
Figure13: Optimal strategy for England with parameters fitted with data up to the 31st August.The lockdown strategy is applied for 120 days starting on the 1st of September and uses a prediction horizon of T opt =30 days

T
h = 120 days, starting on the 1st of September.The prediction horizon was kept at T opt = 30 days and we used two possible sets of values for .The corresponding results are shown in Fig 14, and are organised in a similar manner to Fig 13.

Figure 14 :
Figure 14: Optimal strategy for France with parameters fitted with data up to the 31st August.The lockdown strategy is applied for 120 days starting on the 1st of September and uses a prediction horizon of T opt =30 days

Figure 15 :
Figure 15: Evolution of the epidemics in England, together with ending dates of the observation period t obs used in this study.The red line represents the number of hospitalized people (scale on y axis on the left), while the blue line represents the daily number of deaths (scale on the right).

and 18 .
The posterior distribution is obtained by Kernel Density Estimate (KDE) on the set of posterior samples.

Figure 17 :
Figure17: Contour plots representing bivariate posterior plots for ρ i and ρ i , for all age groups (i = 1, . . ., 5) for England, obtained using data until 23rd May.For all of them, some negative correlation is present; this is very evident in the case of age group 4.

Figure 18 :
Figure18: Single and bivariate marginals (represented using contour plots) for the parameters connecting the reduction in mobility to the change of social contacts for England, obtained using data until 23rd May.

Figure 19 :
Figure 19: Comparison of predictions of our model with the real number of hospitalized people with COVID-19 and total daily deaths (green), for England.The solid red line denotes the median prediction, filled spaces denote the 99% credible interval and the vertical dashed line denotes the observation horizon.The different rows represent different observation horizons, while the columns represent number of people in hospital (I C compartment, left column), daily deceased (middle column) and value of R(t) (right column).

Figure 26 :
Figure26: Single and bivariate marginals (represented using contour plots) for the parameters describing the transition between states for France, obtained using data until 31st August; note for instance the negative correlation between β and d R ; this is expected from the dynamics of the model (the longer a person stay in the infectious state, the more people can infect; therefore, a similar dynamics can be obtained by a lower value of β).