Optimal control of the spatial allocation of COVID-19 vaccines: Italy as a case study

While campaigns of vaccination against SARS-CoV-2 are underway across the world, communities face the challenge of a fair and effective distribution of a limited supply of doses. Current vaccine allocation strategies are based on criteria such as age or risk. In the light of strong spatial heterogeneities in disease history and transmission, we explore spatial allocation strategies as a complement to existing approaches. Given the practical constraints and complex epidemiological dynamics, designing effective vaccination strategies at a country scale is an intricate task. We propose a novel optimal control framework to derive the best possible vaccine allocation for given disease transmission projections and constraints on vaccine supply and distribution logistics. As a proof-of-concept, we couple our framework with an existing spatially explicit compartmental COVID-19 model tailored to the Italian geographic and epidemiological context. We optimize the vaccine allocation on scenarios of unfolding disease transmission across the 107 provinces of Italy, from January to April 2021. For each scenario, the optimal solution significantly outperforms alternative strategies that prioritize provinces based on incidence, population distribution, or prevalence of susceptibles. Our results suggest that the complex interplay between the mobility network and the spatial heterogeneities implies highly non-trivial prioritization strategies for effective vaccination campaigns. Our work demonstrates the potential of optimal control for complex and heterogeneous epidemiological landscapes at country, and possibly global, scales.

• The full model is the COVID-19 model designed in [1,2]. This model is based on ODEs, it includes full connectivity fluxes among nodes which are estimated from mobility data, and it is implemented in MATLAB using the explicit Runge-Kutta integration scheme of order 4 with adaptive stepsize. Using data assimilation, we obtain the temporal variations of the regional transmission parameters of this model.
• The simplified model used for optimal control is an approximation of the above model, integrated using an explicit Runge-Kutta 4 method with fixed step size. We simplified the problem by limiting the connectivity to the largest mobility fluxes (see Fig 1B of the main manuscript) and optimizing only one realization of the posterior. This model is implemented in Python with the CasADi algorithmic differentiation framework [3]. The details of the simplifications are described below.
The simplifications introduced in the second model are necessary to solve the optimal control problem in a reasonable time. To adapt our framework to another model/country, one would need to update the "true" model to a suitable candidate (which could be a stochastic model, a Hidden Markov model, or any other kind) and design a tractable approximation of this new model to be solved by optimal control. To evaluate the effectiveness of our approach, we first compute the optimal vaccination course that minimizes the objective based on the simplified model. Then, we assess this strategy and the alternative ones on the full model, for different posterior realizations. If the simplified model is sufficiently accurate, the performance loss is small and the optimal solution outperforms any other vaccination strategy on the full model, as shown in our results.
In the text below, we first detail the full COVID-19 model, then we describe the optimal control framework and the simplifications we introduced to bring the problem to a tractable form. Afterwards, we detail the data-assimilation procedure used to determine the model parameters, and we describe the algorithms defining the alternative strategies. Finally, we provide additional results, a sensitivity analysis, and an in-depth analysis of the optimal solution.

B COVID-19 transmission model B.1 Model equations
The optimal control framework may be used with any compartmental SARS-CoV-2 transmission model that can be approximated by ordinary differential equations. In this work, we consider a complex epidemiological model developed in previous work to describe the first wave of COVID-19 infections in Italy [1,2]. The model subdivides the Italian population into the 107 Italian provinces and connects them on the basis of human mobility fluxes. In each province, the human population is further subdivided according to its infection status into the epidemiological compartments of susceptible S, exposed E, pre-symptomatic P (incubating infectious), symptomatic infectious I, asymptomatic infectious A, quarantined Q, hospitalized H, recovered R, dead D, and vaccinated V . The possible transitions between these compartments are shown in Fig 1A of the main manuscript. Individuals in compartments P , A, and I are infectious and contribute differently to the force of infection, driving susceptible S into exposed individuals E.
Dynamics The COVID-19 transmission dynamics are described by the following set of ordinary differential equations in each node i: Let N i be the population of province i. Susceptible individuals get exposed to the pathogen at rate λ i (t), corresponding to the force of infection for community i, thus becoming latently infected (but not infectious yet). Exposed individuals transition to the post-latent, infectious stage at rate δ E . Post-latent individuals progress to the next infectious classes at rate δ P , developing an infection that can be either symptomatic-with probability σ-or asymptomatic-with probability 1 − σ. Symptomatic infectious individuals recover from infection at rate γ I and may seek treatment at rate η. Asymptomatic individuals recover at rate γ A . Infected individuals who sought treatment are either hospitalized (rate 1 − ζ) or quarantined (rate ζ) at home and are considered to be effectively removed from the community, thus not contributing to disease transmission. Individuals who recover from the infection are assumed to have long-lasting immunity to reinfection at the timescale studied, but possible loss of immunity can be easily included in the model. Hospitalized individuals die at rate α H and recover at rate γ H .
Vaccination Individuals in compartments S, E, P, A, R might receive vaccine doses. If the chosen strategy allocates v i (t) doses in node i at time t, the vaccination rate is: .
Vaccinated individuals are moved at rate r v i (t) from their original compartments to compartment V , where they do not contribute to the infection anymore.
Force of infection In addition to the province's local dynamics, the force of infection also considers that local susceptibles may enter in contact with infected individuals that are traveling, and oppositely, susceptible commuters may become infected through contact with local infected. We split the force of infection λ i (t) as the sum of the local force of infection λ L i (t), from infected in node i and a mobility-driven force of infection from the network The local force of infection reads: and the influence of other provinces on province i is written as: where β 0 is the baseline transmission rate, and the parameters ϵ A and ϵ I represent the reduction of transmission respectively for asymptomatic and symptomatic individuals with respect to pre-symptomatic individual transmissions. Matrix C(t) accounts for mobility: each element C i,j (t) of the matrix (i ̸ = j) represents the proportion of contacts among individuals moving from i to j, while the diagonal elements C i,i (t) are the proportions of contact for individuals in node i that do not move: where p i is the fraction of mobile people commuting from node i to the other nodes, q ij is the fraction of mobile people between nodes i and j, and r is an additional parameter describing the fraction of time spent traveling (here set to half a day: r = 0.5 days). Fractions p i and q ij were estimated from the commuting mobility assessment of Italy performed in 2011 by the Italian National Institute of Statistics (ISTAT, data available at https://www.istat.it/it/archivio/139381).

Force of infection, simplified model
The force of infection of the optimal control model is slightly simplified. We observe while simulating our model that . This is exploited to simplify the model for the optimal control: we update λ M i (t) from other nodes every day whereas λ L i (t) is updated at each integration step (hundred of times per day), loosening the coupling between nodes. This is a very important step in making the optimal control problem tractable.
Google mobility reports In this work, we assume that the initial local fraction of mobile individuals p i , in the following indicated with p i0 , changes during the simulation due to the mobility restrictions imposed by the national and regional governments during the epidemic. We adopted the mobility trends for places of work estimated in the Google COVID-19 Community Mobility Reports [4] as a proxy for the reduction in mobile individuals in a province i and day t, where g i (t) is the percentage of change in mobility in province i and day t with respect to the mobility at the beginning of February 2020, as reported by the Google COVID-19 Community Mobility Report (shown in Fig G).
The model further exploits Google COVID-19 Community Mobility Report estimates of mobility reduction as a proxy of changes in individual awareness and social distancing. This is represented in parameter β i (t), a spatially distributed and time-varying parameter describing site-and time-specific variations in transmission due to non-pharmaceutical interventions or other exogenous factors such as variants. At a given day t, we pose where parameter ϕ i (t) changes at the regional level and is calibrated in the data assimilation procedure in order to fit the hospitalization data (see section Data assimilation and model parameters, Fig G) The objective for our model is to minimize the total incidence of infections, i.e., t f ti i λ i (t)S i . Note that for the present model, this is equivalent to minimizing the total deaths or hospital admissions, as without risk classes the sizes of these two compartments are proportional to each other.

B.2 Spatial set-up
The modeling tools described in the following sections are applied to the Italian COVID-19 epidemic at the scale of second-level administrative divisions, i.e., provinces and metropolitan cities (currently, as of 2021, 107 spatial units). Official data about the resident population at the provincial level is produced yearly by the Italian National Institute of Statistics (Istituto Nazionale di Statistica, ISTAT; data available at http://dati.istat.it/Index.aspx?QueryId=18460). The latest update (January 1, 2019) has been used to inform the spatial distribution of the population.
The data to quantify nationwide human mobility come from ISTAT (specifically, from the 2011 national census; data available online at https://www.istat.it/it/archivio/139381). Mobility fluxes, mostly reflecting commuting patterns related to work and study purposes, are provided at the scale of third-level administrative units (municipalities) [5,6]. These fluxes were upscaled to the provincial level following the administrative divisions of 2019 and used to evaluate the fraction p i of mobile people in each node i, as well as the fraction q ij of mobile people who move between i and all other administrative units j (see Supplementary Material in [1]).
C Optimal control of spatial epidemiological models C.1 Optimal control method We lump the epidemiological compartments of each node i in variable ) and we define v i (t) as our control variable, representing the number of vaccines administered in node i at time t. We express the dynamics of the epidemiological model (Eq (A)) as an ordinary differential equation in each province i:ẋ where m i (t) carries the contribution of other provinces to the force of infection of node i (i.e., m i (t) = λ M i (t)). For simplicity, we drop the time dependence in the equations below, and we define the state and control variables for the full system as: where n is the number of spatial node considered (n = 107). The global dynamics for all provinces are denoted: The coupled force of infection in node i is denoted λ i . We define the running cost as the sum of total incidence of infections (transitions S i −→ E i ) for every node i, i.e., For the sake of generality, we also introduce the terminal cost M , which can in principle be used to ensure that the system is in a proper state at the end of the prediction horizon, instead of optimizing only the short-term gain. Since properly designing the terminal cost would require a long analysis, for simplicity we do not use it in this work, hence M (·) = 0.
Given states x, controls v, and dynamics F , the optimal control problem is: where we aim at minimizing the cost function over the prediction horizon T , while enforcing the modeled SARS-CoV-2 transmission dynamics (Eq (Hb) and Eq (Hc)). Moreover, constraints on June 29, 2022 5/27 vaccine availability and maximum vaccination rate are lumped in function H, which reads: where time is measured in days, and I b a is the set of all integers a ≤ k ≤ b. Eq (Ia) enforces that one can only distribute a non-negative amount of vaccine doses. Eq (Ib) states the logistic constraints, which limit the amount of individuals that can be vaccinated each day in each node to v max i ; here t d is the time at which each day starts. We assume that the daily capacity of each province is proportional to the population size of each node N i , because we assume a fair distribution of the sanitary infrastructure among provinces with regard to population, as shown in Fig A. The stockpile is materialized by Eq (Ic), which ensures that the total vaccine allocation across every node does not exceed the total availability D(t). The stockpile is replenished every Monday by the delivery of new vaccines, hence D(t) is a staircase function.
We convert our problem formulation in Eq (H) to a nonlinear programming problem using direct multiple shooting. Standard multiple shooting splits the time horizon [0, T ] using a time grid t 0 , . . . , t N , with N + 1 points and t 0 = 0, t N = T . The control function is parameterized using basis functions with local support. Common choices are a uniform time grid, i.e., t k+1 = t k + δ t and a piecewise constant control function, i.e., v(t) = v k , t ∈ [t k , t k+1 ]. The system dynamics are then discretized to obtain a discrete-time system: We perform the discretization using numerical integration techniques (such as a fourth-order Runge-Kutta scheme, with 50 steps per days) to obtain a good approximation of the true trajectory and cost. Finally, the path constraints H are relaxed and imposed at a finite amount of time instants, here coinciding with the time grid t 0 , . . . , t N . We ought to observe that, since in our case the constraints only involve the controls, we are not introducing any approximation by enforcing these constraints only on this uniform grid. The optimal control problem in Eq (H) is then approximated by the nonlinear programming problem: In Eq (J), both the states x = (x 0 , . . . , x N ) and the controls v = (v 0 , . . . , v N −1 ) are defined as optimization variables, which is a distinguishing trait of multiple shooting as opposed to single shooting.

C.2 Simplifications
The main difficulty in solving Eq (J) in the context of this paper is the large dimension of the system and the nonlinearity of the model, which can pose severe issues to the numerical solvers. In for each province. This logistic constraint bounds the maximum number of vaccines to 0.5M of doses per day, with a local rate that is proportional to the node population.
Here we show the maximum vaccination rate for each province (the constraint the solution has to comply with), in red, and the maximum rate prescribed by the optimal solution while simulating the pessimistic scenario with a stockpile delivery of 479'700 doses, in black. The optimal solution uses the maximal capacity of the logistic network, while respecting the constraint. the following, we will thus introduce a few simplifications, and we will verify through numerical simulations that these simplifications do not result in large errors in the optimal control solution (see Fig B).
We discretize the optimal control problem using a uniform grid with sampling time δt = 1 day. We assume that (a) vaccinations are administered instantaneously at the beginning of each day, rather than with a constant rate over the whole day; (b) the force of infection associated with mobility is constant over each day; and (c) the weakest mobility links can be pruned. Thus, each node dynamics can be made independent of the other nodes dynamics by introducing an auxiliary control variable z that is constrained to match the force of infection due to the other nodes at the beginning of each time interval. Then, the dynamics of the decoupled system in each node can be written as:ẋ where e i (x) = λ M i (t k ), i.e., the mobility part of the force of infection at the start of the interval.
Discussion on Simplification (a). We ought to remark that, realistically, vaccinations will occur over approximately eight hours per day. Our assumption, justified as a computationally convenient approximation of reality, is not a priori worse than assuming that vaccine administration takes place over the whole day. More refined approximations, while in principle possible, can pose severe issues because of the nature of the system dynamics. While for most initial values the system dynamics can be easily simulated with continuous-time vaccinations, the system becomes stiff by construction once almost the entire population has been vaccinated. In this case, numerical integration errors can drive the size of some compartments to be negative, which violates the model assumptions and makes the result of the numerical integration meaningless. The main issue in this case is that the optimizer will exploit these inaccuracies in order to reduce the cost. Therefore, this issue is much more evident when solving optimal control problems than when simply simulating the system dynamics. We have investigated some simple approaches to tackle this issue, but no technique yielded satisfactory performances. It is our impression that ad-hoc integration strategies will be required in order to reliably simulate and optimize dynamics with continuous vaccination rates. While this will be the subject of future research, the results obtained with the current approximation have yielded sufficient accuracy.
Discussion on Simplification (b). This simplification has been proposed in [7] as an approach to solve distributed optimal control problems by means of multiple shooting. In the original version, the coupling variable z is not necessarily piecewise constant, but rather piecewise polynomial. We Comparison between the incidence in the exposed compartment E (per 1'000 people) as evaluated by the model simplified for the optimal control (red) and the full epidemiological model (black). Results for the pessimistic scenario without vaccination. The exposed compartment is very sensitive and exhibits the largest error among all compartments. In spite of this, the error is very small, justifying the simplifications undertaken.

Fig C.
Simplification of the mobility matrix to obtain a sparse and tractable problem for the optimal control framework. Note that, after computing the optimal vaccination strategy, we assess its effectiveness of on the full epidemiological model. Base map layer from the Istituto Nazionale di Statistica; Istat, istat.it, CC-BY 3.0.
have observed in simulations that, for this problem, the piecewise constant parametrization yielded sufficient accuracy. We discretize the dynamics of each node using an explicit Runge-Kutta integrator of order four, with 50 integration steps per day. Our choice is motivated as a good compromise between efficiency and accuracy, evaluated in simulations comparing it to other methods such as explicit Euler and implicit Runge-Kutta integrators.
Discussion on Simplification (c). We sparsify the mobility matrix by pruning element below a threshold (see Fig C). This operation reduces the number of connection between nodes. Also in this case, we verified through numerical simulations that the introduced simplification had a small impact on the prediction and control accuracy. As seen in Fig B, these approximations yield a very small error, but make the problem considerably easier to optimize.
Possible further improvements Applying optimal control in open loop, i.e., solving the optimization problem once and applying the control input over the whole time interval, may lead to poor performance due to model inaccuracy and external perturbations. A common remedy consists in closing the loop by repeatedly solving the optimal control problem using the most updated information on the initial states. This is the principle behind Model Predictive Control (MPC) [8]. In this context, the state would be estimated on a daily, weekly, or monthly basis so as to solve again the optimal control problem and correct the optimal strategy.

C.3 Implementation of the optimal control framework
We implement the optimal control framework using the automatic differentiation framework CasADi [9], the interior-point solver Ipopt [10], and the HSL ma86 large sparse symmetric indefinite solver [11]. The full framework along with the optimization results and the analysis notebooks is available on GitHub: https://github.com/jcblemai/COVID-19_italy-vaccination-oc, and this manuscript's specific version is deposited on Zenodo with DOI: 10.5281/zenodo.6621051.
Solving this optimal control problem is both CPU and RAM intensive. For numerical computations, we used the Helvetios cluster the EPFL HPC facility (one problem per computing node, each equipped with 36 2.3 GHz cores and 192 GB RAM). On this cluster, it takes approximately four days to solve the large-scale optimal control problem just presented. It should be possible to solve even larger problems with more RAM available.  i (t) be the number of susceptible and new exposed individuals in node i (i = 1, . . . , n) and age class j (j = 1, . . . , 5), at time t, with the property that j S (j) . The new exposed individuals per age class are a fraction p i (t) of the total new exposed: where j p We estimate the coefficients c (j) i on the basis of epidemiological data. Before the beginning of the vaccination campaign (December 2020, see [12]) the total reported cases of COVID-19 in Italy where subdivided as in the following: p (1) = 12.13%, p (2) = 24.23%, p (2) = 33.92%, p (2) = 19.58%, p (2) = 10.14%.
Assuming that the depletion of susceptibles and reporting issues have a negligible impact on these cumulative statistics, we compute the coefficients c (j) i as follows: is the population in node i and age class j (data from [13]). The post-processing algorithm is initialized with S (j) i . In a day t, assuming to know S (j) i (t), the algorithm follows these steps: • Eq (L) evaluates the probabilities p • the new exposed E i (t) are subdivided per age class using Eq (K); • the number susceptibles are updated by removing the new exposed and the vaccinated susceptibles in that node and age class: • The number of deaths D i (t + τ ) per age class are computed from the exposures through an age-class specific case fatality rate µ j (equal per each node): where τ = 14 days is the average time between exposure and death (see [14]), and µ 1 = 0.01%, µ 2 = 0.04% µ 3 = 0.43%, µ 4 = 6.06%, µ 5 = 20.83% (data from [12]) (panel A), exposed (panel B) and deaths (panel C) among the five considered age classes. The algorithm provides results at the province level, which are here aggregated at the national level (see the section on age structure).

D.2 Data assimilation methods
The local transmission rates, computed as β 0 β i (t) = β 0 (1 + g i (t)/100)ϕ i (t), i.e., Eq(F), are the main parameters governing the force of infection of the model and, thus, the daily exposed individuals. To better track possible changes in the transmission rates, we adopt a data assimilation strategy based on an iterative particle filter [15] on a moving window to assess the values of the coefficients ϕ i (t).
We initialize the model state variables and parameters using the results of a previous calibration effort [2] which fits the initial conditions, the baseline transmission β 0 , as well as the coefficients ϵ E and ϵ I using a Bayesian framework for the period February 24 -May 1, 2020, on the basis of the official epidemiological bulletins released daily by Dipartimento della Protezione Civile [16] (data available online at https://github.com/pcm-dpc/COVID-19) and the bulletins of Epicentro, at Istituto Superiore di Sanità [17,18]. Bertuzzo et al. [2] presents three possible model calibrations based on three values of the symptomatic/asymptomatic infectious ratio (σ = 0.5, 0.25, 0.10). For our study we select the calibration results based on the central scenario, σ = 0.25.  Values of the transmission parameters β i (t) in Eq (F) as estimated in the data assimilation procedure (blue). The values used in the optimistic and pessimistic transmission scenarios are represented in green and yellow, respectively. The red lines represent the reduction in mobility and transmission computed using google mobility data (coefficient 1 + g i (t)/100 in Eq (F)).
In the data assimilation approach, the filter starts considering N r = 100 model realizations at time t 0 (February 21st, 2020), whose state variables are x (j) 0 , j = 1, . . . , N r , where the superscript (j) is the realization index and the subscript is the temporal index. Each realization is associated with a parameter combination that is randomly sampled from the posterior distribution evaluated in [2].
Note that all the epidemiological parameters estimated in [2], including the transmission rates, were spatially homogeneous, while possible temporal variations were imposed on fixed dates. Here, instead, temporal variations in the nodes' transmission rates are obtained by iteratively fitting coefficients ϕ (j) k,i against the regional hospitalization data on a moving window of τ = 14 days. At time t 0 , the coefficients ϕ k,i at time t k , the latter having an ensemble mean µ k,i , we run the model for τ days by sampling new regional coefficients ϕ (j) k+1,i from the truncated normal distributions (mean µ k,i , standard deviation 0.2, bounds 0.01-2.0) assuming that the regional coefficients remain constant. The regional likelihood of each realization is then evaluated during the moving window assuming that the daily hospitalizations follow a gamma distribution (as in [2]). A resampling step (systematic resampling, see, e.g., [19]) selects and duplicates the coefficientsφ k+1,i associated with the largest likelihood values. These coefficients are then used to update the mean value µ k+1,i . This procedure is iterated four times on the same temporal window by sampling from a normal distribution with updated mean value µ k+1,i and decreasing standard deviation to 0.05. This set of coefficients is used to compute state variables and parameters at time t k+1 .
The assimilation is then repeated by moving the estimation time window by one day, up to January 4, 2021 to produce the projections used in the main text. These projections are shown in Fig H, which displays the incidence for each province for the two scenarios. This view highlights the different trajectories between the optimistic and pessimistic scenarios and the spacial heterogeneity based on which the optimal control framework optimizes the vaccination strategy.

E Alternative vaccination strategies
We designed alternative vaccination strategies to establish a baseline for comparison of the efficacy of the optimal solutions. Each strategy uses a decision variable, V i , as a basis for ranking the provinces and, thus, prioritize the allocation of vaccines. The decision variable is one among: • modelled future incidence: the modelled total future incidence. This variable is updated daily during the projection reflecting the projected incidence taking into account the effects of the already allocated vaccines as projected by the full model.
• modelled initial susceptibility: the modelled number of susceptibles in each province at the start of the vaccination campaign; • the aforementioned decision variables values are taken either as absolute values or relative to population, e.g., one decision variable is incidence and another one is incidence per inhabitant, and the same holds for susceptibility.
• equal for all provinces.
Once a decision variable is selected, there are two ways in which the doses are distributed: focused and proportional.
• Focused Every day we allocate K = 1/7th of the weekly stockpile delivery as follows: every province is sorted (higher on top) according to its decision variable V i . We then allocate the maximum local rate v max i to every province starting from the top of the list until the stockpile Exposure incidence per 1000 Fig H. Projected incidence into the exposed compartment E (per 1'000 people) for the pessimistic (red) and optimistic (blue) scenarios.
June 29, 2022 16/27 is empty. In other words, we find the province index i that satisfies max i V i , and we assign to province i M i = min(v max i , K) vaccines. Then, we find the province j that satisfy max j,j̸ =i V j and we assign it M j = min(v max j , K − M i ). We iterate this procedure until no vaccine remains for allocation at the given day. This strategy will concentrate the allocation on nodes with the highest values of the considered decision variable.
• Proportional In this case, assuming that on a given day there is a quantity of vaccine K in the stockpile, we assign to each province i an amount M i = min(v max i , K · Vi j Vj ). This approach vaccinates each node proportionally to the value of its decision variable V i . Moreover, it exhibits very fast allocation as all vaccines that can be allocated in a day according to the logistic constraint are allocated the day following the stockpile delivery.
An additional strategy, proposed in [20] is developed in our benchmark. Named Greedy, it approximately optimizes the current projected impact of the allocation according to the following heuristic. For every week in the control horizon, after a delivery, we perform the following steps: 1. create 107 one-week strategies, i.e., one per province. In each strategy one province receive as many vaccines as possible, solely limited by the remaining susceptibles, the stockpile and the local logistic rate constraint for the seven days of the week.
2. Evaluate each of these strategies using the model, and compute the objective to minimize (number of infections).
3. Select the strategy that generates the highest reduction in the objective and keep it fixed in the next iterations.
4. redo steps 1.-3. until no vaccines remain in the stockpile. At every iteration, all already selected strategies are kept and thus taken into account The rationale behind this strategy is that it optimizes visible gains without requiring the full optimal control framework to be run.

F Additional results against alternative strategies
We present the number of averted infections for each proposed vaccination strategies and additional scenarios in Tables A-B, and we show them side-by-side in Fig I. The optimal strategy outperforms all the others strategies. In fact, for every given posterior realization, the optimal control solution always has the largest number of averted infections.

G Detailed optimal allocation analysis
To further investigate the features of the optimal solution, we present a linear scatter plot of the optimal proportion of vaccinated individuals per province (used as sorting variable for ordering the provinces) side by side with the province population, the projected incidence without vaccination, and the proportion of susceptible individuals at the start of the simulation. We present these results for the optimistic scenario in Fig J and for the pessimistic scenario in Fig K. We find no clear visual pattern associating these covariates to the proportion of vaccinated individuals by the optimal strategy, highlighting again that the optimal allocation uses the epidemiological variables in a non-straightforward way, different from every alternative strategy we considered. When the weekly stockpile delivery is increased, we observe in Fig L that the pattern in main text Fig 6A shifts to the right while remaining qualitatively consistent. Hence, the optimal allocation strategy is robust with respect to the overall vaccine availability constraint, and the same nodes are prioritized.  Finally, to highlight the temporal dimension of the prioritization strategy for the deployment of vaccine doses, we present a stackplot of the proportion of vaccine dose allocated in each province according to the optimal solution ( Fig M) and a heatmap (Fig N) where the spatio-temporal pattern is compared for the different scenarios. From left to right, as we go to scenarios with higher deliveries, the optimal solution makes use of the new doses by both further re-enforcing the vaccination in already prioritized provinces (like a focused strategy) and by vaccinating new provinces (like a proportional strategy).  Table B. Absolute number of averted infections for the scenarios with the largest weekly stockpile deliveries.

H Sensitivity analysis
As detailed in the Discussion, the optimal allocation is complex and highly specific: it uses every feature of the model and fully accounts for the current epidemiological state to gain efficacy over other strategies. This raises the concern that the optimal allocation might be unstable: if there are errors in model projections, one might be worried to have a significantly worse perform than with less specific strategies. In order to verify whether that is not the case, we perform a sensitivity analysis of the optimal allocation. Instead of evaluating the performance on other posterior realizations (as  Tables A-B for absolute values).
we did to generate Fig I), we evaluate it on completely different projections. These 100 projections are obtained by randomly shuffling the provinces projected dynamics (shown in Fig H). The results are shown in Fig O. As expected, we observe a strong degradation in performance for all strategies, but the optimal control strategy does not show a particular fragility under these conditions.    . The x-axis represents time (one square per day) and the y-axis space (one square per province, in alphabetical order), and the color represents the proportion of individuals vaccinated on every day in each province by the optimal solution, with black displaying the maximum logistic rate per inhabitant, which is equal for all provinces. Here the median realization of the modeled posterior is optimized (diamonds), and the comparison is done on shuffled dynamics (random allocation of each province's dynamics to another province, box plots). The results are normalized by the number of averted infections in the optimized solution (see Table A-B for absolute values).