A hybrid Neural Network-SEIR model for forecasting intensive care occupancy in Switzerland during COVID-19 epidemics

Anticipating intensive care unit (ICU) occupancy is critical in supporting decision makers to impose (or relax) measures that mitigate COVID-19 transmission. Mechanistic approaches such as Susceptible-Infected-Recovered (SIR) models have traditionally been used to achieve this objective. However, formulating such models is challenged by the necessity to formulate equations for plausible causal mechanisms between the intensity of COVID-19 transmission and external epidemic drivers such as temperature, and the stringency of non-pharmaceutical interventions. Here, we combined a neural network model (NN) with a Susceptible-Exposed-Infected-Recovered model (SEIR) in a hybrid model and attempted to increase the prediction accuracy of existing models used to forecast ICU occupancy. Between 1st of October, 2020 - 1st of July, 2021, the hybrid model improved performances of the SEIR model at different geographical levels. At a national level, the hybrid model improved, prediction accuracy (i.e., mean absolute error) by 74%. At the cantonal and hospital levels, the reduction on the forecast’s mean absolute error were 46% and 50%, respectively. Our findings illustrate those predictions from hybrid model can be used to anticipate occupancy in ICU, and support the decision-making for lifesaving actions such as the transfer of patients and dispatching of medical personnel and ventilators.


Introduction
On March 11 th , 2020, the World Health Organization (WHO) declared the COVID-19 pandemic an international health emergency [1]. Since then, COVID-19 has caused infections in millions of people [2], with a substantial proportion of infections (e.g. 9-11% [3]) requiring hospitalization in intensive care units (ICU). In multiple countries, demand of ICU beds exceeded bed availability [4][5][6], leading to excess mortality of COVID-19 patients as well as backlogs of patients for other pathologies that require hospitalization in ICU [7][8][9]. Monitoring and anticipating ICU occupancy has become critical to support decision-makers to impose (or relax) non-pharmaceutical interventions that can help mitigate the transmission of COVID-19, and thereby reduce its impact on healthcare systems. Where S (Susceptible), E (Exposed), I (Infected), P (infected but not yet hospitalized), H (= H 1 + H 2 , Hospitalized), ICU, D (Death), R (Recovered), and C (Cumulative Infected) are the model variables; R 0 the basic reproduction number, c the vaccination rate, and k the reduction/increase in transmission rate after a non-pharmaceutical intervention is introduced/ relaxed. Parameter values and their meanings are reported in Table 1. Daily vaccination rates were obtained from the public dashboard of the Swiss Federal Office of Public Health [32].
NN models to be trained on relatively small datasets that is comparatively higher than for other ML models such as deep neural networks [40].
Covariates (section 2.2.2.) were introduced in the NN with a lag corresponding to the maximum correlation with the response variable via cross-correlation. In particular, the lag of t-Δt,. . ., t-42 was explored among the possibilities, with 42 days as the minimum value. An additional covariate, ε t , was also added to account for the autoregressive nature of process.
The NN formulation was:

PLOS ONE
Where Δt was set to 3 or 7 days; ω 0 is the intercept of the output layer, and ω 0j the intercept of j th hidden node; ω j is the weight (also known as parameter) associated with the connection from the j th hidden node to the output layer, and w j T is the vector of weights associated with the connection to the j th hidden node; Γ is the Rectified Linear Unit (ReLU) activation function; x is the vector of covariates. The size of the of the hidden layer, determined by the number of hidden nodes, was optimized together with other hyperparameters. Specifically, hyperparameters are different from parameters: parameters are learned during model training, while hyperparameters need to be optimized externally to model training (see section 2.3.).

Covariates.
These covariates were used to predict ICU occupancy ( Table 2): i) the number of COVID-19 cases; ii) the number of COVID-19 cases associated with the Alpha variant (better known as UK variant); iii) the level of non-pharmaceutical interventions (e.g., school closures, workplace closures, and travel bans) as identified by the Containment and Health Index (i.e., a subindex of the Stringency Index [41]); and iv) the mean daily air temperature.

Model training and performance evaluation
We adopted a temporal cross-validation scheme [46] similar to the one used by Vollmer et al [29]. This scheme (Fig 2) allowed us to train and evaluate the performance of the hybrid model multiple times (n = 85 for the prediction at 3-days and n = 36 for the prediction at 7-days) over the simulated period.
The scheme works as follows: First, the time series is divided in three successive time windows: the training set, validation set, and test set (Fig 2). The training and the validation sets are used for the optimization of Table 2. List of covariates.

PLOS ONE
the parameters of the SEIR model (one κ, the reduction applied to R 0 , estimated separately for of the 3 phases), as well as the optimization of hyperparameters of the NN (i.e., the number of nodes in the hidden layer, learning rate, and dropout rate). A training-validation split of 90%/10% is adopted. The initial training and validation set included data from the 8 th of October, 2020 until the 6 th of November, 2020, in order to meet a minimum amount of data for model training.
Second, the performance of the trained model is evaluated on the test set using the mean absolute error. The test set consisted of n = Δt values, with Δt equals to 3 and 7 for the predictions at 3-and 7-days ahead, respectively.
Third, the training-validation set is expanded to include the test set of the previous iteration.
At the end of the iterative validation scheme, the overall performance of each model is estimated using the average MAE across iterations on the successive test sets (average MAE on red block for iteration 1 to n Fig 2).
In step 2, the optimization of the SEIR model was performed using maximum likelihood (Nelder and Mead algorithm [47]) on the complete training-validation set; residuals of the SEIR model are then calculated. The optimization of the hyperparameters of the NN was done as follows: a sampling space of 100 combinations of hyperparameters was generated using a Latin Hypercube [48]. A back-propagation algorithm (based on gradient descent) was used as a learning algorithm to modify the values of the weights and obtain the best matches possible between the true and estimated values of the residuals of the SEIR model in the training set. The mean absolute error (MAE) was used as fitting criteria on the validation set, and an early stopping mechanism was applied to stop the learning algorithm if the MAE did not achieve a decrease of 5 units within 100 epochs (i.e., number of iterations that the learning algorithm worked through the training set). The largest number of possible epochs was set to 4,000. We accounted for the stochastic nature of the optimization by repeating the simulation 10 times for each combination of hyperparameters. For each set of 10 simulations, we calculated the mean and standard deviation of the MAE in the validation set. The combination of hyperparameters that generated the minimum mean MAE in the validation set was selected as optimal for the NN.
Performance evaluation was evaluated on the test set as follows: i) predictions of the SEIR model ( d ICU tþDtÞ ) were based on the parameters inferred in the training-validation set (extrapolation); ii) predictions of the NN (b ε tþDtÞ ) were obtained after training a NN with the optimal combination of hyperparameters on both the training and validation set; iii) the sum of the two contributions ( d ICU tþDtÞ þ b ε tþDtÞ ) was compared with the observed ICU occupancy. Confidence intervals for the NN were generated using the standard deviation calculated on the validation set, while they were obtained as described in Zhao et al. for the SEIR model [30], with the 2.5% and 97.5% quantiles of the 10,000 predictions.
Furthermore, the predictions (and accuracy evaluated via the MAE) of the hybrid model were compared to that of the SEIR and NN model independently.

Downscaling at hospital level
Model predictions were obtained at the national-and cantonal-level, and from the cantonallevel downscaled to the hospital-level. Particularly, cantonal-level predictions were downscaled based on the percentage of occupancy of ICU beds in each hospital, calculated as moving average of the past Δt days. The ICU occupancy data of Swiss hospitals were provided daily by the Coordinated Sanitary Service of the Swiss Armed Forces, and refer to the number of ICU beds occupied by COVID-19 patients. The canton of Zurich was selected for model testing, since it is the most populated Canton in Switzerland with 15 hospitals with ICU.

Importance of covariates
We determined the relative importance of each covariate in making predictions for the hybrid model. Concretely, we computed a Deviation metric (2) between the MAE of the full model including n = 5 covariates with that of a reduced model with n = 1 covariates [49]. The Deviation was calculated on the test set at the end of each epidemic phase. The procedure was repeated n = 5 times excluding one covariate at a time. The Deviation (%) was calculated as follows: A positive Deviation signifies that the excluded covariate was important for the model. Specifically, a positive Deviation corresponds to a decreased accuracy of the reduced model compared to the accuracy of the full model that included all covariates.

Model comparison
We compared 3 types of epidemic models (i.e., SEIR, NN and hybrid,) to predict short-term (3 and 7 days ahead) ICU occupancy at the national-and cantonal-level.

Relative importance of covariates
The relative importance of covariates during each of the three phases is reported in Fig 5. In phase 1, a negative Deviation (marked with an asterisk in the Figure) was observed for a majority of the covariates (i.e., COVID-19 cases, proportion of COVID-19 cases associated to the alpha variant, Index of Containment and Health, and mean environmental temperature), meaning that their exclusion from the full model improved prediction accuracy. Conversely, the autoregressive covariate was important for making predictions, with Deviations equal to 92% and 66% for the hybrid and NN model, respectively. In phase 3, all of the covariates were informative; in this last phase, the NN predictions were more affected by the exclusion of covariates in comparison to the hybrid model. The average Deviation was 230% and 53%, for the NN and hybrid model, respectively.

Prediction accuracy
In this study, we showed increased prediction accuracy of ICU occupancy using a hybrid model combining a SEIR and a NN model. The model developed here could help guide interventions against future COVID-19 epidemics. At a national-level, during phase 1 (19 th of

PLOS ONE
October, 2020 -15 th of January, 2021) the overestimation of ICU occupancy by the SEIR model could be associated with its intrinsic nature to predict exponential growth at the beginning of a new wave. In contrast, the SEIR model underestimated the ICU occupancy during phase 3 (14 th of April, 2021 -1 st of July, 2021). This could be explained by the fact that the model lacks important covariates such as temperature, which may have been responsible of an increase of cases during the winter and thus for an increased ICU occupancy. For the NN model, its worst performance was observed during phase 1, where abrupt oscillations occurred. These oscillations could be attributed to the short time series available for model training at that stage, thereby compromising model training and limiting predictive performance. This interpretation is supported by the fact that the prediction accuracy of the NN model improved during phase 2 (15 th of January, 2021 -14 th of April, 2021) and 3 (14 th of April, 2021 -1 st of July, 2021), when longer time series became available for model training.
At the national-and cantonal-levels, the SEIR model was unable to capture the increase in ICU occupancy that occurred two months after the beginning of the second-dose vaccination campaign. The causal mechanisms behind this trend remains unclear, but may be associated with other drivers such new variants (e.g., Delta variant) that are not incorporated in the SEIR

PLOS ONE
model. In contrast, the hybrid and NN model could capture this trend, suggesting that both models succeeded in learning potential non-linear relationships between covariates and occupancy of ICU.

Relative importance of covariates
The fact that the relative importance of each covariate for our models changed between phases has multiple possible interpretations. The first is that a covariate is important for making predictions during one phase, while it is not important for another phase. For example, the proportion of the Alpha variant was not informative during phase 1 when its prevalence was < 10% of the total confirmed COVID-19 cases, while it was informative during phase 3, when its prevalence was > 50% of the total confirmed Covid-19 cases. The second reason could be associated to the length of the time series. For example, the model had limited data for training during phase 1, while the amount of data tripled for phase 3. This could have caused the full model (i.e., with 5 covariates) to perform worse than the reduced model (i.e., with 1-5 covariates) [50,51], leading to negative Deviation.
During phase 1, the autoregressive term was the only informative covariate, meaning that the models behaved similarly to an Automatic Regressive Integrated Moving Average

PLOS ONE
(ARIMA) model. Furthermore, on average, the Deviation associated with the hybrid model was always lower than the one associated to the NN. This means that the hybrid model was more robust in the exclusion of a specific covariate compared to the NN.

Possible applications and limitations
In Switzerland, a number of studies have focussed on providing long-term (>2 weeks) [52][53][54] and short-term (<2 weeks) [36] predictions using MMs. Predictions have predominantly on the national scale; while many of the lifesaving actions (e.g., transfer of patients) need to be planned at the cantonal-(provincial-level) or hospital-level. In this study, we attempted to increase the prediction accuracy of a SEIR model by coupling it with a NN, generating a socalled hybrid model. Among all the possible ways to combine a MM with a ML model, we opted for a configuration called residual modelling. In particular, we used a SEIR model for predicting occupancy of ICU beds under future scenarios at different geographical levels (national, cantonal, and hospital) in Switzerland; we trained a NN to supplement these predictions using the information embedded in covariates (temperature, etc. . .). This modelling framework could be applied in other geographic regions for which a MM (e.g., of the SIR family), and spatially explicit covariates are available. Specifically, different extension of the SIR model [14] can be used, from simple examples (like the SEIR used in this study), to increasingly complex frameworks such as SIDARTHE [12]. As for the ML model, we used a feed-forward NN with a single hidden (see section 2.2.1.). However, alternative formulations could have been chosen. For example, Maher Ala'raj et al. [27] coupled an ARIMA model, a very popular ML model for time series forecasting with a SEIRD model; Watson et al. [17] embedded a Bayesian time series model and a random forest algorithm within a SIRD model; Rahmadani and Lee [28] combined a deep-learning algorithm with a SEIR model.
As with any modelling study, our analysis also comes with limitations. For example, training of the NN is often computationally intensive and the selection of optimal hyperparameters is based on empirical rules such as try-and-error approaches [46]. In this study, the optimization of hyperparameters required a significant effort in terms of computational cost. Specifically, all simulations were run in parallel on ETH High Performance Computing facilities (Euler cluster) [55], requiring, on average, 1 minute per simulation on one CPU, and thus 15 CPU minutes for each iteration running simultaneously on 15 CPU cores. We optimized three hyperparameters, namely the number of nodes in the hidden layer, learning rate, and dropout rate; however, other hyperparameters such as the type of activation function, the number of batches, the number of epochs, etc., could also have been subjected to optimization. Furthermore, other type of search algorithms such as the sequential model-based optimization (SMBO, also known as Bayesian optimization) [56] could have been explored. Another drawback of the residual modelling configuration is the inability to enforce real-world constraints (e.g., ICU beds � 0), since the residuals are modelled instead of based on the actual ICU occupancy. One possible alternative could be to combine the SEIR model and the NN in series. In this case, the NN estimates intermediate variables to be used in the SEIR model, although it would impose structural changes on the SEIR model based on the variables selected, which may be challenging to implement.
As for the downscaling at the hospital-level, we used a simple method based on the moving average to downscale predictions at the cantonal-level (see section 2.4.), demonstrating a satisfactory degree of accuracy in hospitals in the Canton of Zurich. However, this method requires the availability of ICU beds at the hospital-level, which is not always the case. Consequently, more complex methods could be tested. In particular, Zhao et al. [30] presented a method to distribute ICU patients based on travel time from the location of the patient to the hospital. In the future, our modelling framework can be updated as growing knowledge is gained on the covariates associated with the spread of COVID-19. For example, new covariates such as other virus variants and mobility patterns in different regions (e.g., people coming in and out of Switzerland) could be included to improve predictions. Lastly, the framework could be applied to improve predictions of other infectious diseases, for which a MM already exists.