Data-driven estimation of the instantaneous reproduction number and growth rates for the 2022 monkeypox outbreak in Europe

Objective To estimate the instantaneous reproduction number Rt and the epidemic growth rates for the 2022 monkeypox outbreaks in the European region. Methods We gathered daily laboratory-confirmed monkeypox cases in the most affected European countries from the beginning of the outbreak to September 23, 2022. A data-driven estimation of the instantaneous reproduction number is obtained using a novel filtering type Bayesian inference. A phenomenological growth model coupled with a Bayesian sequential approach to update forecasts over time is used to obtain time-dependent growth rates in several countries. Results The instantaneous reproduction number Rt for the laboratory-confirmed monkeypox cases in Spain, France, Germany, the UK, the Netherlands, Portugal, and Italy. At the early phase of the outbreak, our estimation for Rt, which can be used as a proxy for the basic reproduction number R0, was 2.06 (95% CI 1.63 − 2.54) for Spain, 2.62 (95% CI 2.23 − 3.17) for France, 2.81 (95% CI 2.51 − 3.09) for Germany, 1.82 (95% CI 1.52 − 2.18) for the UK, 2.84 (95% CI 2.07 − 3.91) for the Netherlands, 1.13 (95% CI 0.99 − 1.32) for Portugal, 3.06 (95% CI 2.48 − 3.62) for Italy. Cumulative cases for these countries present subexponential rather than exponential growth dynamics. Conclusions Our findings suggest that the current monkeypox outbreaks present limited transmission chains of human-to-human secondary infection so the possibility of a huge pandemic is very low. Confirmed monkeypox cases are decreasing significantly in the European region, the decline might be attributed to public health interventions and behavioral changes in the population due to increased risk perception. Nevertheless, further strategies toward elimination are essential to avoid the subsequent evolution of the monkeypox virus that can result in new outbreaks.


S2 Synthetic Examples
To assess the accuracy of our parameter recovery process, we utilize a synthetic data set generated by the observational model (Equation 6, main file).Our objective is to estimate the values of the parameters r, p, and C 0 .We employ Bayesian estimation by training the model with only one data window and leveraging it to predict the observed cases.Furthermore, we present a comparative analysis by contrasting the outcomes obtained from the Bayesian sequential approach.

Example 1
In this scenario, parameters r and p are constant throughout the observed period.The synthetic data was generated using the following values: r = 10, p = 0.4, C 0 = 20, θ = 0.01 and ω = 2. Figure S2 displays the resulting synthetic data.
We employ the Bayesian approach with a dataset consisting of 35 data points to train the model and make predictions.We aim to predict the next 65 days.Since the parameters r and p remain constant over the entire time interval, training the model with the initial dataset is sufficient to make accurate predictions.In this case, using the adaptive approach does not yield any additional information, as the underlying dynamics remain the same.You can observe the posterior distribution in Figures S3  and S4.The training set is represented by the gray region, while the blue region indicates the forecasted period.The solid red lines depict the median incidence forecast.The darker-shaded blue region represents the interquartile forecast range, indicating the middle 50% of the predicted values.The lighter-shaded blue region indicates the 5th to 95th percentile range, encompassing a wider range of possible outcomes.The observed cases are denoted as individual data points.

Example 2
We generate a synthetic data set assuming that the values of r, p, and C 0 change over time.We utilized simple piecewise functions.Specifically, we defined r(t) = R[k] and Fig S4 .The predicted incidence time series is generated using the sequential Bayesian approach method with a window size of 35 data points and moving every 14 days.The solid red lines depict the median incidence forecast.The darker-shaded blue region represents the interquartile forecast range, indicating the middle 50% of the predicted values.The lighter-shaded blue region indicates the 5th to 95th percentile range.The observed cases are denoted as individual data points.
We initialized the variable C 0 as 20 and applied the GGM dynamic to update its value.Additionally, we kept the parameters θ and ω fixed at 0.04 and 3, respectively.The synthetic data generated from these settings are visualized in Figure S5.6, main file) with fixed dispersion parameters θ = 0.04 and ω = 3 and time-dependent parameters for r, p.
We utilized two models in our analysis: the Bayesian sequential model and a Bayesian model assuming constant parameters for r and p.The Bayesian sequential model involved a window size of 35 days, with updates every 14 days.The Bayesian model that assumed constant parameters utilized 35 data points for training, similar to Example 1.
However, it is crucial to highlight that the estimation obtained from the model assuming constant parameters and a fixed-size window was poor.This can be attributed to the fact that the model was trained on a dataset with low counts observed at the beginning of the outbreak while assuming constant values for r and p.Consequently, the predictions made by this model were relatively flat.Please refer to Figure S6 for a visual representation.
On the other hand, by acknowledging that the parameters r and p can change over time and adopting the sequential Bayesian approach, we were able to capture these variations and achieve improved predictive performance.The sequential Bayesian approach adjusts to the changing parameters, making more accurate predictions.

FigsFig S1 .
Fig S1.Density of the prior distribution in the adaptive sequential Bayesian approach for the parameters: deceleration of growth (p), growth rate (r), and initial condition (C 0 ).

Fig S3 .
Fig S3.Forecasting incidence time series.The training set is represented by the gray region, while the blue region indicates the forecasted period.The solid red lines depict the median incidence forecast.The darker-shaded blue region represents the interquartile forecast range, indicating the middle 50% of the predicted values.The lighter-shaded blue region indicates the 5th to 95th percentile range.The observed cases are denoted as individual data points.
Fig S5.Synthetic data generated with the Negative Binomial model (Equation6, main file) with fixed dispersion parameters θ = 0.04 and ω = 3 and time-dependent parameters for r, p.

Figure
Fig S6.The predicted incidence time series is illustrated in the plot.The training set is represented by the gray region, while the blue region indicates the forecasted period.The solid red lines depict the median incidence forecast.The darker-shaded blue region represents the interquartile forecast range, indicating the middle 50% of the predicted values.The lighter-shaded blue region indicates the 5th to 95th percentile range.

Fig S7 .
Fig S7.The predicted incidence time series is generated using the sequential Bayesian approach method with a window size of 35 data points and moving every 14 days.The solid red lines depict the median incidence forecast.The darker-shaded blue region represents the interquartile forecast range, indicating the middle 50% of the predicted values.The lighter-shaded blue region indicates the 5th to 95th percentile range.