Quantifying uncertainty about future antimicrobial resistance: Comparing structured expert judgment and statistical forecasting methods

The increase of multidrug resistance and resistance to last-line antibiotics is a major global public health threat. Although surveillance programs provide useful current and historical information on the scale of the problem, the future emergence and spread of antibiotic resistance is uncertain, and quantifying this uncertainty is crucial for guiding decisions about investment in antibiotics and resistance control strategies. Mathematical and statistical models capable of projecting future rates are challenged by the paucity of data and the complexity of the emergence and spread of resistance, but experts have relevant knowledge. We use the Classical Model of structured expert judgment to elicit projections with uncertainty bounds of resistance rates through 2026 for nine pathogen-antibiotic pairs in four European countries and empirically validate the assessments against data on a set of calibration questions. The performance-weighted combination of experts in France, Spain, and the United Kingdom projected that resistance for five pairs on the World Health Organization’s priority pathogens list (E. coli and K. pneumoniae resistant to third-generation cephalosporins and carbapenems and MRSA) would remain below 50% in 2026. In Italy, although upper bounds of 90% credible ranges exceed 50% resistance for some pairs, the medians suggest Italy will sustain or improve its current rates. We compare these expert projections to statistical forecasts based on historical data from the European Antimicrobial Resistance Surveillance Network (EARS-Net). Results from the statistical models differ from each other and from the judgmental forecasts in many cases. 
The judgmental forecasts include information from the experts about the impact of current and future shifts in infection control, antibiotic usage, and other factors that cannot be easily captured in statistical forecasts, demonstrating the potential of structured expert judgment as a tool for better understanding the uncertainty about future antibiotic resistance.


Additional methods
Exponential smoothing attaches greater weights to more recent observations. Here we describe each of the exponential smoothing methods we apply. Please note that the notation is the same as in [1].
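Simple exponential smoothing is a weighted average of all historical observations as follows (equations from [1]):

$$\hat{y}_{t+h|t} = \ell_t$$
$$\ell_t = \alpha y_t + (1 - \alpha)\,\ell_{t-1}$$

where $y_t$ is the observation at time $t$, $\hat{y}_{t+h|t}$ is the $h$-step-ahead forecast, $\ell_t$ is the estimate for the level at time $t$, and $0 \le \alpha \le 1$ is the level smoothing parameter.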
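In Holt's linear trend method, we weight both the level and the trend as follows (equations from [1]):

$$\hat{y}_{t+h|t} = \ell_t + h\,b_t$$
$$\ell_t = \alpha y_t + (1 - \alpha)(\ell_{t-1} + b_{t-1})$$
$$b_t = \beta^{*}(\ell_t - \ell_{t-1}) + (1 - \beta^{*})\,b_{t-1}$$

where $\ell_t$ is the estimate for the level at time $t$, $0 \le \alpha \le 1$ is the level smoothing parameter, $b_t$ is the estimate for the trend at time $t$, and $0 \le \beta^{*} \le 1$ is the trend smoothing parameter.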
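Lastly, we introduce a damped trend in Holt's linear method as follows (equations from [1]):

$$\hat{y}_{t+h|t} = \ell_t + (\phi + \phi^{2} + \dots + \phi^{h})\,b_t$$
$$\ell_t = \alpha y_t + (1 - \alpha)(\ell_{t-1} + \phi\,b_{t-1})$$
$$b_t = \beta^{*}(\ell_t - \ell_{t-1}) + (1 - \beta^{*})\,\phi\,b_{t-1}$$

where $0 < \phi < 1$ is the damping parameter.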
The exponential smoothing methods provide point estimate forecasts. For further detail on the prediction intervals and innovations state space models, please see [1], Chapter 7.5.
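To make the recursions concrete, the following is a minimal Python sketch of the three methods described above, assuming the standard component-form updates for the level and trend. The function names, the input series, and the initialisation (level set to the first observation, trend set to the first difference) are illustrative assumptions; they are not the procedure used to fit the models in this paper, where smoothing parameters and initial states would normally be estimated from the data.

```python
from typing import List, Sequence


def ses_forecast(y: Sequence[float], alpha: float, horizon: int = 1) -> List[float]:
    """Simple exponential smoothing: flat forecast equal to the last level."""
    level = y[0]  # assumption: initial level is the first observation
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return [level] * horizon


def damped_holt_forecast(
    y: Sequence[float],
    alpha: float,
    beta: float,
    phi: float = 1.0,
    horizon: int = 1,
) -> List[float]:
    """Holt's linear trend method; phi < 1 damps the trend, phi = 1 gives no damping."""
    if len(y) < 2:
        raise ValueError("need at least two observations to initialise the trend")
    level = y[0]         # assumption: initial level is the first observation
    trend = y[1] - y[0]  # assumption: initial trend is the first difference
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * trend)
    return forecasts


# Made-up series, for illustration only.
series = [10, 12, 14, 16]
flat = ses_forecast(series, alpha=0.5, horizon=2)
linear = damped_holt_forecast(series, alpha=1.0, beta=1.0, phi=1.0, horizon=3)
damped = damped_holt_forecast(series, alpha=1.0, beta=1.0, phi=0.5, horizon=3)
```

With the damping parameter set to 1, the forecasts continue along a straight line; with a value below 1, each additional step adds a smaller increment, so the forecast flattens out at longer horizons.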

Fig E. France: Expert, equal-weight decision maker (EW), and performance-weight decision maker (PW) assessments for the variables of interest.
Boxplots show the median estimate, 50% credible range, and 90% credible range.

Fig F. Italy: Expert, equal-weight decision maker (EW), and performance-weight decision maker (PW) assessments for the variables of interest.
Boxplots show the median estimate, 50% credible range, and 90% credible range.

Fig G. Spain: Expert, equal-weight decision maker (EW), and performance-weight decision maker (PW) assessments for the variables of interest.
Boxplots show the median estimate, 50% credible range, and 90% credible range.

Fig H. United Kingdom: Expert, equal-weight decision maker (EW), and performance-weight decision maker (PW) assessments for the variables of interest.
Boxplots show the median estimate, 50% credible range, and 90% credible range.

Fig I. Expert, equal-weight decision maker (EW), and performance-weight decision maker (PW) assessments for Streptococcus pneumoniae and intermediate susceptibility to penicillins.
Boxplots show the median estimate, 50% credible range, and 90% credible range.

Fig J. Expert, equal-weight decision maker (EW), and performance-weight decision maker (PW) assessments for Neisseria gonorrhoeae resistance to third-generation cephalosporins.
Boxplots show the median estimate, 50% credible range, and 90% credible range.

Fig K. Expert, equal-weight decision maker (EW), and performance-weight decision maker (PW) assessments for pan-drug resistant Pseudomonas aeruginosa.
Boxplots show the median estimate, 50% credible range, and 90% credible range.

Fig L. Expert, equal-weight decision maker (EW), and performance-weight decision maker (PW) assessments for items concerning resistance rates at non-invasive sites in 2021.
Boxplots show the median estimate, 50% credible range, and 90% credible range. SST = skin and soft tissue.

Additional statistical forecast results
In addition to the statistical forecasting models presented in the paper, we considered results from three additional forecasting models. First, because combining forecasts from different methods often improves accuracy [5], we averaged the ARIMA and exponential smoothing models (Fig Q). Second, we created an ARIMA model that bounds resistance so that it cannot exceed 60% (Fig R), reflecting experts' belief that resistance rates are unlikely to reach 100% because clinicians would adjust prescribing behaviour, or other interventions would be undertaken, before resistance reached that level. Results from this model do not differ greatly from those of the standard ARIMA model (Fig P), aside from the lower maximum value. Finally, we created an exponential smoothing model without the logit transformation (Fig S). This model is equivalent to a linear extrapolation of the historical trend. The resulting projected resistance rates are below 0% or above 100% for some combinations, demonstrating the need to transform or bound the forecast. The prediction intervals from this model are typically narrower than those from the other statistical forecasts.
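The effect of bounding can be illustrated with a short Python sketch. It assumes the bound is imposed through a scaled logit transformation (forecast on the transformed scale, then back-transform) and that forecasts are combined by a simple equal-weight mean; the text above does not specify either implementation detail, so both are assumptions. The function names and the resistance proportions are hypothetical and chosen only to show the behaviour.

```python
import math
from typing import List, Sequence


def scaled_logit(p: float, lower: float = 0.0, upper: float = 1.0) -> float:
    """Map a proportion in (lower, upper) to the whole real line."""
    return math.log((p - lower) / (upper - p))


def inv_scaled_logit(z: float, lower: float = 0.0, upper: float = 1.0) -> float:
    """Back-transform; the result always lies strictly between lower and upper."""
    return lower + (upper - lower) / (1.0 + math.exp(-z))


def linear_extrapolate(values: Sequence[float], horizon: int) -> List[float]:
    """Extend a series along the straight line through its first and last points."""
    slope = (values[-1] - values[0]) / (len(values) - 1)
    return [values[-1] + slope * h for h in range(1, horizon + 1)]


def average_forecasts(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Equal-weight combination of two forecasts (assumed weighting)."""
    return [(x + y) / 2 for x, y in zip(a, b)]


# Made-up resistance proportions, for illustration only.
history = [0.30, 0.38, 0.46, 0.54]

# Extrapolating on the raw scale can leave the valid range.
raw = linear_extrapolate(history, horizon=8)

# Extrapolating on the scaled-logit scale keeps forecasts below the upper bound.
transformed = [scaled_logit(p, upper=0.6) for p in history]
bounded = [inv_scaled_logit(z, upper=0.6) for z in linear_extrapolate(transformed, horizon=8)]

combined = average_forecasts(raw, bounded)
```

In this toy series the raw extrapolation passes 100% within eight steps, whereas the back-transformed forecasts approach but never reach the 60% ceiling.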