Fig 1.
Recall k is the time series segment length and h is the largest forecast horizon. (a) Three fully observed synthetic time series in the library. (b) Synthetic time series segments of length k + h; the first k time points are in black and the last h time points are in red. (c) A fully observed time series. (d) A time series segment of length k (i.e., the last k observations from the time series in (c)). (e) Compute the distance d_i between the observed time series segment and the first k observations of each synthetic time series segment in the library (i.e., the black points). (f) The point forecast is an aggregation (e.g., average) of the last h observations of the synthetic time series segments (i.e., the red points) with the smallest distances d_i.
Fig 2.
A demonstration of sMOA forecasting during the early weeks of the COVID-19 epidemic.
Black lines correspond to point forecasts; the orange lines correspond to the true observed values. The basic ensemble model of the ForecastHub (‘COVIDhub-4_week_ensemble’) and the basic persistence model (‘COVIDhub-baseline’) forecasts are provided for reference for the dates where forecasts were provided. The third model used for later comparisons, the ‘COVIDHub-trained_ensemble’, does not provide forecasts this early in the COVID-19 epidemic.
Fig 3.
Nominal vs. empirical coverage for sMOA over every state in the US and over four forecast horizons (1w, 2w, 3w, 4w), plotted using a black line.
The dotted line indicates a perfect match between nominal and empirical coverages for reference. Over every forecast made for the data application to COVID-19, nominal and empirical coverages approximately match.
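The empirical coverage plotted in Fig 3 is simply the fraction of truths that land inside the corresponding prediction interval, compared against the interval's nominal level. A minimal sketch (the function name and toy numbers are illustrative assumptions):

```python
import numpy as np

def empirical_coverage(lower, upper, truth):
    """Fraction of observed values falling inside [lower, upper];
    compare against the nominal level (e.g., 0.90 for a 90% PI)."""
    lower, upper, truth = map(np.asarray, (lower, upper, truth))
    return np.mean((truth >= lower) & (truth <= upper))

# Toy example: intervals cover 3 of 4 truths, so empirical coverage is 0.75.
cov = empirical_coverage([0, 0, 0, 0], [10, 10, 10, 10], [5, 12, 3, 9])
```

A well-calibrated forecaster traces the dotted diagonal: a nominal 90% interval should yield an empirical coverage near 0.90.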
Fig 4.
Direct comparisons between models from the ForecastHub and sMOA, using mean MAE (left) and mean WIS (right).
The error comparison between sMOA and a given model from the ForecastHub is only calculated for the dates for which forecasts from the given model were reported. That is, a given point represents the mean error metric for a model from the ForecastHub calculated over every date, state, and forecast horizon available for that model, plotted against the same mean metric calculated using sMOA on these same dates, states, and forecast horizons. Models beneath the diagonal black line were outperformed by sMOA. Four outlier models were removed for ease of visualization.
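The WIS used in Fig 4 is the standard weighted interval score from the ForecastHub evaluation literature (Bracher et al., 2021): a weighted combination of the absolute error of the predictive median and the interval scores of the central prediction intervals. A sketch under that definition (function and parameter names are ours, not the paper's):

```python
def interval_score(y, lower, upper, alpha):
    """Interval score for a central (1 - alpha) prediction interval:
    width plus penalties when the truth y falls outside the interval."""
    score = upper - lower
    if y < lower:
        score += (2 / alpha) * (lower - y)
    elif y > upper:
        score += (2 / alpha) * (y - upper)
    return score

def weighted_interval_score(y, median, intervals):
    """WIS over K intervals given as (alpha, lower, upper) tuples.
    The median term gets weight 1/2; each interval gets weight alpha/2."""
    total = 0.5 * abs(y - median)
    for alpha, lower, upper in intervals:
        total += (alpha / 2) * interval_score(y, lower, upper, alpha)
    return total / (len(intervals) + 0.5)

# Toy example: truth equals the median, one 80% interval of width 4.
wis = weighted_interval_score(10.0, 10.0, [(0.2, 8.0, 12.0)])
```

Lower WIS is better; like MAE, it is averaged here over dates, states, and horizons before plotting.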
Fig 5.
The proportion of all models (black) and best-in-class models (red) sMOA outperforms in MAE (top) and WIS (bottom) if the validation window ranged from August 2020 through the x-axis date.
sMOA outperforms the majority of all models and of best-in-class models if the validation date cut-off falls between October 2020 and March 2023. Directly before October 2020, there was a dip in incident case counts that sMOA failed to forecast accurately, which caused the initially lower performance.