Real-time forecasting of data revisions in epidemic surveillance streams

doi:10.1371/journal.pcbi.1013709

Fig 1.

Overview of the Delphi-RF framework.

(A) Preparatory step: early exploration of historical revisions to define the target lag L before model training and forecasting. (B) Data structure in the report date–reference date space. Data above the lag = 0 diagonal correspond to revisions that are available (or will become available). The light blue region shows revisions with lag greater than the target lag, which are excluded from Delphi-RF. The blue parallelogram marks revisions with the target lag available, which are used for training. The green triangular region indicates the most recent revisions, where the target lag is not yet available. (C) Workflow of Delphi-RF: revisions with target lag available are used for training (small lags trained independently, large lags trained in groups). Saved models are then applied to real-time revisions for forecasting. L days later, predictions are evaluated using the Weighted Interval Score (WIS).
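The regions described in panel (B) can be illustrated with a small classifier over (reference date, report date) pairs. This is a minimal sketch, not the Delphi-RF implementation: the function name, the region labels, and the `as_of` argument (the date on which training and forecasting take place) are hypothetical, and the exact boundary conventions are assumptions.

```python
from datetime import date

def classify_revision(reference_date, report_date, as_of, target_lag):
    """Assign one revision to a region of the report date / reference date space.

    Labels are illustrative only:
      "unavailable": report precedes the reference date or is not yet published
      "excluded":    lag exceeds the target lag
      "training":    the target-lag value for this reference date is already observed
      "forecast":    recent revision whose target-lag value is not yet available
    """
    lag = (report_date - reference_date).days
    if lag < 0 or report_date > as_of:
        return "unavailable"
    if lag > target_lag:
        return "excluded"
    # The target is observable once target_lag days have passed since the reference date.
    if (as_of - reference_date).days >= target_lag:
        return "training"
    return "forecast"

today = date(2021, 3, 1)
print(classify_revision(date(2021, 1, 1), date(2021, 1, 5), today, 30))
print(classify_revision(date(2021, 2, 20), date(2021, 2, 25), today, 30))
```

Under these assumptions, the first call falls in the training region and the second in the real-time forecasting region.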

Fig 2.

Data revision patterns for different indicators.

(A) Mean percentage of counts reported relative to the values revised 300 days later, averaged over all reference dates and plotted by reporting lag for Massachusetts. Shaded bands represent the 10th to 90th percentile interval. (B) Mean values of COVID-19-related fractions normalized by their corresponding revised values after 300 days, also averaged over all reference dates and plotted by lag for Massachusetts.
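The quantity plotted in panel (A) can be sketched as follows. This is an illustrative computation under stated assumptions, not the authors' code: the nested-dictionary input format and the function name are hypothetical, and reference dates with a missing or zero final value are simply skipped.

```python
from statistics import mean

def percent_reported_by_lag(revisions, final_lag=300):
    """Mean percentage of the final value reported at each lag.

    revisions: dict mapping reference date -> dict mapping lag (days) -> reported value.
    final_lag: lag whose value is treated as the revised (final) value.
    """
    by_lag = {}
    for series in revisions.values():
        final = series.get(final_lag)
        if not final:  # skip reference dates with a missing or zero final value
            continue
        for lag, value in series.items():
            if lag <= final_lag:
                by_lag.setdefault(lag, []).append(100.0 * value / final)
    # Average over reference dates, separately for each lag.
    return {lag: mean(values) for lag, values in sorted(by_lag.items())}

# Toy data, invented purely for illustration.
toy = {
    "2021-01-01": {0: 50, 1: 80, 300: 100},
    "2021-01-02": {0: 30, 1: 90, 300: 100},
}
print(percent_reported_by_lag(toy))
```

The 10th to 90th percentile bands in the figure would be obtained from the same per-lag lists by taking quantiles instead of the mean.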

Fig 3.

Revision patterns of COVID-19 claims and total claims.

(A) COVID-19 claims in Massachusetts and (B) total claims in Massachusetts, shown as heatmaps of the proportion reported by reference date and reporting lag. (C) COVID-19 claims and (D) total claims across HHS Regions 1 & 2, showing cumulative reporting curves by state. Shaded bands represent the 10th to 90th percentile interval.

Fig 4.

Evaluation of forecasts for counts, aggregated by lag.

(A) Forecasts of finalized confirmed COVID-19 case counts in MA. (B) Forecasts of COVID-19 insurance claims across all states, based on CHNG outpatient insurance claims data. Solid lines indicate the mean WIS, which approximates absolute relative errors between the most recent report and the target, averaged over locations and reference dates for each lag. Shaded areas represent the 10th to 90th percentile interval.
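The WIS used in this evaluation can be illustrated with its standard definition for a predictive median plus a set of central prediction intervals. The sketch below is not the authors' evaluation code: the function names and the dictionary input format are hypothetical, and the scale on which the paper scores forecasts is not stated in this caption.

```python
def interval_score(y, lower, upper, alpha):
    """Interval score of the central (1 - alpha) prediction interval [lower, upper]."""
    score = upper - lower  # width penalty
    if y < lower:
        score += (2.0 / alpha) * (lower - y)  # penalty when the target falls below
    elif y > upper:
        score += (2.0 / alpha) * (y - upper)  # penalty when the target falls above
    return score

def weighted_interval_score(y, median, intervals):
    """Standard WIS for one forecast.

    intervals: dict mapping alpha -> (lower, upper) for each central interval.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(y, lower, upper, alpha)
    return total / (len(intervals) + 0.5)

# With no intervals, the WIS reduces to the absolute error of the median.
print(weighted_interval_score(7.0, 10.0, {}))
```

Averaging this per-forecast score over locations and reference dates, separately for each lag, gives the kind of curve shown by the solid lines.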

Fig 5.

Evaluation of forecasts for fractions, aggregated by lag.

(A) Forecasts of the fraction of COVID-19 insurance claims based on CHNG outpatient insurance claims data. (B) Forecasts of the fraction of positive COVID-19 antigen tests based on Quidel antigen test data. Solid lines represent the mean WIS, which approximates absolute relative errors between the most recent report and the target, averaged over locations and reference dates for each lag. Shaded areas indicate the 10th to 90th percentile interval.

Fig 6.

Evaluation of forecasts for counts, aggregated by reference date.

Top: Forecasts of finalized confirmed COVID-19 case counts in MA. Bottom: Forecasts of COVID-19 insurance claims across all states, based on CHNG outpatient insurance claims data. Solid lines represent the mean WIS at lag 7, averaged over locations for each reference date. Shaded areas indicate the 10th to 90th percentile interval. The accompanying heatmaps display the corresponding target values, with darker shades indicating larger case or claim counts.

Fig 7.

Evaluation of forecasts for fractions, aggregated by reference date.

Top: Forecasts of the fraction of COVID-19 insurance claims based on CHNG outpatient insurance claims data. Bottom: Forecasts of the fraction of positive COVID-19 antigen tests based on Quidel antigen test data. Solid lines represent the mean WIS at lag 7, averaged over locations for each reference date. Shaded areas indicate the 10th to 90th percentile interval. The accompanying heatmaps display the corresponding target values, with darker shades indicating higher fractions.

Fig 8.

Boxplots illustrating the impact of surveillance conditions on forecast accuracy.

Each box displays the 25th, 50th (median), and 75th percentiles of the WIS. (A) Forecasts stratified by the direction of the target surveillance trend: “Up”, “Flat”, or “Down”. (B) Forecasts stratified by the magnitude of the target, categorized as “High”, “Medium”, or “Low”.

Fig 9.

Comparison of count forecast evaluation results with NobBS and Epinowcast.

(A) Forecasts of finalized confirmed COVID-19 case counts in Massachusetts. (B) Forecasts of finalized COVID-19 insurance claim counts across all states based on CHNG outpatient data. (C) Forecasts of dengue fever case counts in Puerto Rico. (D) Forecasts of ILI case counts nationwide. Solid lines represent the mean WIS, which approximates absolute relative errors between the most recent report and the target, averaged over locations and reference dates for each lag. Shaded areas indicate the 10th to 90th percentile interval.

Table 1.

Computing time comparison across methods and datasets.

Computing time required by different methods applied to various datasets, measured per location and per report date. The table presents the mean and standard error of the mean (SEM) for computing time. For daily data, all models are trained and generate forecasts every 30 days for CHNG outpatient insurance claims and every 7 days for MA-DPH COVID-19 confirmed cases. For weekly data, models are trained and generate forecasts on a weekly basis. To ensure a fair comparison, all settings—including maximum delay and training window size—are kept the same across methods.
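The summary statistics reported in the table, the mean and the standard error of the mean, can be computed as in the minimal sketch below. The function name and the example timings are hypothetical; the SEM is taken as the sample standard deviation divided by the square root of the number of runs, which is the usual definition but is assumed rather than confirmed by the caption.

```python
import math
import statistics

def mean_and_sem(times):
    """Return (mean, SEM) for a list of computing times in any consistent unit."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two timings to estimate the SEM")
    # statistics.stdev uses the n - 1 (sample) denominator.
    return statistics.mean(times), statistics.stdev(times) / math.sqrt(n)

# Invented timings for one method, one location, several report dates.
avg, sem = mean_and_sem([1.0, 2.0, 3.0])
print(f"{avg:.2f} ± {sem:.2f}")
```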

Fig 10.

Ablation study of Delphi-RF features for forecasting performance.

Each colored curve represents the performance when a specific feature group is dropped from the model (e.g., day-of-week effect, week-of-month effect, lagged values, revision magnitude, or 7-day average). The black curve shows the baseline model that includes all features. Error bars represent the standard error of the mean WIS. The right-hand y-axis shows the corresponding absolute relative error percentage. Lower values indicate better predictive performance.
