Fig 1.
Evaluation of the resistance of accuracy measures to a single forecasting outlier.
A: Synthetic time series data where Yt is the target series and the other series are forecasts. The forecasts differ only at the observation Y8. B: Results of the single forecasting outlier evaluation, which show that UMBRAE is less sensitive than the other measures to a single forecasting outlier.
Fig 2.
Evaluation of the symmetry of accuracy measures with respect to over-estimates and under-estimates.
A: Synthetic time series data where Yt is the target series and the other two series are forecasts. One forecast over-estimates every observation of Yt by 10%, while the other under-estimates every observation by 10%. B: Results of the symmetry evaluation, which show that UMBRAE and all other accuracy measures except sMAPE are symmetric.
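The asymmetry of sMAPE described in the Fig 2 caption can be illustrated with a minimal sketch. The series values below are hypothetical, and the sMAPE form used here (mean of 2|Y - F| / (|Y| + |F|), in percent) is an assumption, since several variants of sMAPE are in use. The function names `mape` and `smape` are illustrative only.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)


def smape(actual, forecast):
    """Symmetric MAPE in percent (assumed variant with the factor of 2)."""
    terms = [2.0 * abs(a - f) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical target series; any positive values give the same percentages.
actual = [100.0, 120.0, 90.0, 110.0]
over = [1.1 * a for a in actual]   # 10% over-estimate at every observation
under = [0.9 * a for a in actual]  # 10% under-estimate at every observation

mape_over, mape_under = mape(actual, over), mape(actual, under)
smape_over, smape_under = smape(actual, over), smape(actual, under)

print(f"MAPE : over={mape_over:.3f}  under={mape_under:.3f}")
print(f"sMAPE: over={smape_over:.3f}  under={smape_under:.3f}")
```

Under these assumptions MAPE gives the same value for both forecasts, while sMAPE penalises the under-estimate more, because the denominator |Y| + |F| shrinks when the forecast falls below the target.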
Fig 3.
Evaluation of the scale dependency of accuracy measures.
A: Synthetic time series data where Yt is the target series and the other two series are forecasts. The two forecasts have the same mean absolute error, but their errors are on different percentage scales relative to the corresponding values of Yt. B: Results of the scale dependency evaluation, where MAE, RMSE, MASE and even GMRAE show no difference between the two forecasts. MRAE and MAPE produce substantially different errors for the two cases. sMAPE and UMBRAE can reasonably distinguish the two forecasts.
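The situation in the Fig 3 caption, equal mean absolute error but different percentage scales, can be sketched as follows. The numbers are hypothetical and are not the synthetic data from the figure. The names `forecast_small` and `forecast_large` are illustrative only.

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical target series with both small and large observations.
actual = [10.0, 20.0, 100.0, 200.0]

# Both forecasts miss by 5 units on exactly two observations.
forecast_small = [15.0, 25.0, 100.0, 200.0]  # errors fall on the small values
forecast_large = [10.0, 20.0, 105.0, 205.0]  # errors fall on the large values

mae_small, mae_large = mae(actual, forecast_small), mae(actual, forecast_large)
mape_small, mape_large = mape(actual, forecast_small), mape(actual, forecast_large)

print(f"MAE : small={mae_small:.3f}  large={mae_large:.3f}")
print(f"MAPE: small={mape_small:.3f}  large={mape_large:.3f}")
```

In this sketch MAE cannot separate the two forecasts, whereas MAPE differs by a factor of ten, which mirrors the contrast between scale-dependent and percentage-based measures that the caption describes.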
Table 1.
Results on M3-Competition data at the first six forecasting horizons.
Table 2.
Spearman’s rank correlation coefficient of the rankings in Table 1.
Table 3.
Results with a 3% trimming level on M3-Competition data at the first six forecasting horizons.
Table 4.
Spearman’s rank correlation coefficient of the rankings in Table 3.
Fig 4.
Box-and-whisker plot and kernel density estimates for the absolute errors used by MAE.
Fig 5.
Box-and-whisker plot and kernel density estimates for the squared errors used by RMSE.
Fig 6.
Box-and-whisker plot and kernel density estimates for the absolute scaled errors used by MASE.
Fig 7.
Box-and-whisker plot and kernel density estimates for the absolute scaled errors used by AvgRelMAE (log-scale).
Fig 8.
Box-and-whisker plot and kernel density estimates for the relative absolute errors used by MRAE and GMRAE (log-scale, forecasts with zero or undefined error excluded).
Fig 9.
Box-and-whisker plot and kernel density estimates for the absolute percentage errors used by MAPE.
Fig 10.
Box-and-whisker plot and kernel density estimates for the scaled percentage errors used by sMAPE.
Fig 11.
Box-and-whisker plot and kernel density estimates for the bounded relative absolute errors used by UMBRAE (using the naïve errors as the benchmark).
Table 5.
Ratings of accuracy measures.