“Back to the future” projections for COVID-19 surges

We argue that information from countries who had earlier COVID-19 surges can be used to inform another country’s current model, then generating what we call back-to-the-future (BTF) projections. We show that these projections can be used to accurately predict future COVID-19 surges prior to an inflection point of the daily infection curve. We show, across 12 different countries from all populated continents around the world, that our method can often predict future surges in scenarios where the traditional approaches would always predict no future surges. However, as expected, BTF projections cannot accurately predict a surge due to the emergence of a new variant. To generate BTF projections, we make use of a matching scheme for asynchronous time series combined with a response coaching SIR model.


Introduction
"The past is prologue" (Shakespeare)."The best predictor of future behavior is past behavior" (Twain)."The best way to predict the future is to study the past or prognosticate" (Kiyosaki).These are all famous quotes which, when applied to important prediction or projection problems (projection being prediction into the future), suggest that a careful understanding of past events is essential to predicting future trends.
Yet, when applied to the problem of projecting a new surge of COVID-19 infections in India, back in mid February 2021, these strategies did not work.India had seen a arXiv:2202.08928v1[q-bio.PE] 17 Feb 2022 where, at time t, S is the susceptible population, I is the number of infectious, R is the number removed either by death or recovery, and N is the sum of these three: The parameters β and γ are the transmission and recovery rates, respectively.From (1), Also from (1), dividing the first equation by the third, and integrating with respect to S and R, S(t) = S(0)e −R0(R(t)−R(0))/N , where R 0 is the basic reproduction number given by R 0 = β/γ.At the outset of an epidemic, when S ≈ N , infection numbers begin to surge as R 0 1.Subsequent surges are characterized by the ratio N/S.When R 0 > N/S, infection numbers rise more rapidly, hit a peak when R 0 = N/S, and then decline as R 0 < N/S.
When assumed purely mechanistic, numerical methods such as Euler discretization or the Runge-Kutta approximation method (Butcher, 2016)

Time series ARIMA models
Time series models have also been exploited for modeling epidemic data trends (Alabdulrazzaq et al., 2021;Song et al., 2016).Using new notation, we will let the daily infection counts be Y t and defined , where d is the number of differences needed to make the series stationary, then the ARIMA(p, d, q) model (Box and Jenkins 1976) has the form where , p is the autoregressive order, q the moving average order, and L is known as the backshift operator.The random variable t is white noise assumed to follow a normal distribution.The α 1 , . . ., α p are the autoregressive parameters, and the θ 1 , . . ., θ q are the moving average parameters, both sets to be estimated by maximum likelihood.The order of the ARIMA(p, d, q) model is typically chosen using a model selection criterion, like BIC (Schwarz, 1978) or AIC (Akaike, 1974), among other methods.

Curve fitting
Curve fitting essentially amounts to deriving a functional relationship between Y t and t such that the estimated curve matches the observed daily infection count trend as closely as possible.This approach is generally considered less tied to underlying assumptions about features within the population that might be driving the daily infection numbers.However, the drawback is that it's not a mechanistic approach and thus may not do as well with longer term forecasts.Some examples include the generalized logistic model (Aviv-Sharon and Aharoni, 2020) and the generalized Gaussian cdf (Ciufolini and Paolozzi, 2020), both adopted by the Institute of Heath Metrics and Evaluation (IHME).These models have been extended to allow for incorporation of covariates that can connect different locations together (https://ihmeuw-msca.github.io/CurveFit/methods/).

Surge prediction and the sensitivity to point of inflection
The focus in this paper is to predict future COVID-19 surges before an inflection point for the surge itself -in other words, on the downward trajectory of the previous surge or in a valley before the future surge.
Projections before and after an inflection point can be markedly different.To illustrate this point, consider SIR models fit to daily case count data for the United Kingdom, as shown in Figure 1.The red arrow on the plot indicates a point of inflection before the start of the third surge around November, 2021.Let's suppose this is the surge we are trying to predict.The green curve is an SIR model fit to data prior to the inflection point that is been projected forward past the inflection point in Figure 1.Notice how it's descending to zero.Now assume we wait some days to make the projection for the third surge.The projected curve from such an SIR model would look like the blue curve in the figure.As expected, it is rising upwards towards the observed peak daily count.
This looks to be much more accurate.These types of projections are of less public health planning value, since with highly contagious viruses like the omicron variant of COVID-19, with an R 0 number estimated to be near 10, it is nearly impossible to blunt the surge after the point of inflection because one is always "running behind" the virus.
Our proposed methodology seeks to do better than the green curve based on the same observed data.

Back to the future projections
We now restrict our attention to one particular sequence -the daily infection counts over time.Our main interest is to project a future oncoming next surge during the downward trajectory of the current surge but prior to an inflection point in the curve that might indicate the start of a new surge.As just shown, standard approaches will have all projected curves descending down towards zero daily counts.
To improve naive projections, we exploit the very nature of a pandemic -the fact Once the best match ( Bm ) amongst the M countries has been established, an SIR model is fit to the observed daily infection counts forward in time for Bm .This data is actually observed, which is a key fact.Relevant SIR curve parameter estimates are then passed to country A, using A's current initial conditions, to make not yet observed projections forward in time.This is a type of statistical coaching.It's useful to impose a short "washout" period to allow the current surge to come to completion.We call this Back to the Future (BTF) projections.The steps can be summarized in the algorithm: Back to the Future Projection Algorithm Suppose we want to project country A after A's i-th surge.
(a) Select candidate countries such that i -its i-th surge happens before A's i-th surge, ii -it has sufficient data for i-th surge to match with A's i-th surge (same length interval), Denote these candidate countries as {B k } K k=1 .
(b) Fit ARIMA models for the i-th surge of A and B k .
Smooth these fits using cubic smoothing splines, with the degree of smoothing determined by leave-one-out cross-validation.
Denote the fitted models by Â and Bk .(f) Generate the projection using this new SIR model.

Sensitivity analysis
To analyze the sensitivity of the SIR model, we jitter the two parameters β and γ for a small amount δ.In practice, we sample the parameter pair by a uniform distribution on

Performance on a selection of countries
A BTF analysis was carried out for a selection of 12 countries from all 6 populated continents around the world.Thus the performance of our methodology was examined regardless of the regional variation that might exist from continent to continent.In particular, the chosen countries experienced second surges during our time window of analysis and the goal was to accurately forecast second surges from a lagged time point towards the end of their first surges (i.e.before the inflection point of the daily infection curve happened, indicating the start of a potential second surge).This would be a truly honest projection and would more clearly demonstrate the utility of the BTF methodology.Usual forecasting with compartment models, ARIMA models, or curve fitting, would all indicate the projected curves continue downwards, given that the projections were made from a point in time on the downward trajectory of the first surges.As a negative control, we also included Australia where no second surge was detected during the analysis time window.
Figures 2-4 show four panels each with each panel depicting the following: i) a observed daily infection curve; ii) BTF projected curves (solid blue curve) with sensitivity bands (darker blue shaded); iii) standard SIR projected curved (red) with sensitivity band (red shaded); iv) standard ARIMA(p, d, q) forecast (green curve) with 95% prediction interval (green shaded) and v) generalized logistic growth curve model with 95% bootstrap prediction intervals (purple curve and shaded regions).The time window of each surge of interest are the blue rectangular regions.Underneath each country's plot is the matching table from which the coaching country's curve was derived.
Making these kinds of projections is clearly a very challenging task and represent a type of aspirational goal (see Rosenfeld and Tibshirani (2021)).Thus, judging the accuracy of the BTF projections must be calibrated appropriately.For point estimatebased predictions, one can use absolute error; for interval-based predictions, the weighted integrated score is an option (Rosenfeld and Tibshirani, 2021).Iran, where matched fits were poorly determined.For Israel, the BTF absolute error curve looks worse later in the shaded time window than earlier on; this corresponds to the projected peak for the surge being shifted too far to the right.A similar "flip" in absolute error curves occurs for Italy.For Australia, no differences were found but this country was the negative control.
AE curves, while useful, do not convey other important information regarding surge projections.For instance, it is of particular interest whether a surge projection accurately estimated peak height (within plus or minus 10 days) and/or location (within plus or minus 10 days).Table 5 breaks this down for our analysis.It indicates that for 5 countries we did indeed achieve peak height match, and for 8 countries we achieved peak location match.Contrast this to SIR and ARIMA models which worked only for Australia, the negative control.
We also found an interesting result regarding India's projected surge.The second surge corresponded to the emergence of the delta variant of COVID, which produced a peak height of over 400K daily infections.Our projected estimate was only around 50K.However, it has also been estimated that at surge peak, fully 90% of the daily infection counts were attributable to the delta variant (https://clingen.igib.res. in/covid19genomes/).This means that 10% came from other existing variants found in other countries.Hence our projected peak approximates this number very accurately.
We actually would not expect to project a peak for a new variant accurately using BTF, since the method cannot accommodate new variants as currently formulated.

Justification for the matching
One of the features of a pandemic is that surges and recessions happen asynchronously across different countries.We are relying on the fact that finding a best matched country by time-shifting to create overlayed earlier surges will in fact provide useful information to coach future surge projections of interest.Thus it is necessary to say something regarding the optimality of this type of matching.
The correlation between asynchronous time series has been examined in what is termed lead-lag relationships between different financial markets (de Jong and Nijman, 1997).For example, a link has been established between index futures and the cash market where the futures market tends to lead the cash market (see for instance Stoll and Whaley (1990)).The analysis of information flows between markets on short varying time intervals is an active area of research.In de Jong and Nijman (1997), they developed a method for estimating correlations from irregularly spaced transactions data.
For two stationary ARIMA processes, we can test for the presence of cross-correlation functions between the two asynchronous series.However, this approach is sensitive to the choice of lag length and cannot tell the directionality of causality, only the presence or absence of it.In addition, the statistic lacks power, as compared to regression-based tests discussed next.
One more clear way forward is to conduct a direct test for Granger causality (Eichler, 2013), by regressing each variable on lagged values of itself and the other.This can be written as where t is white noise and t = t − l m , with l m the lag between S 1 and S 1,m .Then Granger causality between the two asynchronous time series can be assessed by testing whether the κ k = 0 or not, using an F-test based on comparing nested residual sum of squares.
estimated using an SIR model.Remember that the interval corresponding to S 2,m is lagged with respect to the interval corresponding to S 2 .
Inspired by the response coaching idea of Tibshirani and Hinton (1998), we can write where δ 2,m is a coaching parameter vector specific to S 2,m and shared with A during its S 2 .Thus the prediction of Y t for t ∈ S 2 can be coached by country Bm via the shared parameter vector δ 2,m and estimated by where δ2,m is estimated from the fit of Bm in t ∈ S 2,m and using the population characteristics of A during S 2 .
The fact that S 1 and S 1,m are not the same, and that S 2 and S 2,m are not the same, but that we are expecting Bm to be informative regarding S 2 implies a periodic property of the pandemic across matched countries in time.That is, Bm represents a country that experienced a very similar first surge and thus there is information to be gleaned about predicting Y by learning from Bm 's experience during their (earlier) second surge.

Discussion
Making surge projections before the next surge begins is frankly necessary, given the highly infectious nature of many of the COVID-19 variants.Waiting until after an inflection point will simply mean that one is always playing catch up against the virus.
Our methodology attempts to do exactly this by employing a matching scheme to other candidate countries and then appealing to Granger causality in order to borrow from that matched country's observed ensuing daily case counts.As we have discussed, once matching has occurred, the BTF projections themselves use a form of response coaching which can reduce variance over a non-coached model (Tibshirani and Hinton 1998).
It should be emphasized that our BTF projections cannot work well when a new variant emerges for the first time and is responsible as the driver of a new surge.There

Fig. 1 .
Fig. 1.Sensitivity of the UK SIR model projections to the inflection point (red arrow) of daily infections curve.
(c) From {B k } K k=1 , select the country most similar to A by Bm ≡ arg min Bk median Âsd − Bsd k , where Âsd , Bsd k be the standardization of Â, Bk by its maximum.(d) Fit an SIR model to Bm after its i-th surge (a 10-day gap may be introduced to washout the effect of the i-th surge).(e) Pass the estimated parameters β( Bm ), γ( Bm ) by the SIR model of Bm to the SIR model with A's initial conditions.

8
Daniel Andr és Díaz-Pach ón [β − δ, β + δ] × [γ − δ, γ + δ] (in our cases δ is chosen as 0.01).Then run the SIR model by each pair of these parameters.We can shade the union of these individual runs.3.2.Why coaching using a compartment model?The mechanistic nature of the systematic component of the compartment model provides a more parsimonious representation of country Bm over a longer period of time.It also allows a clear path to incorporating country A's specific characteristics.This helps to anchor the BTF projections and to generate more accurate projected trends over longer windows of time, rather than purely generating accurate short-term projections which are of limited public health benefit.Contrast this to coaching using an ARIMA model instead from country Bm .Since only lagged effects can be modeled in the ARIMA model, country Bm 's shape of their next surge (after the matching one), will not fully inform country A forecasts of interest.4.DataCOVID-19 infection volume time series came from the Johns Hopkins University CSSE COVID-19 Tracking Project and Dashboard which when the data was pulled ranged from 01/22/2020 until 04/12/2021 (correct dates here).

Figures 5 -
Figures 5-7 show the absolute errors (AE) over time for BTF (blue curve) versus using the naive SIR model (red curve), ARIMA model (green curve) and generalized logistic growth curve (purple) projected forward from the same point in time.Once again, the light blue shaded rectangular regions correspond to the time windows of the future surge of interest.Lower values of absolute error indicate a better fit to the actually observed future data.In 9/12 circumstances the BTF's projections dominate naive model projections.The exceptions being Japan and Germany, where very large peaked surges were projected and much smaller peaked surge actually emerged; and

Fig. 2 .
Fig. 2. Projection curves for Australia, Colombia, Germany, and India, using BFT projections.Projected blue curve and region of projection (before inflection point) of next surge in shaded blue.Note that a 10 day washout period is forced before projections start.Matching country ranking tables shown underneath each plot.

Fig. 3 .
Fig. 3. Projection curves and matching country ranking tables for Iran, Israel, Italy, and Japan, using BFT projections.

Fig. 4 .Fig. 6 .
Fig. 4. Projection curves and matching country ranking tables for Singapore, South Africa, United Kingdom, and United States, using BFT projections.

Table 1 .
Countries where BTF projections matched surge peak height and location.The * for India under the peak height match col-