Fundamental limits on inferring epidemic resurgence in real time

We find that epidemic resurgence, defined as an upswing in the effective reproduction number (R) of the contagion from subcritical to supercritical values, is fundamentally difficult to detect in real time. Intrinsic latencies in pathogen transmission, coupled with the often smaller incidence across periods of subcritical spread, mean that resurgence cannot be reliably detected without significant delays, even if case reporting is perfect. This contrasts with epidemic suppression (where R falls from supercritical to subcritical values), which can be ascertained 5-10 times more rapidly. These innate limits on detecting resurgence only worsen when spatial or demographic heterogeneities are incorporated. Consequently, we argue that resurgence is more effectively handled proactively, at the expense of false alarms. Responses to recrudescent infections or emerging variants of concern will more likely be timely if informed by improved syndromic surveillance systems than by optimised mathematical models of epidemic spread.

series of incident cases necessitates assumptions about the difference between meaningful variation (signal) and random fluctuations (noise) [10][11][12]. Modern approaches to epidemic modelling and monitoring aim to maximise this signal-to-noise ratio either by enhancing noise filtering and bias correction methods [13][14][15], or by amplifying signal fidelity through improved surveillance quality and diversity [16][17][18]. While both approaches have substantially advanced the field, there have been few attempts to explore what, if any, fundamental limits exist on the timely detection of these changes. Such limits can provide key benchmarks for assessing the effectiveness of modelling or data collection, deepen our understanding of what can and cannot be achieved by real-time outbreak response programmes, ensure that model outputs are not overinterpreted, and help redirect surveillance resources more efficiently [19][20][21].
While some studies have examined intrinsic bounds on epidemic monitoring and forecasting [22][23][24][25], work on transmissibility has mostly probed how extrinsic surveillance biases might cause R misestimation [14][26][27][28]. Here we address these gaps in the literature by characterising and exposing fundamental limits to detecting resurgence and control from a perfectly ascertained incidence time series. This provides vital insight into the best real-time performance possible and blueprints for how outbreak preparedness might be improved. We analyse a predominant real-time epidemic model [1,2] and discover stark asymmetries in our innate ability to detect resurgence and control. While epidemic control or suppression change-points are inferred robustly and rapidly, inherent delays (5-10 times those for control) strongly inhibit real-time resurgence estimation from widely used incidence curves or data.
We show that these fundamental constraints on resurgence worsen with smaller epidemic sizes, steeper upswings in R and spatial or demographic heterogeneities. Given this bottleneck to timely outbreak analysis, which exists despite perfect case reporting and the use of optimal Bayesian detection algorithms [15,29], we argue that methodological improvements to existing models used to analyse epidemic curves (e.g., cases, hospitalisations or deaths) are less important than enhancing syndromic surveillance systems [30,31]. Such systems, which fuse multiple data sources (including novel ones, e.g., wastewater [32]) to triangulate possible resurgences, might minimise some of these fundamental limitations. We conclude that early responses to suspected resurging epidemics, at the expense of false alarms, might be justified in many settings, both on the basis of our analysis and of the consensus that lags in implementing interventions can translate into severely elevated epidemic burden [33][34][35][36]. Using both theory and simulation, we explore and elucidate these conclusions in the next section.
We first provide intuition for why resurgence and control might present asymmetric difficulties when inferring transmissibility in real time. Consider an epidemic modelled using a renewal branching process [37] over times 1 ≤ s ≤ t (usually in days). Such models have been widely applied to infer the transmissibility of numerous diseases including Ebola virus, COVID-19 and pandemic influenza. Renewal models postulate that the incidence of new cases at time s, denoted I_s, depends on the effective reproduction number, R_s, and the past incidence, I_1^{s-1}, as in Eq. (1) [2]. Here X_a^b denotes the set {X_a, X_{a+1}, ..., X_b} and ≡ indicates equality in distribution.

I_s ≡ Pois(R_s Λ_s),  with Λ_s = Σ_{u=1}^{s-1} w_u I_{s-u}.  (1)

In Eq. (1), Λ_s is the total infectiousness and w_u is the generation time distribution of the disease. The posterior distribution P(R_s | I_1^s) only uses data up until time s and defines our real-time estimate of R at that time. We can analyse its properties (and the related likelihood function P(I_1^s | R_s)) to obtain the Fisher information (FI) on the left side of Eq. (2). This FI (see [10,39] for derivation) captures how informative I_1^s is for inferring R_s, with its inverse defining the smallest asymptotic variance of any R_s estimate [10,40]. Larger FI implies better statistical precision.
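To make the renewal model of Eq. (1) concrete, the following minimal Python sketch simulates an incidence curve under a prescribed R trajectory. The generation time kernel `w`, the seed of 10 cases and the piecewise-constant R profile (control followed by resurgence) are illustrative assumptions, not values used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def renewal_simulate(R, w, I0=10):
    """Simulate incidence I_s ~ Pois(R_s * Lambda_s), where
    Lambda_s = sum_u w_u * I_{s-u} is the total infectiousness."""
    t = len(R)
    I = np.zeros(t, dtype=int)
    I[0] = I0
    for s in range(1, t):
        # past cases reversed so that w[0] weights the most recent day
        past = I[max(0, s - len(w)):s][::-1]
        Lam = np.dot(w[:len(past)], past)  # total infectiousness at s
        I[s] = rng.poisson(R[s] * Lam)
    return I

# Illustrative bell-shaped generation time kernel over 14 days (assumed)
w = np.exp(-0.5 * ((np.arange(1, 15) - 5.0) / 2.5) ** 2)
w /= w.sum()

# R falls from supercritical to subcritical values, then resurges
R = np.concatenate([np.full(60, 1.3), np.full(60, 0.7), np.full(60, 1.4)])
I = renewal_simulate(R, w)
```

Averaging many such realisations is how the simulation studies referenced later in the text are typically constructed.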

E[FI(R_s)] = Λ_{τ(s)} R_s^{-1},  with Λ_{τ(s)} = Σ_{u ∈ τ(s)} Λ_u.  (2)

Here τ(s) is a look-back window of recent time points over which R_s is assumed constant, a standard assumption for gaining analytic insight [1,2].
As resurgence will likely follow periods of low incidence, we might expect Λ_{τ(s)} to be small while R_s rises. Both effects reduce the FI in Eq. (2), making these changes harder to detect. In contrast, the impact of interventions will be easier to infer since these are often applied when cases are larger, and they reduce R_s. We expand on this intuition by using the R posterior distribution to derive the real-time resurgence probability P(R_s > 1 | I_1^s) = ∫_1^∞ P(R_s | I_1^s) dR_s and examining its sensitivity to fluctuations in incidence.

Figure 1 illustrates these ideas. Panel A shows that the posterior distributions for R_s overlap less as R_s increases (from blue to red), and that the degree of separation, and hence our ability to uncover meaningful incidence fluctuations from noise, improves with the current epidemic size, Λ_{τ(s)}. Panel B shows how this sensitivity modulates our capacity to infer resurgence (P(R_s > 1 | I_1^s)) and control (P(R_s ≤ 1 | I_1^s) = 1 − P(R_s > 1 | I_1^s)). When epidemic size is smaller (as is likely before resurgence), we need to observe larger relative changes in epidemic size (ΔΛ_{τ(s)}) for some increase in P(R_s > 1 | I_1^s) than for an equivalent decrease when aiming to detect control (where Λ_{τ(s)} would be larger); the curves have gentler gradients. Resurgence (likely closer to the blue line, top right quadrant) is appreciably and innately harder to detect than control (likely closer to the red line, bottom left quadrant).
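The dependence of the resurgence probability on epidemic size can be illustrated numerically. The sketch below uses the standard gamma-conjugate formulation for window-based Poisson renewal estimation; the prior Gamma(a, c) and the specific case counts are assumptions chosen for illustration.

```python
from scipy.stats import gamma

def resurgence_prob(i_win, lam_win, a=1.0, c=5.0):
    """P(R_s > 1 | window data) under the conjugate gamma posterior of a
    Poisson renewal model: shape a + i_win, scale 1/(1/c + lam_win), where
    i_win sums cases and lam_win sums total infectiousness over the window
    tau(s). The Gamma(a, c) prior is an illustrative assumption."""
    shape = a + i_win
    scale = 1.0 / (1.0 / c + lam_win)
    return gamma.sf(1.0, a=shape, scale=scale)

# The same relative upswing (R around 1.2), at small vs large epidemic size:
small = resurgence_prob(i_win=12, lam_win=10)     # 12 cases from Lambda = 10
large = resurgence_prob(i_win=1200, lam_win=1000) # 1200 cases from Lambda = 1000
# The larger epidemic resolves the identical upswing far more decisively.
assert large > small
```

This mirrors the FI argument: with a small Λ_{τ(s)}, the posterior is too diffuse for the same relative change in incidence to push P(R_s > 1) towards certainty.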

Fundamental delays on detecting resurgence but not control
This preprint (doi: https://doi.org/10.1101/2021.09.08.21263270; this version posted September 14, 2021) has not been certified by peer review and is made available under a CC-BY-NC-ND 4.0 International license; the copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The intrinsic asymmetry in sensitivity to upward versus downward shifts in R (see Figure 1) implies that it is not equally simple to infer resurgence and control from incident cases. We investigate the ramifications of this observation by comparing our real-time R_s estimates to ones exploiting all the future incidence information available. We analyse two foundational posterior distributions, the filtered, p_s, and smoothed, q_s, distributions, defined in Eq. (3).

p_s ≜ P(R_s | I_1^s),  q_s ≜ P(R_s | I_1^t).  (3)

Here p_s considers information up until time s and captures changes in R_s from I_1^s in real time. In contrast, q_s extracts all the information from the full incidence curve I_1^t, providing the best possible (in mean squared error) R_s estimate [29]. The differential between p_s and q_s, summarised via the Kullback-Leibler divergence, D(q_s || p_s), measures the value of this future information.
Bayesian filtering and smoothing are central formalisms across engineering, where real-time inference and detection problems are common [29,41]. We compute the quantities in Eq. (3) via the EpiFilter package [15,28], which employs optimal forward-backward algorithms. This method improves on the window-based (τ(s)) formulation of the last section and maximises the signal-to-noise ratio in estimation. We also obtain the filtered and smoothed probabilities of resurgence, P(R_s > 1 | I_1^s) and P(R_s > 1 | I_1^t). The probability that the epidemic is controlled (i.e., R ≤ 1) is the complement of these expressions. Our main results, which average the above quantities over many simulated Ebola virus and COVID-19 epidemics, are given in Figure 2 and Figure 5 (appendix), respectively. We uncover striking differences in the intrinsic ability to infer resurgence versus control in real time.
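The forward-backward idea can be sketched with a minimal grid-based filter and smoother. This is an illustration of the general formalism, not EpiFilter's actual implementation: the Gaussian random-walk step size `eta`, the R grid and the synthetic data are all assumptions.

```python
import numpy as np
from scipy.stats import norm, poisson

def filter_smooth(I, Lam, grid, eta=0.1):
    """Grid-based Bayesian filtering/smoothing of R_s (a sketch of the
    forward-backward idea). State model: Gaussian random walk on R with
    step sd eta*sqrt(R); observation model: I_s ~ Pois(R_s * Lam_s)."""
    m, t = len(grid), len(I)
    # transition matrix P[i, j] = P(R = grid[j] at s | R = grid[i] at s-1)
    P = norm.pdf(grid[None, :], loc=grid[:, None],
                 scale=eta * np.sqrt(np.maximum(grid[:, None], 1e-3)))
    P /= P.sum(axis=1, keepdims=True)
    filt = np.zeros((t, m)); filt[0] = 1.0 / m  # uniform start
    for s in range(1, t):                        # forward (filtering) pass
        pred = filt[s - 1] @ P
        filt[s] = pred * poisson.pmf(I[s], grid * Lam[s])
        filt[s] /= filt[s].sum()
    smth = np.zeros_like(filt); smth[-1] = filt[-1]
    for s in range(t - 2, -1, -1):               # backward (smoothing) pass
        pred = filt[s] @ P
        smth[s] = filt[s] * ((smth[s + 1] / np.maximum(pred, 1e-300)) @ P.T)
        smth[s] /= smth[s].sum()
    return filt, smth

grid = np.linspace(0.05, 4.0, 80)                       # assumed R grid
I = np.array([0, 8, 10, 6, 4, 9, 14, 22])               # synthetic cases
Lam = np.array([10.0, 10, 9, 8, 7, 7, 9, 12])           # total infectiousness
filt, smth = filter_smooth(I, Lam, grid)

# Resurgence probabilities and the value of future information
p_filt = (filt * (grid > 1)).sum(axis=1)   # P(R_s > 1 | I_1^s)
p_smth = (smth * (grid > 1)).sum(axis=1)   # P(R_s > 1 | I_1^t)
kl = (smth * np.log(np.maximum(smth, 1e-300)
                    / np.maximum(filt, 1e-300))).sum(axis=1)
```

The divergence `kl` is zero at the final time (no future data remains) and quantifies, at each earlier s, how much the full curve would have sharpened the real-time estimate.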
Upward change-points are significantly harder to detect both in terms of accuracy and timing.
Discrepancies between p_s- and q_s-based estimates (the latter benchmark the best realisable performance) are appreciably larger for resurgence than for control. While decreases in R can be pinpointed reliably, increases seem fundamentally more difficult to detect. These limits appear to worsen with the steepness of the R upswing. We confirm these trends with a detailed simulation study across five infectious diseases in Figure 3. There we alter the steepness of transmissibility changes and map delays in detecting resurgence and control as a function of the difference in the first time that p_s- and q_s-based probabilities cross 0.5 (Δ₅₀) and 0.95 (Δ₉₅), normalised by the mean generation time of the disease. We find that lags in detecting resurgence are at least 5-10 times longer than those for detecting control.

In Figure 2 (simulated Ebola virus epidemics, estimated via EpiFilter [15]), middle panels average the Kullback-Leibler divergences from those simulations and bottom panels present the overall filtered (P(R_s > 1 | I_1^s), blue) and smoothed (P(R_s > 1 | I_1^t), red) resurgence probabilities. We find fundamental and striking delays in detecting resurgence, often an order of magnitude longer than those for detecting control or suppression in transmission (see the lags between the red and blue curves).

In Figure 3 [42,43], we simulate 1000 epidemics from each disease using renewal models and estimate R_s with EpiFilter [15]. Panels C and D (colours match panel B; delays are normalised by the mean generation times of the diseases) show that delays in detecting resurgence (dots) are at least 5-10 times longer than those for indicating control (diamonds). Our ability to infer even symmetrical transmissibility changes is fundamentally asymmetric.
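The Δ₅₀ and Δ₉₅ delay metrics reduce to a simple first-crossing computation. A minimal sketch, with hypothetical probability curves standing in for the simulated filtered and smoothed resurgence probabilities:

```python
import numpy as np

def detection_delay(p_filt, p_smth, thresh=0.5, gen_time=1.0):
    """Delay between the first times the filtered and smoothed resurgence
    probabilities cross `thresh`, in units of mean generation times.
    Returns nan if either curve never crosses the threshold."""
    def first_cross(p):
        idx = np.flatnonzero(np.asarray(p) >= thresh)
        return idx[0] if idx.size else None
    tf, ts = first_cross(p_filt), first_cross(p_smth)
    if tf is None or ts is None:
        return float("nan")
    return (tf - ts) / gen_time

# Hypothetical curves: smoothing crosses 0.5 at s = 3, filtering at s = 7
p_s = np.array([0.1, 0.2, 0.4, 0.6, 0.7, 0.8, 0.9, 0.95])  # smoothed
p_f = np.array([0.1, 0.1, 0.2, 0.3, 0.3, 0.4, 0.45, 0.6])  # filtered
assert detection_delay(p_f, p_s, thresh=0.5, gen_time=2.0) == 2.0
```

Normalising by the mean generation time, as in Figure 3, makes delays comparable across diseases with very different transmission timescales.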

Fundamental delays worsen with spatial or demographic heterogeneities
In previous sections we demonstrated that sensitivity to changes in R is asymmetric, and that intrinsic, restrictive limits exist on detecting resurgence in real time, which do not equally inhibit detecting control. While those conclusions apply generally (e.g., across diseases), they do not consider the influence of spatial or demographic heterogeneity. We examine this complexity through a simple but realistic generalisation of the renewal model. Often R estimates can be computed at small scales (e.g., at the municipality level) via local incidence, or more coarsely (e.g., countrywide) using aggregated case counts [3,13]. We can relate these differing scales with the weighted mean in Eq. (4), where the overall (coarse) R at time s, R̂_s, is a convex sum of finer-scale R contributions from each group (R_s[j] for the j-th of p groups), weighted by the epidemic size of that group (as in Eq. (2), we use windows τ(s) for analytic insight).

R̂_s = Σ_{j=1}^p α_j[s] R_s[j],  with weights α_j[s] = Λ_{τ(s)}[j] / Σ_{k=1}^p Λ_{τ(s)}[k].  (4)
Our choice of groupings is arbitrary and can equally model demographic heterogeneities (e.g., age-specific transmission), where we want to understand how dynamics within the subgroups influence overall spread [7]. Our aim is to ascertain how grouping, which often occurs naturally due to data constraints or a need to succinctly describe the infectious dynamics over a country to aid policymaking or public communication [44], affects resurgence detection. Eq. (4) implies that groups with larger epidemic sizes dominate R̂_s. Since resurgence will likely first occur within some specific (maybe high-risk) group and then propagate to other groups [7], this expression suggests that an initial signal (e.g., if some R_s[j] > 1) could be masked by non-resurging groups (which, from this perspective, contribute background noise).
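The masking effect of Eq. (4) can be seen with a toy calculation; the group sizes and R values below are illustrative assumptions.

```python
import numpy as np

def overall_R(R_groups, Lam_groups):
    """Eq. (4): overall R as an epidemic-size-weighted convex sum
    of the finer-scale group reproduction numbers."""
    w = np.asarray(Lam_groups, dtype=float)
    w /= w.sum()  # convex weights alpha_j
    return float(np.dot(w, R_groups))

# Group 1 resurges (R = 1.5) but is small; groups 2-3 are controlled and large
R_hat = overall_R([1.5, 0.8, 0.7], [50, 400, 350])
assert R_hat < 1  # the resurgence signal is masked at the coarse scale
```

Even with group 1 well into supercritical spread, the coarse-scale R̂_s remains below 1, so an analysis of aggregated counts alone would register no resurgence.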
As the epidemic size in a resurging group will likely be smaller than those of groups with past epidemics that are now being stabilised or controlled, this exacerbates the sensitivity bounds explored earlier via Eq. (2). We can verify this further loss of sensitivity by examining how the overall posterior distribution depends on those of the component groups, as in Eq. (5), with ⊛ as a repeated convolution operation and Ω_j as the posterior distribution of the weighted contribution of the j-th group.

P(R̂_s | I_1^s) = Ω_1 ⊛ Ω_2 ⊛ ... ⊛ Ω_p.  (5)
While Eq. (5) holds generally, we assume gamma posterior distributions, leading to statistics analogous to Eq. (2). We plot these sensitivity results at p = 2 and 3 in Figure 4, where group 1 features resurgence and the other groups contain either stable or falling incidence. We find that as p grows (and additional distributions convolve to generate R̂_s) we lose sensitivity: the overall resurgence probability is more conservative than P(R_s[1] > 1) (blue, solid median line), and as the R of the non-resurging groups falls (red to blue) the ability to detect resurgence also lags relative to that from observing group 1 alone (black).
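A Monte Carlo sketch of Eq. (5) makes the sensitivity loss concrete: rather than convolving the group posteriors analytically, we sample each one and form the size-weighted sum. The per-group Gamma(a, c) priors and all case counts are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(0)

def overall_resurgence_prob(i_groups, lam_groups, a=1.0, c=5.0, n=100_000):
    """Monte Carlo version of Eq. (5): sample each group's conjugate gamma
    posterior for R[j], form the size-weighted mean R_hat per Eq. (4),
    and report P(R_hat > 1). Gamma(a, c) priors are assumptions."""
    lam = np.asarray(lam_groups, dtype=float)
    w = lam / lam.sum()  # convex weights alpha_j
    draws = np.column_stack([
        gamma.rvs(a + i, scale=1.0 / (1.0 / c + L), size=n, random_state=rng)
        for i, L in zip(i_groups, lam_groups)
    ])
    return float(((draws * w).sum(axis=1) > 1).mean())

# Group 1 resurging (30 cases from Lambda = 20, so R near 1.5); rest controlled
p1 = gamma.sf(1.0, a=1 + 30, scale=1.0 / (1.0 / 5 + 20))  # group 1 alone
p_all = overall_resurgence_prob([30, 240, 210], [20, 300, 300])
assert p_all < p1  # aggregation dilutes the resurgence signal
```

Monitoring the resurging group directly flags resurgence with high probability, whereas the convolved overall posterior barely registers it, mirroring the lag seen in Figure 4.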

Discussion
Probing the performance limits of noisy biological systems has yielded important insights into the real-time estimation and control of parameters in biochemistry and neuroscience [45][46][47].
Although models from these fields share dynamic similarities with those in epidemiology, there has been relatively little investigation of how real-time estimates of pathogen transmissibility, parametrised by R, might be fundamentally limited. This is surprising since R is among the key parameters considered in initiatives aiming to better systematise real-time epidemic response [48,49].
Here we explored what limits may exist on our ability to reliably detect or measure the change-points in R that signify resurgence and control. By using a combination of Bayesian sensitivity analyses and minimum mean squared error filtering and smoothing algorithms, we discovered striking asymmetries in innate detection sensitivities.
We found that, arguably, the most crucial transitions in epidemic transmissibility are the most inherently difficult to detect. Specifically, resurgence, signified by an increase in R from below to above 1, can at the earliest be detected 5-10 times later than an equivalent decrease in R that indicates control (Figure 2, Figure 3 and Figure 5). As this lag could be of the order of the mean generation time of the disease under study, even when case reporting is perfect and optimised detection algorithms are applied, it represents a potentially sharp bottleneck to real-time responses for highly contagious diseases. Intuition for this result came from observing that sensitivity to R change-points weakens (as noise masks the signal) with declining epidemic sizes and increasing 'true' R, both of which likely occur in resurgent settings (Eq. (2) and Figure 1). Furthermore, these latencies and sensitivity issues would only worsen when considering heterogeneous groupings across geography or demography (Eq. (4), Eq. (5) and Figure 4).
Practical real-time analyses would almost surely involve such groupings or data aggregations [9,13], in addition to being hindered by reporting and other latencies (e.g., if notification times, hospitalisations or deaths are used as proxies for incidence) [14,50]. Consequently, we argue that while case data provide robust signals for pinpointing when an epidemic is under control (and possibly for disentangling the impact of interventions), they are insufficient, on their own, to sharply resolve resurgence timepoints. This does not invalidate the importance of approaches that seek to better characterise real-time R changes [1,2,13,28], but instead adds context on how such inferences should be interpreted when informing policy. Given the intrinsic delays in inferring resurgence, which can be associated with critical epidemiological changes, such as the emergence of variants of concern or important shifts in population behaviours [6,7], there are grounds for conservative approaches that enact interventions swiftly at the expense of false alarms. This might support, for example, the ongoing COVID-19 policies of New Zealand and Australia [51], and adds impetus to recent studies showing how lags in implementing interventions can induce drastic costs [33][34][35][36].
Moreover, our analysis suggests that enhancing syndromic surveillance systems, which can comprehensively engage diverse data sources [30,31], may be more important than improving models for processing case data. Fusing multiple and sometimes novel data sources, such as wastewater or cross-sectional viral loads [18,32], may present the only truly realistic means of minimising the innate limits to resurgence detection that we have demonstrated. Approaches aimed at enhancing case-based inference generally correct reporting biases or propose more robust measures of transmissibility, such as time-varying growth rates [14,49,52]. However, as our study highlights limits that persist at the gold standard of perfect case reporting and, further, it is known that under such conditions growth rates and R are equally informative [53], these lines of investigation are unlikely to minimise the delays we have exposed.
There are two main limitations of our results. First, as we considered renewal model epidemic descriptions, which predominate real-time R studies, our work necessarily neglects the often complex contact networks that mediate infection spread [54]. However, other analyses using somewhat different approaches to ours (e.g., Hawkes processes [55]) show apparently similar sensitivity asymmetries, and there is evidence that renewal models can be as accurate as network models for inferring R [56], while being easier to run and fit in real time. Second, we did not include any explicit economic modelling. While this is outside the scope of this work, it is important to recognise that resurgence detection threshold choices (i.e., how we decide which fluctuations in incidence are actionable) imply some judgment about the relative cost of true positives (timely resurgence detections) versus false alarms [12]. Incorporating explicit cost structures could mean that delays in detecting resurgence are acceptable. We consider this the next investigative step in our aim to probe the limits of real-time performance.

Appendix
We provide simulations in Figure 5 for COVID-19, showing that significant delays in detecting resurgence, but not epidemic control, persist. These support our main results. Individual runs (where estimated credible intervals reflect noise from the incidence of that simulation) also display this asymmetry, confirming that real-time resurgence detection is innately hard. In Figure 5 [43], top panels plot posterior mean estimates from the filtered (E[R_s | I_1^s], blue) and smoothed (E[R_s | I_1^t], red) distributions from every realisation (computed via EpiFilter [15]). Middle panels average the Kullback-Leibler divergences from those simulations and bottom panels show the overall filtered (P(R_s > 1 | I_1^s), blue) and smoothed (P(R_s > 1 | I_1^t), red) resurgence probabilities.
We again find fundamental and appreciable latencies in detecting resurgence, often an order of magnitude longer than those for detecting epidemic control (compare the red and blue curves).