An exact method for quantifying the reliability of end-of-epidemic declarations in real time

We derive and validate a novel and analytic method for estimating the probability that an epidemic has been eliminated (i.e. that no future local cases will emerge) in real time. When this probability crosses 0.95 an outbreak can be declared over with 95% confidence. Our method is easy to compute, only requires knowledge of the incidence curve and the serial interval distribution, and evaluates the statistical lifetime of the outbreak of interest. Using this approach, we show how the time-varying under-reporting of infected cases will artificially inflate the inferred probability of elimination, leading to premature (false-positive) end-of-epidemic declarations. Contrastingly, we prove that incorrectly identifying imported cases as local will deceptively decrease this probability, resulting in delayed (false-negative) declarations. Failing to sustain intensive surveillance during the later phases of an epidemic can therefore substantially mislead policymakers on when it is safe to remove travel bans or relax quarantine and social distancing advisories. World Health Organisation guidelines recommend fixed (though disease-specific) waiting times for end-of-epidemic declarations that cannot accommodate these variations. Consequently, there is an unequivocal need for more active and specialised metrics for reliably identifying the conclusion of an epidemic.


Introduction
The timing of an end-of-epidemic declaration can have significant economic and public health consequences. Early or premature declarations can negate the benefits of prior control measures (e.g. quarantines or lockdown), leaving a population at an elevated risk to the resurgence of the infectious disease. The Ebola virus epidemic in Liberia (2014-2016), for example, featured several declarations that were followed by additional waves of infections [1]. Late or delayed declarations, however, can unnecessarily stifle commercial sectors such as agriculture, trade and tourism, leading to notable financial and livelihood losses. One of the first studies advocating the need for improved end-of-epidemic metrics suggested that the MERS-CoV epidemic in South Korea was declared over at least one week later than was necessary [2]. Balancing the health risk of a second wave of infections against the benefits of reopening the economy earlier is a non-trivial problem and is currently of major global concern as many countries prepare to meet the challenge of resurging COVID-19 caseloads.
World Health Organisation (WHO) guidelines adopt a time-triggered (i.e. decisions are enacted after some fixed, deterministic time) approach to end-of-epidemic declarations, recommending that officials wait for some prescribed period after the last observed infected case has recovered, before adjudging the outbreak over. The most common waiting time, which applies to Ebola virus and MERS-CoV among others, involves twice the maximum incubation period of the disease [3]. While having a fixed decision time is simple and actionable, it neglects the stochastic variation that is inherently possible at the tail of an outbreak. Recent studies have started to question this time-triggered heuristic and to investigate the factors that could limit its practical reliability.
Initial advances in this direction were made in [2], where mathematical formulae for assessing the end of an epidemic, in a data-driven manner, were derived. These formulae use the time-series of new cases (incidence) across an epidemic together with estimates of its serial interval distribution and basic reproduction number to compute the probability that the outbreak is over at any moment. The serial interval distribution describes the random inter-event times between the onset of symptoms of an infector and infectee, while the basic reproduction number is the average number of secondary infections per primary infection at the start of an epidemic [4,5]. The output of this method is an epidemiologically informed statistical measure of confidence in an end-of-epidemic declaration. This approach is important, but not perfect. It assumes that infected cases are reported without any error and it depends on parameters that relate to the initial growth phase of the epidemic. Moreover, to maintain simplicity, it adopts a mathematically conservative description of transmission, making its end-of-epidemic declaration time estimates likely to be late or delayed [2]. More recent studies [6,7] have applied forward simulation to investigate the tail dynamics of an outbreak. These have revealed the impact of the constant under-reporting of cases [6] and demonstrated the sensitivity of declarations to the effective reproduction number [7], a parameter that generalises the basic reproduction number and that remains relevant across all phases of the epidemic. The influence of different routes of transmission on declarations has also been examined in [1] using the framework of [2]. However, there is still much we do not know about the dynamics of an outbreak as it approaches its end. Specifically, analytic and general insight into the sensitivity of end-ofepidemic declarations to practical surveillance imperfections is needed. Real incidence data are corrupted by time-varying trends in under-reporting, delays in case notification and influenced by the interaction of imported and local cases [8][9][10]. Previous works have either assumed perfect reporting [2] or treated constant under-reporting within some simulated scenarios [6,7]. Here we attempt to expose the implications of more realistic types of data corruption, particularly time-varying case under-reporting and importation, by developing an exact framework that provides broad and provable insights. Understanding how realistic surveillance patterns can bias our perception of the epidemic end is the first step to engineering sensible and effective countermeasures against these biases.
We build on the renewal process transmission model from [11,12], to derive and test a novel and exact real-time method for estimating the probability of elimination; defined as the probability that no future local cases will emerge conditioned on the past epidemic incidence. We explain this model in Fig 1. Using this probability, we define an event-triggered [13,14]  , which form an incidence curve, seed new infections with probabilities proportional to w u defined by the generation time distribution of the disease, which is approximated by the serial interval distribution. The total infectiousness Λ s+1 sums the contributions of past cases according to the set of {w u }. The effective reproduction number R s determines how many effective infections are passed on to the next time unit s + 1. It is common to group R s values over a window τ(s) to improve estimation reliability. When all future incidence values are zero we conclude that the epidemic is over or eliminated. Panel B shows how R s acts as a reproductive parameter, controlling whether the epidemic grows or dies out. This parameter is therefore essential to predicting the dynamics of an epidemic. Panel C provides a breakdown of more realistic observation assumptions, where we might not be able to directly measure the local and complete incidence I s due to unreported U s or imported (migrating) M s cases. If we can only observe sampled cases, N s , or the total number of cases, C s , then our epidemic predictions will be biased.
https://doi.org/10.1371/journal.pcbi.1008478.g001 declaration metric that guarantees confidence in that declaration provided the assumptions of the model hold. The trigger is the first time that this probability crosses a threshold e.g. we are 95% confident in our declaration if the threshold is 0.95. Event-triggered decision-making was essentially proposed by [2], has proven effective in other fields [15][16][17] and belies the time-triggered WHO approach, which fixes the time (elapsed since the last case 1 ) but not the confidence in declaration.
We benchmark our estimate against the true probability of elimination, i.e. the probability if the statistics and effective reproduction number of the epidemic were known precisely, and show consistency under the perfect conditions in [2] but with the caveat that we estimate effective reproduction numbers from the incidence curve in real time. We find that even the true elimination probabilities strongly depend on the specific stochastic incidence curve observed, confirming that time-triggered decision heuristics are unwarranted. Using our exact framework we prove two key results about imperfect surveillance. First, any type of time-varying under-reporting will lead to premature or false-positive event-triggers and hence declarations, unless explicit knowledge of the under-reporting scheme is available. Second, a failure to identify and account for the differences between local and imported cases will result in delayed or false-negative event-triggers, regardless of the dynamics of case importation.
Many infectious disease epidemics, including the ongoing COVID-19 pandemic, are known to feature extensive time-varying under-reporting and repeated importations from different regions [18,19]. As this pandemic progresses into a potential second wave in several countries, public health authorities will need to decide when to relax and reapply intervention measures such as lockdowns, social distancing policies or travel bans [20]. Our work suggests that intensive surveillance, both of cases and their origin, must be sustained to make informed, reliable and adaptive decisions about the threat posed by the virus in the waning stages of the outbreak, even if reported case numbers remain at zero for consecutive days. We hope that our method, which is available at https://github.com/kpzoo/End-of-epidemic-declarations, will aid understanding and assessment of the tail kinetics of infectious epidemics.

Infectious disease transmission models
We can mathematically describe the transmission of an infection within a population over time with a renewal process based on the Euler-Lotka equation from ecology and demography [5]. This process models communicable pathogen spread from a primary (infected) case to secondary ones at some time s using two key variables: the effective reproduction number, R s , and the generation time distribution with probabilities {w u } for all times u. Here R s defines the number of secondary cases at time s + 1 one primary case at s infects on average, while w u is the probability that it takes u time units for a primary case to infect a secondary one [5]. As infection events are generally unobserved, we approximate the time of primary and secondary infection with the corresponding times of symptom onset i.e. the serial interval. This amounts to making the common assumption that the serial interval distribution, which can be observed, is a good approximation to the generation time distribution [2,12].
If I s counts the newly observed infected cases at s and a Poisson (Poiss) model is used to represent the noise in these observations then the renewal model captures the reproductive dynamics of infectious disease transmission with I s * Poiss(R s−1 Λ s ) [4]. Here L s ≔ P sÀ 1 u¼1 I sÀ u w u is the total infectiousness of the disease up to time s − 1 and summarises how previous cases contribute to upcoming cases on day s. We use I s 1 ≔ fI 1 ; I 2 ; . . . ; I s g to represent the incidence curve from time 1 to s. A schematic of this approach to epidemic transmission is given in Fig 1. Usually we are interested in estimating the R s numbers in real time [21,22] from the progressing I s 1 , assuming that the serial interval distribution is known (i.e. derived from some other linelist data) [12].
This effective reproduction number is important for forecasting the kinetics of the epidemic. If R s > 1 then we can expect the number of infections to increase monotonically with time. However, if R s < 1 is sustained then we can be confident that the epidemic is being controlled and will, eventually, be eliminated [23]. In order to enhance the reliability of these estimates we usually assume that the epidemic transmission properties are stable over a look-back window of size k defined at time s as τ(s) ≔ {s, s − 1, . . ., s − k + 1} [12,24]. We let the reproduction number over this window be R τ(s) and apply a conjugate gamma (Gam) prior distribution assumption: R τ(s) * Gam(a, 1 / c ) with a and c as shape-scale hyperparameters. This formulation, together with the use of gamma prior distributions, is standard in current renewal model frameworks [12,21,25].
The posterior distribution of R τ(s) given the relevant window of the past incidence curve of data i.e. I tðsÞ ≔ I s sÀ kþ1 is also gamma distributed as follows [22] R tðsÞ j I tðsÞ � Gam a þ i tðsÞ ; with grouped sums i τ(s) ≔ ∑ u2τ(s) I u and λ τ(s) ≔ ∑ u2τ(s) Λ u . If some variable y * Gam(α, β) then PðyÞ ¼ y aÀ 1 e À y=b =b a GðaÞ and E½y� ¼ ab. As a result, Eq (1) yields the posterior mean estimate ofR tðsÞ ¼ a tðsÞ b tðsÞ with α τ(s) ≔ a + i τ(s) , β τ(s) ≔ 1 / c+λτ(s) . Eq (1) allows us to infer the grouped or averaged effective reproduction number over the window τ(s), which is considered an approximation of the unknown R s . We can derive the posterior predictive distribution of the next incidence value (at time s + 1) by marginalising over the domain of R τ(s) as in [22]. If the space of possible predictions at s + 1 is x|I τ(s) and NB indicates a negative binomial distribution then we obtain Eq (2) completely describes the uncertainty surrounding one-step-ahead incidence predictions and is causal because all of its terms (including Λ s+1 ) only depend on the past observed incidence curve I s 1 [22]. If a random variable y * NB(α, p) then PðyÞ ≔ aþyÀ 1 y � � ð1 À pÞ a p y and E½y� ¼ pa = 1À p . Hence our posterior mean prediction isÎ sþ1 ¼ E½x j I tðsÞ � ¼ L sþ1RtðsÞ . The current estimate of R τ(s) influences our ability to predict upcoming incidence points. Thus, we expect that good estimation of the effective reproduction number is necessary for projecting the future behaviour of an infectious disease epidemic. In Results we rigorously extend and apply this insight to derive an exact method for computing the probability that an epidemic is reliably over at some time s i.e. that no future infections will occur from s + 1 onwards.

Under-reported and imported cases
The above formulation assumes perfect case reporting and that all cases, I s 1 , are local to the region being monitored. We now relax these assumptions. First, we consider more realistic scenarios where only some fraction of the local cases are reported or observed at any time. We use N s and U s for the number of sampled and unreported cases at time s. We consider a general time-varying binomial (Bin) sampling model with 0 � ρ s � 1 as the probability that a true case is sampled at time s (hence 1 − ρ s is the under-reporting probability). Then N s * Bin(I s , ρ s ). The smaller ρ s is, the less representative the sampled curve N s 1 is of the true I s 1 . This is a standard model for under-reporting [8,26] and implies the following statistical relationship Raikov's theorem [27] states that if the sum of two independent variables is Poisson then each variable is also Poisson. Consequently, U s is Poisson with mean (1−ρ s )R s−1 Λ s . Most studies assume that ρ s = ρ for all s i.e. that constant under-reporting occurs. The persistence of the Poisson relationship in Eq (3) means that we can directly apply the forecasting and estimation results of the previous section to N s . Practically, if we observe only N s 1 then unless we have independent knowledge of ρ s (which can often be difficult to ascertain reliably [18,26]) we can only construct an approximation to ρ s Λ s asL s ¼ P sÀ 1 u¼1 w u N sÀ u with E½LðsÞ� ¼ r s L s . Second, we investigate when imported or migrating cases from other regions, denoted by count M s at time s, are introduced, resulting in the total number of observed cases being C s . Within this framework we ignore the under-reporting of cases and assume that I s is observed to avoid confounding factors. We follow the approach of [9] and describe M s as a Poisson number with some mean at time s of � s . Using Raikov's theorem we obtain Eq (4) models how imported cases combine with existing local ones to propagate future local infections.
While our work does not require assumptions on � s , for ease of comparison later on we adopt the convention that the sum of imports and local cases drive the epidemic forward with the same reproduction number and serial interval [28]. Consequently, Practically, when surveillance is poor (i.e. local and imported cases cannot be distinguished), it is common to assume that all observed cases are local and conform to the approximate model C s � PoissðR sÀ 1 � L s Þ [25]. The forecasting and estimation results of the previous section therefore also apply under these conditions.
In Results we examine the impact of imperfect (our null hypothesis H 0 ) and ideal (the alternative H 1 ) surveillance within the context of under-reporting and importation in turn. We treat each problem individually to isolate the impact of each bias. Ideal surveillance then represents the ability to know either U s or M s (depending on the problem of interest) and hence account for their contributions. Imperfect surveillance refers to only having knowledge of N s or C s and basing inferences on these curves under the strong assumption that they approximate the true incidence. This assumption is often made in the literature [2,12,21] for the purposes of tractability and means Eqs (1) and (2) are valid. Fig 1C summarises the relationships from Eqs (3) and (4).

An exact method for declaring an outbreak over
We define an epidemic to be eliminated or over [23] at time s if no future, local or indigenous infected cases are observed i.e. I s+1 = I s+2 = � � � = I 1 = 0. We can define the estimated probability of elimination, z s , as with I s 1 as the incidence curve (data), observed until time s. We refer to z s as an estimated probability because we do not have perfect knowledge of the epidemic statistics e.g. we cannot know R s precisely. The importance of this distinction will become clear in the subsequent section (see Eq (10)). However, we observe that if we could have this idealised knowledge then Eq (5) would exactly define the probability of no future cases given I s 1 . Declaring the end of an epidemic with confidence μ% translates into solving the optimal stopping time problem with t 95 , for example, signifying the first time that we are at least 95% sure that the epidemic has ended. Note that z s is a function of I s 1 and practically characterises our uncertainty in the outcome of the epidemic (i.e. if it is over or not). This uncertainty derives from the fact that a range of possible epidemics with distinct future incidences I 1 sþ1 can possess the same I s 1 and R s 1 values. Some uncertainty exists even if R s 1 is known perfectly. Eq (6) presents an event-triggered approach to declaring the end of an epidemic with the μ threshold serving as an informative trigger. Event-triggered formulations have the advantage of being robust to changes in the observed data [13,14], a point visible from the dependence of z s and hence t μ on I s 1 . While Eq (6) is written in absolute time, we may also clock time relative to the last observed case, t 0 . Our waiting time until declaration is then Δt μ = t μ − t 0 , which is more useful for comparing z s values from various realisations of I s 1 and for deriving confidence intervals. Later, we consider differences in the Δt μ , denoted δt μ , proposed by comparable methods.
Previous works on end-of-epidemic declarations have either approximated z s with a simpler, more conservative probability [2] or used simulations to estimate a quantity similar to z s that is averaged over those simulations [6] [7]. No study has yet (to our knowledge) included real-time estimates of R s , within its assessment of epidemic elimination, despite the importance of this parameter in foretelling transmission [23]. By taking the renewal process approach to epidemic propagation (see Fig 1), we explicitly embed uncertainty about R s estimates to obtain an analytic and insightful expression for the probability that the outbreak is over given past observed cases (Eq (5)).
We derive this by inferring R s within a sequential Bayesian framework from I s 1 , using a moving window of length k time units. We denote this estimate R τ(s) with window τ(s) spanning I s sÀ kþ1 [12,22]. Our main result is summarised as a theorem below (see Methods for further details and notation). Fig 2 illustrates how our computed z s probability varies across the lifetime of an example incidence curve, thus providing a real-time, causal and dynamically updating view of our confidence in its end. Theorem 1. If the posterior distribution of the grouped effective reproduction number, R τ(s) , given the incidence curve I s 1 has form Gam(α τ(s) , β τ(s) ) then the estimated probability that this epidemic has been eliminated at time s is z s ¼ andR tðjÞ ¼ a tðjÞ b tðjÞ as the mean posterior incidence prediction and effective reproduction number estimate at time j, respectively. We outline the development of this theorem. First, we decompose Eq (5) into sequentially predictive terms as: For simplicity, we rewrite Eq (7) as z s ¼ q 0 Q 1 j¼1 q j . The factor q j conditions on I sþj 1 , which includes all the epidemic data, I s 1 and the sequence of assumed zeros beyond that i.e. I sþj sþ1 ¼ 0 for j � 1. This sequence is treated as pseudo-data. Observe that q 0 is simply a one-step-ahead prediction of 0 from the available incidence curve. We solve Eq (7) by making use of known renewal model results derived in [12,22,24] and outlined in Methods. The renewal transmission model allows us to estimate the effective reproduction number R s and hence compute z s in real time (see Fig 1). This estimate at time s, R τ(s) , uses the look-back window τ(s) of k time units (e.g. days). The posterior over (1)).
Here (a, c) are hyperparameters of a gamma prior distribution placed on R τ(s) and i τ(s) and λ τ(s) are grouped sums of the incidence I u and total infectiousness Λ u for u 2 τ(s). The total infectiousness describes the cumulative impact of past cases and is defined in Methods.
Under this formulation, the posterior predictive distribution of the incidence at s + 1 is negative binomially distributed (NB) (see Eq (2)). The probability of I s+1 being zero from this distribution gives q 0 ¼ ð1 þ L sþ1 b tðsÞ Þ À a tðsÞ by substitution. The next term, q 1 , is computed similarly because we condition on I s+1 = 0 as pseudo-data (i.e. the sequential terms in Eq (7)) and update Λ s+2 , β τ(s+1) and α τ(s+1) with this zero. Iterating for all terms yields We simulate a single incidence curve, I s (blue, case counts on left y-axis), under the serial interval distribution for Ebola virus [29] and a true R s profile that step changes from 2 to 0.5 at s = 100 days. We compute the true and estimated elimination probabilities, z � s and z s , conditional on all cases observed up to time s in grey and red respectively (right y-axis). The circle (black) indicates when the outbreak can be declared over with 95% confidence. Observe how z s and z � s respond to the low I s at the beginning of the epidemic before remaining 0 until we get to the tail of the outbreak, where a couple fluctuations occur due to some final cases. An estimate of the WHO declaration time, t WHO [3], which is mostly insensitive to past case profiles is in dark blue. The central question in this study is how few cases need to be observed in the recent past before we can be confident that the epidemic has been eliminated. which is an exact expression for z s . As a string of zero incidence values accumulate with time Λ j+1 ! 0 and hence q j ! 1. Consequently, only a finite number of terms in Eq (8) need to be computed and the initial ones are the most important for evaluating z s .
The posterior mean estimate of R τ(s) isR tðsÞ ¼ E½R tðsÞ j I s 1 � ¼ E½R tðsÞ j I tðsÞ � ¼ a tðsÞ b tðsÞ with I τ(s) as the incidence values in the τ(s) window (the remaining I sÀ k 1 are assumed uninformative [12]). This follows from the Gam distribution and implies a posterior mean incidence predic-tionÎ sþ1 ¼ E½I sþ1 j I tðsÞ � ¼ L sþ1RtðsÞ from the NB posterior predictive distribution [22]. Substituting these into Eq (8) gives: This completes the derivation. Theorem 1, when combined with Eq (6), provides a new, analytic and event-triggered approach to adjudging when an outbreak has ended. Eq (9) provides direct and quantifiable insight into what controls the elimination of an epidemic and can be easily computed and updated in real time.

Understanding the probability of elimination
We dissect and verify the implications of Theorem 1, which provides an exact formula for estimating the probability, z s , that any infectious disease epidemic has been eliminated by time s. Eq (8) formalises the expectation that any decrease in case incidence increases z s . This results because @q j / @α τ(j) < 0 for all α τ(j) , meaning that q j is monotonically decreasing in α τ(j) and hence i τ(j) . As z s is a product of q j and every q j is positive then z s is also monotonically decreasing in all incidence window sums. Consequently, any process that reduces historical incidence surely increases the probability of elimination, provided other variables are relatively fixed.
The main variable controlling z s is the average predicted incidenceÎ jþ1 (see Eq (9)). Reducing either Λ j+1 orR tðjÞ therefore increases our confidence in a declaration made after a fixed time (the time-triggered approach) or, decreases the time of declaration for a fixed confidence (the event-triggered approach). This highlights the two known ways that sustained interventions, e.g. vaccination, social-distancing or quarantine, can help drive an epidemic to extinction. First, such measures explicitly limit R j and henceR tðjÞ , leading to an expected rise in z s [23]. Second, they may also implicitly reduce the duration of the serial interval, resulting in smaller Λ j+1 [30]. Accordingly, under-or over-estimatingR tðjÞ or using incorrectly smaller or larger Λ j+1 sums induces spurious fluctuations in z s and promotes premature or delayed declarations, respectively. This insight underlies later analyses, which investigate how surveillance imperfections can modulate the declaration time. Because we cannot reduce either reproduction numbers or serial intervals to arbitrary values of interest (e.g. certain diseases have intrinsically wider serial interval distributions) some epidemics will be innately harder to control and eliminate [31].
Interestingly, while z s is controlled by mean estimates and predictions, it appears insensitive to the uncertainty around those means, despite its derivation from the posterior distributions of Eqs (1) and (2). This follows from the inherent data shortage at the tail of an epidemic (there are necessarily many zero incidence points), which likely precludes the inference of higher order statistics [24]. Moreover, when the incidence is small stochastic fluctuations can dominate epidemic dynamics. Consequently, to maximise the reliability of our z s estimates we recommend using long windows (large k) forR tðjÞ . Short windows are more sensitive to recent fluctuations and are more prone to yielding uninformative estimates when many zero incidence points occur [22,32].
Last, we validate the correctness of our estimated z s by considering a hypothetical setting in which the true reproduction number, {R s : s � 0}, is known without error. This allows us to derive the true (but unknowable) probability of elimination z � s at time s, given complete information of the epidemic statistics. Under the renewal model PðI sþ1 ¼ 0 j I s 1 Þ ¼ e À R s L sþ1 . Repeating this process sequentially for future zero infected cases (akin to describing the likelihood of that observation series) gives: Clearly z � s depends on the serial interval distribution and past incidence (through Λ j+1 ) and the sequence of reproduction numbers R j , which are the main factors underlying the transmission of the infectious disease.
The true declaration time with confidence μ% is then t � m ¼ arg min s z � s � m 100 (see Eq (6)). We can verify our approach to end-of-epidemic declarations if we can prove that t μ sensibly converges to t � m . At the limit of α τ(j) ! i τ(j) ! 1, the estimatedR tðjÞ tends to the true R j because under those conditions the posterior mean estimate coincides with the grouped maximum likelihood estimate of R j , which is unbiased. Applying this limit to q j in Eq (9) we find that aŝ R tðjÞ ! R j : implying that z s ! z � s , and consequently that t m ! t � m . This asymptotic consistency suggests that z s and t μ indeed approximate the true but unknowable probability of elimination z � s and declaration time t � m . Other end-of-epidemic metrics in the literature have not presented such theoretical justification. We illustrate z s and z � s across a simulated and representative incidence curve in Fig 2. There we find a good correspondence between these probabilities and observe their sensitivity to changes in incidence at the beginning and end of this outbreak. Note that z s and z � s (and hence t μ and t � m ) depend on I s 1 and are more precisely written as z s j I s 1 and z � s j I s 1 . The WHO declaration time, t WHO , which is included for reference, is mostly independent of the shape of I s 1 [3], explaining why it provides no confidence guarantee.

Practical comparisons and verification
We have only validated our approach at an asymptotic limit that is not realistic for elimination i.e. the proof that z s and t μ converge to their true counterparts requires infinite incidence. While this proof suggests our formulation is mathematically correct, it does not indicate its performance on actual elimination problems. We now verify out method more practically. We first use simulated data to show that Δt μ = t μ − t 0 and Dt � m ¼ t � m À t 0 correspond well over several end-of-epidemic problems, where we are far from this limit, and with t 0 as the time of the last observed case. We characterise this via histograms of the error dt m ¼ Dt m À Dt � m ¼ t m À t � m , which are given in Fig 3A-3C. There we present 95% (μ = 0.95) declaration time errors calculated over 1000 simulated epidemics with serial interval distributions from the COVID-19 pandemic [33], MERS-CoV in Saudi Arabia [25], Marburg virus in Angola [29] and Measles in Germany [12].
We investigate true R s profiles that describe rapidly controlled (Fig 3A), partially recovering ( Fig 3B) and exponentially rising and falling transmission (boom-bust, Fig 3C). For each profile we use the renewal model to simulate conditionally independent I s 1 curves and compute z s j I s 1 and z � s j I s 1 using Eqs (9) and (10). The declaration time errors then follow as above and from Eq (6). Fig 3D plots these R s profiles (top) and the serial interval distributions for each True and estimated declaration times. We simulate 1000 independent incidence curves under various renewal models and provide normalised histograms of the difference between the estimated and true declaration times i.e. dt 95 ¼ Dt 95 À Dt � 95 ¼ t 95 À t � 95 . Panels A-C present these histograms for various infectious diseases under R s profiles indicating rapidly controlled, recovering and rising and then decaying transmission (boom-bust). The top row of D plots the true R s curves in absolute time, while the bottom row of D provides the serial interval distributions of the infectious diseases examined. Generally we find that t 95 � t � 95 to a reasonable degree. The quality of this approximation depends on the variability of the serial interval distribution (see S1 Fig)

PLOS COMPUTATIONAL BIOLOGY
Exact methods for reliable and real-time end-of-epidemic declarations disease (bottom). Generally, we find that t μ is a good approximation to t � m , with some error naturally emerging from the difficulty of estimating R s in conditions where data are necessarily scarce [32]. Our prior distribution over R τ(j) is Gam (1,5), which is both uninformative and has a large mean of 5.
This error, δt 95 , is more prominent for diseases featuring wide serial interval distributions, which are fundamentally more difficult to estimate, due to their dependence on much earlier epidemic dynamics. These simulations also demonstrate why time-triggered approaches can be misleading; they do not adapt to the shape of the specific instance of I s 1 observed. An example of this is given in S1 Fig, where we find that the WHO declaration time Δt WHO = t WHO − t 0 is delayed relative to both the true (Dt � 95 ) and estimated (Δt 95 ) event-triggered declaration times, for Ebola virus disease, which has a wide serial interval. Depending on the disease of interest Δt WHO could also be premature. The large variability among the possible Dt � 95 provides a clear visualisation of the non-deterministic nature of epidemic end-points and the need for adaptive metrics with stated confidence.
At present, we have only verified our method under ideal reporting conditions. Practical surveillance is investigated in later sections. We now compare our method to the event-triggered one of [2], which assumes ideal surveillance and models epidemic transmission with a NB branching process that is strictly only valid at the beginning of the outbreak. This notably differs from our renewal model approach and the elimination probabilities derived in [2] are a mathematically conservative approximation to our z s . We compare both methods on MERS-CoV data from South Korea, examined in [2], by running them on a set of bootstrapped incidence curves generated from fitting the model of [2] to that data and compute 95% confidence intervals on the probability of elimination. Fig 4 presents our main results with time relative to the last observed case in each bootstrap (Δs) and blue and red curves as the outputs of [2] and our method. While the median 95% relative declaration times (black circles) are close, the approach of [2] yields a delayed declaration. This effect is reduced if we use the lower bound of the z s curves instead of their median. When z s is small (which is not practical for defining end-of-epidemic declarations) we find that the methods are less consistent. The WHO declaration time (dark blue) for this epidemic is over one week later than the time proposed by both methods [2]. While our method shows wider uncertainty, the similarity of these intervals suggests that our formulation is robust to moderate model mismatch.

Under-reporting leads to premature declarations
Having verified z s and hence t μ as reliable and sensible means of assessing the conclusion of an epidemic, we investigate the effect of model mismatch due to imperfect surveillance. We start with case under-reporting, which affects all infectious disease outbreaks to some degree. While previous works have drawn attention to how constant under-reporting can bias end-of-epidemic declarations [6] [7], no analytic results are available. Moreover, the impact of time-varying under-reporting, which models a wide range of more realistic surveillance scenarios [8,34], remains unstudied. We provide mathematical background for our under-reporting models in Methods. Fig 1C illustrates how under-reporting results in only a portion, N s , of the total local cases, I s being sampled or observed. We use U s = I s − N s � 0 to denote the unreported cases. We investigate two hypotheses or models about the incidence curve, a null one, H 0 , where we assume that the observed cases N s 1 represent all the infected individuals and an alternative hypothesis H 1 , in which the unreported cases U s 1 (and hence I s 1 ) are known and distinguished.
The estimated elimination probabilities under both surveillance models are: Here H 0 portrays a naive interpretation of the observed (N s ) incidence, while H 1 indicates ideal surveillance. Intensive and targeted population testing should interpolate between H 0 and H 1 . We compute z s j N s 1 by constructing the sampled total infectiousnessL s ≔ P sÀ 1 u¼1 w u N sÀ u and then applying Theorem 1. This follows because N s can also be described by a Poisson renewal model (see Methods for details). We therefore find that z s j N s 1 ¼ Q 1 j¼s ð1 þL jþ1btðjÞ Þ À aÀ n tðjÞ with n τ(j) andl tðjÞ as the sums of N u andL u within the τ(j) window andb tðjÞ ¼ 1 = cþl tðjÞ . We get z s j I s 1 directly from Eq (8) since this is the perfect surveillance case.
This allows us to construct the ratio of z s j N s 1 to z s j I s 1 as Y 1 j¼s ð1 þ � j Þ aþi tðjÞ ð1 þ� j Þ À aÀ n tðjÞ with ϕ j = Λ j+1 β τ(j) and� j ¼L jþ1btðjÞ . Here� j approximates ϕ j and both are small compared to 1 (for sensible window lengths). This combined with the fact that i τ(j) is bigger than n τ(j) means the above ratio is greater than 1 (powers dominate the expression). This result may be violated if� becomes large relative to ϕ and the unreported case count in the window, u τ(j) = i τ(j) − n τ(j) , is small. However, this is not possible since� and ϕ necessarily converge as u τ(j) tends to 0 We compare 95% confidence intervals on the elimination probability from [2] (blue) and z s from Eq (9) (red) on bootstrapped epidemics based on the MERS-CoV data from South Korea used in [2]. Black circles define the median relative declaration time (Δt 95 ) when each method deems the epidemic to be over with 95% confidence (the event trigger). Time is relative to the last observed case in each epidemic bootstrap and the WHO (time-triggered) declaration time (Δt WHO ) is in dark blue. https://doi.org/10.1371/journal.pcbi.1008478.g004 (i.e. perfect surveillance). Consequently, we obtain the inflation At no point have we assumed any form for the under-reporting fraction, denoted ρ s at time s (see Methods). Our derivation only depends on under-reporting causing smaller (absolute) historical incidence. If we know all R j (which is unlikely) this result is also easily obtained from (10) since L jþ1 �L jþ1 .
Thus any under-reporting, whether constant (i.e. all ρ s are the same) or time-varying will engender premature or false-positive end-of-epidemic declarations provided N s is randomly sampled from I s (so Theorem 1 holds; see Eq (3)). We highlight this principle by examining a random sampling scheme using empirical SARS 2003 data from Hong Kong [12]. We binomially sample the SARS incidence with random probability ρ s * Beta(a, b). We set b = 40 and compute a so that the mean sampling fraction E½r s � ¼ f r takes some desired (fixed) value. We investigate various f ρ and show that premature declarations are guaranteed in Fig 5A and 5B. The impact of ρ s is especially large when under-reporting leads to early but false sequences of 0 cases, which is additional to the bias from Eq (13). We present results in absolute time to showcase this effect.

Importation results in late declarations
The influence of imported cases on end-of-epidemic declarations, to our knowledge, has not been investigated in the literature. Repeated importations or migrations of infected cases are a common means of seeding and re-seeding local infectious epidemics. Failing to ascertain which cases are local or imported can significantly change our perception of transmission [9]. We assume that I s is the total count of local cases in our region of interest but that at time s there are also M s imported cases that have migrated from neighbouring regions. The total number of infected cases observed is C s = I s + M s as displayed in Fig 1C. We provide mathematical background on how importations are included within the renewal framework in Methods. We consider two hypotheses about our observed incidence data that reflect real epidemic scenarios.
Under the null hypothesis, H 0 , we assume that all cases are local and so we cannot disaggregate the components of C s . The alternative, H 1 , assumes perfect surveillance. Imported cases are distinguished from local ones under H 1 and their differing impact considered. The relevant elimination probabilities for each model are Since H 0 deems all cases local, it models C s as a renewal process with total infectiousness � L s ≔ P sÀ 1 u¼1 C sÀ u w s . Thus we use Theorem 1 to obtain the j th factor of z s j C s 1 as q j j C s Here c τ(j) and � l tðjÞ are sums of C u and � L u over window τ(j).
Under H 1 the imported cases are distinguished but all cases still contribute to ongoing local transmission [9,28]. Consequently, I s still adheres to a renewal transmission process and Theorem 1 yields the j th factor of z s j I s 1 as q j j I s 1 ¼ ð1 þ � L jþ1 � b tðjÞ Þ À aÀ i tðjÞ . We compare q j j I s 1 with q j j C s 1 directly to easily prove that Not accounting for migrations shrinks the elimination probability leading to false-negative or unnecessarily late declarations. This result makes no assumption on the dynamics for importation other than it possesses Poisson noise (so Theorem 1 is valid for C s ) and so holds quite generally (see Methods for further details). Case under-reporting and importation lead to premature and delayed declarations respectively. In A and B we binomially sample an empirical SARS 2003 incidence curve from Hong Kong with reporting probabilities drawn from a beta distribution with mean f ρ . In A we plot the elimination probability z s when surveillance is ideal i.e. there is no underreporting (red) versus when the under-reporting is unknown (blue). The difference in the 95% declaration times, denoted δt 95 , from these curves is in B. As f ρ decreases we are more likely to declare too early. In C and D we consider an empirical MERS-CoV 2014-5 incidence curve from Saudi Arabia with local and imported cases. We increase the mean fraction of imported cases to f � by adding Poisson imports with mean � and in C compute z s with (red) and without (blue) accounting for the difference between imports and local cases. The change in t 95 is given in D. As � and hence f � increase later declarations become more likely. We repeat our sampling or importation procedure 1000 times to obtain confidence intervals in A-D. As f � ! 0 or f ρ ! 1 we attain the ideal surveillance scenarios of no unreported or imported cases. We illustrate this phenomenon using empirical MERS-CoV data from Saudi Arabia [25] in Fig 5C and 5D. Here repeated importations occur as zoonotic transmissions from camels to humans. We show the increasing effect of importation by adding further (artificial) imports via a Poisson noise variable with mean � (see Eq (4)). The mean fraction of imported to total cases across the incidence curve is then f � . In Fig 5 we see that larger � promotes increasingly later declaration times. Note that we do not add any imports beyond the time of the last local case. If imports do come after this case, and seed no further local infections, which is likely for epidemics with large heterogeneity, then the t 0 assumed under H 0 will be later, and further exacerbate the bias from importation.

Discussion
Understanding and predicting the temporal dynamics of infectious disease transmission in real time is crucial to controlling existing epidemics and to thwarting future resurgences of those outbreaks, once controlled [21]. To achieve this understanding it is necessary to characterise and study the infectious disease throughout its lifetime. While many works have focussed on the growth, peak and controlled phases of epidemics (see Fig 2), relatively less research has examined how the tail of the outbreak shapes the kinetics of its elimination. For example, while much is known about how the basic and effective reproduction numbers influence the growth rate, peak size and controllability of an epidemic [5,35], the relationship between these numbers and the waiting time to epidemic elimination or extinction is still largely unexplored.
However, this relationship has important implications for public health policy. Knowing when and how to relax non-pharmaceutical interventions, such as travel bans or lockdowns, can be essential to effectively managing and mitigating the financial and social disruption caused by an outbreak as well as to safeguarding populations from the risk of future waves of the disease [1,2]. The ongoing COVID-19 pandemic for instance, which for some countries such as New Zealand involved import-driven resurgence after an initial declaration of elimination [32], provides a current and important example where such questions might soon become urgent.
Existing WHO guidance on deciding when an outbreak can be safely declared over takes a time-triggered approach. This means a fixed waiting time from the last observed case, usually based on the incubation period of the disease, is adopted [3]. While this approach is easy to follow, it does not change informatively between outbreaks of the same disease, even if the patterns of transmission are very different and cannot provide a measure of the reliability of this suggested declaration time. The few existing studies that have investigated this waiting-time problem [2,6,7] have all converged to what is known as an event-triggered solution in control theory [13].
Event-triggered decision-making has been shown to be more effective than acting at deterministic or fixed times for a range of problems including several involving the optimising of waiting or stopping times [14][15][16][17]. Moreover, because it directly couples decision making to observables of interest (in our case the incidence curve), it can better adapt or respond to changes in dynamics. Here we have attempted to build upon these realisations to better characterise the relationship between epidemic transmission and elimination. Specifically, we focussed on computing the probability at time s, z s , that the total future incidence of the epidemic is zero.
This probability is directly responsible for determining how quickly an epidemic will end. In fact, if an outbreak is defined as surviving if it can propagate at least 1 future infection then 1 − z s is precisely its survival function and is therefore rigorously linked to the future risk of cases. By taking a renewal process approach, we were able to derive an analytic and real-time measure of z s that explicitly depends on up-to-date estimates of the effective reproduction number (see Eq (9)). This result formed the main theorem of this paper and provided a clear and easily-computed link between epidemic transmission and elimination. To our knowledge, no previous work has directly obtained z s . Specifically, [2] computed a simpler and more conservative quantity while [6] and [7] approximated something similar via simulation, and so cannot provide real-time formulae. The event-trigger for declaring an outbreak over with μ% confidence is then the first time that z s crosses a threshold of m 100 .
To validate the correctness of our approach we considered several comparisons. We proved mathematically that our formulae recover the true elimination probability and event trigger given perfect knowledge of the epidemic. This provided theoretical justification for our approach (Eq (11)). We verified practical performance by benchmarking our method against the known (true) declaration times from simulated outbreaks of several infectious diseases (Fig 3) and on empirical data by directly comparing to the approach in [2] (Fig 4). We found that our method generated sensible and reasonably accurate estimates, given the fundamental difficulties of inferring R s at low incidence. Integrating our method with newly developing approaches that improve on R s estimates in these low data conditions [32], should further enhance performance and forms part of our future work. Figs 3 and 4 and S1 Fig also explained why time-triggered methods, such as the existing WHO guidelines, can be unreliable or deceptive. Replicate epidemics driven by the same timeseries of reproduction numbers can engender significantly different relative declaration times Δt 95 . This variability exists even if R s is known perfectly (i.e. when we have Dt � 95 ). As no single, fixed time can reasonably approximate this distribution, time-triggered approaches are necessarily performance limited. Moreover, we can never guarantee the confidence in such a declaration because z s and z � s also vary considerably for epidemics of the same disease even under identical transmission dynamics. These issues will only worsen with the additional noise deriving from non-ideal surveillance.
Exploring non-ideal surveillance noise and rigorously assessing its impact on the tail dynamics of epidemics was the main motivation for developing our method. Consequently, we investigated two prevalent and potentially dominant sources of noise in surveillanceunreported and imported cases [9,26]. While both [6] and [7] looked at the effect of constant under-reporting on declarations, general insight into the more realistic time-varying case is lacking. Further, the influence of importation on the epidemic tail has, to our knowledge, not yet been examined. By adapting z s to various surveillance hypotheses we proved two key results and developed a flexible framework for incorporating and analysing the influence of other related noise sources.
First, we showed that any type of random under-reporting will precipitate early declarations, which worsen as the fraction of unreported cases increases (Eq (13)). Second, we found that any random importation process will lead to late declarations that become more delayed as the fraction of imports increase (Eq (15)). Moreover, under-reporting and importation processes can respectively, cause falsely early and late starts (i.e. t 0 in our notation) to the sequence of zero incidence days that are used to determine declaration times, thus exacerbating the bias from each noise source. We illustrated the biases of both unreported and imported cases using empirical data (Fig 5), clarifying how the epidemic tail is sensitive to these imperfections in the collection or reporting of incidence data.
The theoretical framework we employed to reveal these biases can also help generate insight into other noise sources and surveillance hypotheses. It provides a scheme for investigating case misidentification, asymptomatic transmission and reporting delays, among others. The first occurs when cases of a co-circulating diseases are misattributed to the disease of interest due to overlapping symptoms and is common among influenza-like illnesses [8]. The disease of interest is then effectively over-reported, which may be modelled as a false importation process with M s as the over-reported cases in Eq (4), but past M s counts do not contribute to I s (and so are not in its total infectiousness term). It then follows that declaration times will likely be delayed.
Asymptomatic transmission and reporting delays are effectively types of under-reporting. In the first, the cases observed at any time represent only the symptomatic fraction of actual infections. Consequently, a formulation similar to Eq (3) applies, with variations depending on whether the asymptomatic proportion has the same or a different serial interval distribution [36]. The result is that end-of-epidemic declarations that do not account for asymptomatic transmission will potentially be early. Reporting delays act as time-varying under-reporting fractions, which especially degrade the more recent case days [10]. While the model required is more involved than Eq (3), since the declaration times largely depend on cumulative case counts, they are also likely to be premature.
While our method presents a clean framework for estimating the lifetime of an epidemic and investigating surveillance noise sources, it has several limitations. It commonly assumes that the serial interval distribution is known [12]. However, if surveillance is poor and changes to the serial interval (e.g. contractions due to interventions [30]) are not measured or included in computing z s then declaration times might be biased. Moreover, we neglect transmission heterogeneity, are necessarily hindered by the difficulty of estimating reproduction numbers at low incidence and do not consider interactions among noise sources. While these factors could limit the accuracy of our predicted declaration times, many can be accommodated as future extensions. We can incorporate heterogeneity by using negative binomial renewal models [1], improve on low incidence estimates by capitalising on specialised methods [32] and extend the models in Fig 1 to examine mixed noise types.
A key contribution of this work has been clarifying and highlighting how realistic imperfections in the collection or reporting of incidence data can significantly influence and bias the tail dynamics of an epidemic. Heightened surveillance should therefore be sustained even in periods of negligible incidence. Intensive testing and tracing is especially essential as it provides a means of measuring and compensating for case under-reporting, which we found to be among the strongest sources of bias. Maintaining good quality screening and geodata is also important since having accurate case origins can prevent misidentification, which is a main cause of unknown or unrecognised imports. These sentiments echo many issues currently being faced across the COVID-19 pandemic [19,37].
Real-time assessments of epidemic dynamics are crucial for understanding and aptly responding to unfolding epidemics [21]. We hope that the analytic approach developed here will serve as a useful tool for gaining ongoing insight into the tail dynamics of an outbreak, motivate the adoption of more event-triggered decision making and provide clear impetus for improving and sustaining surveillance across all phases of an epidemic. Our method is available (in R and Matlab) at https://github.com/kpzoo/End-of-epidemic-declarations. Our future work aims to develop this tool from its current form as a passive means of understanding and uncovering biases to an approach that can actively infuse additional data streams (e.g. case ascertainment ratios) to compensate for these biases in end-of-epidemic declarations.
Supporting information S1 Fig. Event and time-triggered declarations. We compare 95% event-triggered declaration times to the WHO time-triggered equivalent for Ebola virus disease over 1000 simulated epidemics. Left panels show the true (Dt � 95 ) and estimated (Δt 95 ) declaration times (based on Eqs (10) and (9)) relative to the time of the last observed case. The significant variability in both, which reflects the different shapes of possible epidemic curves with the same reproduction number profile (R s ) indicates why time-triggered approaches such as the WHO one [3] (Δt WHO , which is based on 42 days plus the time to recovery) can be insufficient. The error between the true and estimated times (δt 95 ) and the serial interval and reproduction number profile used are shown in the right panels.
(EPS) 1 Strictly, it is the time elapsed since the last case has recovered or died. However, as this additionally delay is not informative, it does not invalidate or alter any of the results or statements in this paper and so we speak in terms of the last case time for simplicity.) but not the confidence in declaration.