Using a latent Hawkes process for epidemiological modelling

Understanding the spread of COVID-19 has been the subject of numerous studies, highlighting the significance of reliable epidemic models. Here, we introduce a novel epidemic model using a latent Hawkes process with temporal covariates for modelling the infections. Unlike other models, we model the reported cases via a probability distribution driven by the underlying Hawkes process. Modelling the infections via a Hawkes process allows us to estimate by whom an infected individual was infected. We propose a Kernel Density Particle Filter (KDPF) for inference of both latent cases and reproduction number and for predicting the new cases in the near future. The computational effort is proportional to the number of infections making it possible to use particle filter type algorithms, such as the KDPF. We demonstrate the performance of the proposed algorithm on synthetic data sets and COVID-19 reported cases in various local authorities in the UK, and benchmark our model to alternative approaches.


Introduction
The novel coronavirus disease (COVID-19) has been declared a Global Health Emergency of International Concern with over 557 million cases and 6.36 million deaths as of 3 August 2022 according to the World Health Organization. In the absence of vaccines, countries initially followed mitigation strategies or countermeasures to prevent the rapid spread of COVID-19, such as social distancing, quarantine, mask wearing, and lock-downs.
A large number of studies have been carried out to understand the spread of COVID-19, forecast new cases and when the peak of the pandemic will occur, and investigate "what-if-scenarios". For example, Ferguson et al. [1] presented the results of epidemiological modelling looking at a variety of nonpharmaceutical interventions. Several compartmental models [2][3][4][5] using ordinary differential equations (ODE) have been proposed for modelling the spread of COVID-19. Various models using Hawkes processes [6][7][8][9][10][11][12], widely used to model contagion patterns, have been introduced as an alternative to ODE models. Others have used a Poisson autoregression model of the daily new observed cases [13] and a Bayesian model linking the infection cycle to observed deaths [14].
We introduce a novel epidemic model using a latent Hawkes process [15] with temporal covariates for modelling the infections. Unlike other Hawkes models, the Hawkes process is used as a latent, i.e. for modelling the actual unobserved infection cases. Observations, such as reported infection cases, are then modelled as random quantities driven by the latent Hawkes process. Other models that use the latent processes in epidemiological models (e.g. [14]) usually have time-aggregated counts of infections as latent process, i.e. the latent process works on a discrete scale. We propose using a Kernel Density Particle Filter (KDPF) [16,17] for inference of both latent cases and reproduction number and for predicting the new cases in the near future. It is feasible to employ particle filter type algorithms, like the KDPF, because the computational effort is linear to the number of infections. Modelling the infections via a Hawkes process allows us to estimate by whom an infected individual was infected. We demonstrate the performance of the proposed algorithm on synthetic data and COVID-19 reported cases in various local authorities in the UK. The methods [10,18] provide similar estimates of reproduction number to the proposed algorithm. The ability of our model to estimate individual latent cases and reveal epidemic dynamics provides an important advantage over other models.

Related work
The Hawkes process is a well known self-exciting process in which the intensity function depends on all previous events assuming infinite population that allow for parametric or nonparametric estimation of the reproduction number (that is, the expected number of infections triggered per infected individual). Hawkes processes have been widely used in numerous applications such as social media, criminology and earthquake modelling. In this section, we present the application of the Hawkes processes in the modelling of COVID-19.
First, we briefly review basic compartmental models and their connection with Hawkes process and COVID. The Susceptible-Infected-Recovered (SIR) and Susceptible-Exposed-Infected-Recovered (SEIR) models are the two basic compartmental epidemic models for modelling the spread of infectious disease [5,19]. The SIR model defines three classes of individuals: those susceptible to infection (S), those currently infected (I) and those recovered (R). The SEIR model involves an additional compartment (E) that models the exposed individuals without having obvious symptoms. For many diseases, including COVID-19, there is an incubation period during which exposed individuals to the virus may not be as contagious as the infectious individuals. A variant of the SEIR model called SuEIR was introduced by Zou et al. [5] for modelling and forecasting the spread of COVID. The SuEIR compared to SEIR has an additional compartment (u) that models the unreported cases. Estimates based on compartmental models can be unreliable, as they are highly sensitive to initial conditions and parameters such as transmission and recovery rates [8].
A stochastic formulation of SIR called Stochastic SIR [20] is a point process having events that are either the recovery times or the infection times of individuals with exponentially distributed recovery times. Rizoiu et al. [21] introduced the SIR-Hawkes process (also known as HawkesN), which is a generalization of the Hawkes process concerning finite population. They showed that the conditional intensity of the SIR-Hawkes process with no background events and exponential infectious period distribution is identical to the expected conditional intensity of Stochastic SIR with respect to the recovery period distribution. The Hawkes process with gamma infectious period distribution can approximate stage compartment models if the average waiting times in the compartments follow an independent exponential distribution [12,22].
Kresin et al. [7] claim that although the SEIR model is mostly used for COVID modelling compared to the Hawkes process, a Hawkes model offers more accurate forecasts. Specifically, they suggest a SEIR-Hawkes model in which the intensity of newly exposed cases is a function of infection times and size of the population. Chiang et al. [12] introduced a Hawkes process model of COVID-19 that estimates the intensity of cases and the reproduction number. The reported cases are modelled via a Hawkes process. The reproduction number is estimated via a Poisson regression with spatial-temporal covariates including mobility indices and demographic features. Based on the branching nature of the Hawkes process, Escobar [8] derived a simple expression for the intensities of reported and unreported COVID-19 cases. The key to this model is that at the beginning of a generation the infectious will either (1) be registered, (2) not be registered but continue being contagious, or (3) recover with fixed probabilities. However, we believe that the probability of remaining contagious and not being registered infectious should be a decreasing function of time and not fixed.
Garetto et al [6] proposed a modulated marked Hawkes process for modelling the spread of COVID-19 under the impact of countermeasures. Each mark corresponds to a different class of infectious individuals with specific kernel functions. Three classes of infectious are considered: symptomatic, asymptomatic and superspreader, for obtaining the average intensity function and the average total number of points up to a specific time. Symptomatic people are those who will develop evident symptoms and by extension they will be quarantined. Asymptomatic people are those who will not develop strong enough symptoms to be quarantined. Superspreaders are individuals who exert a high infection rate but do not get quarantined. The model estimates the reproduction number taking into account the amount of recourses employed by the health service to discover the infected population, the countermeasures, as well as the stages that all infectious go through: random incubation time, presymptomatic period, random disease period and random residual phase.
Koyama et al. [10] developed a discrete-time Hawkes model for estimating the temporally changing reproduction number, and hence detecting the change points via assuming a negative binomial distribution for the daily cases. Further analysis in [9,23] examined the daily death data to avoid the issues raised from the reported cases. Browning et al. [9] modelled the reported daily deaths using a discrete-time Hawkes process, where the cases are assumed Poisson distributed. They considered one fixed change point that breaks the period of analysis into two phases: the initial period where the virus is spreading rapidly and the period after the introduction of preventative measures. The model provides accurate predictions for shorttime intervals.
All the aforementioned stochastic Hawkes models use the Hawkes process for modelling either the reported infections or the newly exposed cases. Herein, we provide a novel epidemic model for the infections using a latent Hawkes process with temporal covariates and, in turn, the reported cases using a probability distribution driven by the underlying Hawkes process. Working on a continuous scale offers the inference of individual latent cases and reveals unobserved transmission paths of the epidemic. We apply particle methods for inferring the latent cases and the reproduction number and predicting observed cases over short time horizons. The simulation analysis shows that the estimated reproduction number and the intensity of latent cases depict the epidemic's development and capture the trajectory of cases.

Model
We introduce a novel epidemic model using a latent Hawkes process of infections that then trigger a process of reported infection cases.
We focus on an infinite homogeneous population and restrict our attention to an epidemic process over a horizon [T 0 , T), T 0 < T, in which we assume immunity to re-infection. This immunity is a reasonable assumption over the time scales we consider. We break the horizon [T 0 , T) into k subintervals T j ¼ ½T jÀ 1 ; T j Þ for j = 1, .., k with T k = T. We assume that the epidemic is triggered by a set of infectious individuals at the beginning of the process, the times of their infections denoted by a finite set H 0 � ðÀ 1; T 0 Þ.
The epidemic process is seen as a counting process N(t) with a set of jump times T N ¼ ft 0 < t 1 < t 2 < . . .g and intensity given by being the set of all infection events prior to time t. The kernel h(t − t i ) represents the relative infectiousness at time t of an infection at time t i . We assume that the transition kernel h is a probability density function with non-negative real-valued support: h: [0, 1)![0, 1) and The process R(t) represents the instantaneous reproduction number that is the average number of newly infected people that each infected individual would infect if the conditions, such as interventions and control measures for restriction of epidemic, remained as they were at time t [18]. It is natural to see the reported infections as a counting process M(t) with a set of jump times T M ¼ ft 1 < t 2 < . . . < t m g and intensity of observed cases at time τ as a function of the times of infection up to time τ, namely for τ > 0, where β is the expected number of observed cases per infected individual at time τ (also known as ascertainment rate). The transition kernel g(τ − t i ) represents the relative delay between the infection at time t i and the time at τ the infection is detected. Similar to the transition kernel of latent cases h, we specify the transition kernel of observed cases g to be a probability density function with non-negative real-valued support. M(t) is usually only observable in daily or weekly aggregates. We will use T n as aggregation intervals and let Y n be the number of reported cases in T n . We model Y n via a distribution G having mean μ n equal to the expected observed cases in T n given by The usual options of G are Negative Binomial (NB) [10,24] and Poisson distribution [9,18]. We model the reproduction number R(t) as a stepwise function having as many weights as the number of subintervals, that is, where {R n } is assumed to be a Markov process. Usually, a random walk on a logarithmic scale [25] or a normal scale [10] is imposed as a prior on the weights {R n }.
The model is described by the equations: Y n � G with mean EðY n Þ ¼ m n ; n ¼ 1; ::; k ; ð3Þ fR n g k n¼1 is a Markov process ; ð5Þ

Inference algorithm
Given a set of observed infections {Y 1 , .., Y k }, we seek to infer the counting process N(t) and the reproduction number R(t). The proposed epidemic model described by the Eqs (2)-(6) is seen as a state-space model with a latent state process {X n : 1 � n � k} and an observed process {Y n : 1 � n � k}. Each hidden state X n consists of the reproduction number's weight R n associated to T n and the set of latent cases S N n falling into T n . The time-constant parameters are the parameters associated with the distribution G and the prior imposed on the weights fR n g k n¼1 . We apply a KDPF [16,17] for inferring the counting process N(t), the weights fR n g k n¼1 , and the time-constant parameters. We consider that the ascertainment rate β is given.
We focus on illustrating the performance of our model on COVID-19. As the COVID-19 reported cases are subject to erroneous observation and for the data we observe that the sample variance is larger than the sample mean, we model the observed cases Y n via a negative binomial distribution (NB) with mean μ n and dispersion v > 0. We use the following form of the negative binomial distribution with mean E(Y n ) = μ n and variance var(Y n ) = μ n (1 + vμ n ). Before we discuss the KDPF, we define the transition kernels of the observed and latent cases and the prior on weights fR n g k n¼1 for COVID-19. We also suggest a simple method to initialize H 0 . Transition kernels. The dynamics of the latent and observed cases are determined by the generation interval (GI) and incubation period (IP) [26]. The generation interval is the time interval between the time of infection of the infector (the primary case) and that of the infectee (the secondary case generated by the primary case). The incubation period is the time interval between the infection and the onset of symptoms in a specific case. Zhao et al. [27] assume that the GI and IP follow a gamma distribution. They infer that the mean and SD of GI are equal at 6.7 days and 1.8 days and those of IP at 6.8 and 4.1 days by using a maximum likelihood estimation approach and contact tracing data of COVID-19 cases without considering COVID-19 variants. We follow the same assumption for the GI (namely, the transition kernel of latent cases is a gamma density with a mean at 6.7 days and SD of 1.8 days). We model the time interval between the observed time and actual time of infection as a gamma density with a mean at 8.8 days and SD of 4.1 days (that is, the transition kernel of observed cases is a gamma density having mean equal at 8.8 days and SD of 4.1 days). For the transition kernel of the observed events, we adopt the values inferred by Zhao et al. [27] for IP with a slightly increased mean to consider the necessary time for conducting a test against COVID-19. Fig 1 illustrates the transition kernels. We also conduct a sensitivity analysis in the mean of GI and the period between observed and actual infection times using the real cases in the local authority Ashford (19/12/2021-9/4/2022) [28] available in S1 Appendix.
Set of infectious at the beginning of the process, H 0 . We adopt a heuristic approach to initialize H 0 . The transition kernel of latent cases illustrated in Fig 1 shows that a latent case at t w can influence the latent intensity at t if t w has occurred at most 21 days before t. Otherwise, the influence of t w is negligible. Therefore, as the history of the process, we consider the latent cases of 21 days/3 weeks before the beginning of the process. The mode of the transition kernel of the observed cases equal to 6.216 demonstrates that an event is most likely to be observed 7 days after the actual infection time. Considering the observed cases are daily, we initialize the history of latent case, H 0 by uniformly spreading on the day −i the number of cases occurred on the day (−i + 7) times 1/β. In simulation analysis, we propose initialization of H 0 when we deal with weekly reported cases.
Imposed prior on weights fR n g k n¼1 . A geometric random walk (RW) is imposed as prior on weights fR n g k n¼1 : We impose a gamma prior on the noise of RW � n with equal shape and rate at d. This induces that the weight R n is gamma distributed with a mean equal to R n−1 and standard devia- . The stronger fluctuations in the observed data, the more flexible modelling we need. Smaller values of d have higher standard deviation and lead to a wider range of possible values of R n increasing the flexibility of the model.
Kernel Density Particle Filter. We apply a KDPF (Algorithm 2) for inferring the counting process N(t), the weights fR n g k n¼1 , and the time-constant parameters. The time-constant parameters for modelling COVID-19 infections are the shape d of the noise � n and the dispersion parameter v. The KDPF builds on the auxiliary particle filter (APF) [29][30][31] by adding small random perturbations to all the parameter particles to reduce the sample degeneracy by modelling the timeconstant parameters as random quantities and their posterior via a mixture of normal distributions. We assume independence among the time-constant parameters, and, following Sheinson et al. [16], we use logarithms for the time-constant parameters, as they have positive support: The posteriors p(log d n+1 |Y 1:(n+1) ) and p(log v n+1 |Y 1:(n+1) ) are smoothly approximated via a mixture of normal distributions weighted by the sample weights w jn given by where N ðm; s 2 Þ is a Gaussian pdf with mean μ and variance σ 2 . The KDPF uses a tuning parameter Δ 2 (0, 1] and two quantities as a function of that parameter: The parameter Δ is typically taken to be between 0.95 and 0.99 for reducing the chance of degeneracy [16,32].
The mean values and the variances of the posteriors of time-constant parameters are defined as follows [16,32]: Following Sheinson et al. [16], we define the initial densities of parameters d and v to be log-normal: The transition densities of the timeconstant parameters are given by The initial density of the hidden process is given by while the transition density is given by f ðx n jx 1:ðnÀ 1Þ ; H 0 ; d; vÞ ¼ PðS N n jR n ; S N 1:ðnÀ 1Þ ; H 0 ÞPðR n jR nÀ 1 ; dÞ: Sampling the latent cases. We sample the latent cases S N n falling into the subinterval T n by applying Algorithm 1, which is a simulation procedure based on the branching structure of the Hawkes process [15]. The proposed algorithm is a superposition of Poisson processes in the interval T n ; the descendants of each latent event at t i form an inhomogeneous Poisson process with intensity . This induces that: • The number of events n i triggered by an event at t i in the interval T n is Poisson distributed with parameter hðs À t i Þds: • The arrival times of the n i descendants are t i + E i with E i being iid random variables with pdf the truncated distribution The computational cost of Algorithm 1 is linear to the number of infections falling into (η + 1) consecutive subintervals, that is Oð P n v¼nÀ Z jS N v jÞ, with η being the number of former subintervals that influence the latent cases falling into T n determined by the transition kernel of latent cases. The O-notation denotes the asymptotic upper bound [33]. Algorithm 1 Sample S N n jS N 1:ðnÀ 1Þ ; H 0 ; R n 1: Input: S N 1:(n¡1) , H 0 , R n 2: Initialize an empty queue: v=n¡´w ith´being the number of former subintervals we consider (the value of´is determined by the transition kernel of latent cases). 4: while Q t is not empty do

5:
Remove the first element t i from Q t .

6:
Draw the number of events n i triggered by an event at t i from a Poisson distribution h(s ¡ t i )ds that is the average number of offsprings generated by an event at t i in T n .

7:
Generate n i events from the truncated distribution h(t) in [max(t i ; T n¡1 ); T n ), and add the new elements to S N n and the back of queue Q t . 8: end while 9: Return S N n .
Who infected whom. The Hawkes process is an excellent option for modelling the evolution of an epidemic due to its mutually exciting nature, making it feasible to estimate by whom an infected individual was infected. Bertozzi et al. [11] describe how we can infer the primary infection i that triggered a secondary infection j using a self-exciting branching process. The parent of each infection j falling into T z is assumed to be sampled from a multinomial distribution parameterized by π j , where p j ¼ fp ji g i2h j with being the probability of secondary infection j having been caused by primary infection i, where and η the number of former subintervals that influence the latent cases falling into T z determined by the transmission kernel of latent cases (η = 21 days for COVID -19). Alternatively, by recording the parent of each latent infection at step 7 of Algorithm 1, the proposed model can show the branching structure of the process. This approach increases the computational complexity of the algorithm, as more memory units will be required. There is a set of model parameters, including the ascertainment rate β, the transition kernels of latent and observed cases, which we consider as given. The set of infectious at the beginning of the process, H 0 , is described applying the heuristic approach described above. We rely on the Bayesian paradigm for regularizing the parameters for inference.
Fixed-lag smoothing densities. As the resampling step leads to path degeneracy, it is difficult to obtain a good approximation of the smoothing density p(x 1:T |y 1:T ) for large T via SMC. Therefore, we use SMC to sample from the fixed-lag smoothing densities with lag L. Resampling results in replicating samples, and in the long run results in a lack of diversity called particle degeneracy [34]. We apply the multinomial resampling step when the Effective Sample Size (ESS) is less than the 80% of the number of particles, to avoid unnecessary resampling steps. Algorithm 2 Kernel density particle filter for j in 1 : N do (i) sample index i j from a multinomial distribution with probabilities g n+1 (ii) ¹ X j n = X i j ;n (iii) g j;n+1 = 1 end for end if 11: Regenerate the fixed parameters: for j in 1 : N do To draw a sample from P ³ X 1:n+1 jY 1:n+1´. We do resampling with weights fw j;n+1 g N j=1 g if resampling was performed at step 10. Otherwise;we do resampling with weights k j;n+1 /~j ;n+1 g j;n+1 . 16: end for

Simulation analysis
We carried out a simulation study on synthetic data to illustrate the performance of the KDPF (Algorithm 2) for inferring the intensity of latent cases, the reproduction number and the time-constant parameters.
Two different scenarios illustrated in Fig 2 were simulated as follows: • We deal with 16 hidden states fX n g 16 n¼1 . Each state X n is associated with the latent cases falling during the week T n and the parameter R n associated with that week. We infer the latent intensity λ N (t) and the weights fR n g 16 n¼1 as well as the weekly latent cases via the particle sample derived by drawing samples from the smoothing density with lag equal to 4. Figs 3 and 4 illustrate the estimated latent intensity, the estimated weekly hidden cases and the estimated weights of the reproduction number for both scenarios using 40000 particles. We note that the 99% Credible Intervals (CIs) of the time-constant parameters include the actual values of the parameters. The simulation analysis shows that the KDPF approaches well the ground truth.
To confirm the convergence of posterior estimates of weights and weekly hidden cases concerning the number of particles (N), we find the associated Monte Carlo Standard Errors (MCSEs) that give a sense of the variability of particle mean per state. The MCSEs of the average of posterior means of weights and weekly latent cases are given by and MCSEðYÞ ¼ 1 16 where var(z) is the variance of z and Y i the aggregate latent cases in i th week. The MCSE verifies the convergence of posterior estimates concerning the number of particles (see Tables  1 and 2). Finally, we compare the performance of the KDPF (Algorithm 2), APF (Algorithm 3), bootstrap filter (BF) (Algorithm 4) and particle marginal Metropolis-Hastings (PMMH) (Algorithm 5) [35] for inferring the latent intensity λ N (t) and the reproduction number R(t) illustrated in a new simulation scenario C (see Fig 5). Scenario C concerns a process triggered by 661 infectious and generated similar to scenario A assuming that α = 0.5, b = 2, d = 15.11, v = 0.01, R 1 = 1.57, d min = 10, d max = 20, v min = 0.0001 and v max = 0.5. The time-constant parameters d and v are known for BF and APF. We used 10000 iterations of the PMMH sampler with a burn-in of 5000 iterations. We use APF using 50 particles as an SMC sampler. The   average acceptance ratio is about 0.1844 resulting in a Markov chain that mixes well. For the KDPF, Δ was set to 0.99. We find the Average Absolute Error (AAE) and the Root Mean Square Error (RMSE) of the computed estimates defined in Forecasting. Table 3 shows the errors related to KDPF, APF, BF and PMMH for scenario C. The errors associated with KDPF are comparable to those obtained using BF and APF for which the time-constant parameters are known. The performance of KDPF compares well with PMMH, having the advantage that it is a more computationally efficient algorithm than PMMH.   For each particle j, we calculate an estimate of X j;n+1 called~j ;n+1 by drawing a sample from P (X n+1 jXn ; H 0 ; d):

j=1~j
;n end for

Real data
We apply the KDPF (Algorithm 2) to real cases in the local authorities: Leicester (4/9/2021-24/12/2021) [36], Kingston upon Thames (11/12/2021-8/4/2022) [37] and Ashford (19/12/ 2021-9/4/2022) [28] available from the government in the UK. Fig 6 illustrates the daily and weekly observed cases in the local authorities. We deal with 16 hidden states fX n g 16 n¼1 and 16 subintervals fT n g 16 n¼1 ; each subinterval corresponds to the duration of one week. We infer the latent intensity λ N (t), the reproduction number R(t), and the weekly and daily latent cases via the particle sample derived by drawing samples from the smoothing density with lag equal to 4. We demonstrate that the proposed model can be applied to predict the new observed cases over short time horizons.
We assume that the initial reproduction number during the first week is uniformly distributed over the interval from 0.5 to 2. Our initialization includes the 90% Confidence Interval published from the government in the UK: 0.9-1.1 on 4/9/2021 and 11/12/2021, 1-1.2 on 19/ 12/2021 [38]. We also assume d min = 1, d max = 10, v min = 0.0001 and v max = 0.5. Figs 7-9 show the estimated latent and observed intensity, the estimated weekly and daily hidden cases, the estimated reproduction number and the time-constant parameters in the local authorities. We illustrate the intensity of observed cases, approximating via Eq 1. We note that the estimated latent intensity and the estimated latent cases are in agreement with the reported cases. According to the analysis, the instantaneous reproduction number R(t) depicts the pandemic's development and capture dynamics. For the COVID-19 pandemic, there is a maximum delay of 21 days between the reported and actual infection times, which provides information regarding the progression of the epidemic. As a result, estimates have become more uncertain towards the end of the horizon. To assess the performance of our algorithm, we compute the mean absolute percentage error (MAPE) of the computed estimate of weekly observed cases (see Algorithm 6): where Y i andŶ i are the true and estimated weekly observed cases via the posterior median in week i, respectively. The metric is 1.46%, 1.08% and 2.09% for Ashford, Leicester and Kingston upon Thames. Fig 10 shows the estimated weekly observed cases for the local authorities. The analysis demonstrates that our algorithm provides a good approximation of the weekly reported cases. (c) wp min :n(i) = X ¤ 1:n ,P (Y1:njdi; vi) =P (Y1:njd ¤ ; v ¤ ). Otherwise, di = di¡1, vi = vi¡1, X1:n(i) = X1:n(i ¡ 1),P (Y1:njdi; vi) =P (Y1:njdi¡1; vi¡1).

Algorithm 6
Estimating the weekly observed cases 1: for n = 1; ::; 16 do 2: for j = 1; ::; N do 3: Calculate the mean of observed cases in the interval Tn denoted by ¹ j;n . 4: Y j;n » NB(¹ j;n ; v jn ). 5: end for 6: Use the sample fY j;n g N j=1 to find the posterior median and the 95% CI of the estimated observed cases in Tn . 7: end for We compare the proposed algorithm with two methods of estimating the reproduction number. The method suggested by Cori et al. [18] (EpiEstim) estimates the reproduction number from incidence time series using a Bayesian framework with a gamma distributed prior imposed on the reproduction number. An alternative method suggested by Koyama et al. [10] is a state-space method for estimating the daily reproduction number from a time series of daily reported infections using a random walk prior to the reproduction number and log-normal distribution as the distribution of the serial interval (SI). We assume that the mean and standard deviation of the SI distribution is at 6.9 days and 5.6 days following Zhao et al. [27]. We apply EpiEstim by using the gamma and log-normal distribution as the distribution of SI. Both choices lead to identical results. Fig 11 shows the weekly average of daily estimates of the reproduction number via posterior median derived by the method of Koyama et al. [10] and the posterior medians of R(t) given by EpiEstim and the proposed algorithm following the course of the pandemic. The method of Koyama et al. [10] and EpiEstim provide similar estimates to those of Algorithm 2 most of the time. Koyama et al. [10] and EpiEstim do not build a delay between reported and actual infection time in their models, which is why there are variations in their estimations compared to our algorithm. Therefore, the reproduction number given by EpiEstim responds later to changes compared to our estimation. Koyama et al. [10] shows a bit less of a time lag, which we conjecture to be due to it working with daily reproduction numbers and cases (which are being shown averaged in Fig 11). In the first week, the estimates of Koyama et al. [10] and EpiEstim have essential higher values than one of the proposed algorithm due to different assumptions about the initialization of the epidemic.
We also compare the estimated rate of latent cases λ N (t) and observed cases λ M (t) with the estimated daily number of events derived by Koyama et al. [10]. Fig 12 shows that the expected daily number of events is almost identical to λ M (t) and in agreement with λ N (t) after the end of the 3rd week. The differences in the first three weeks are due to different initializations of the methods.
Forecasting. Using the proposed model, it is also possible to predict the number of new observed cases in the near future, by fitting the model with data up to week T k and forecasting cases for the week T kþ1 using Algorithm 7. To analyse the performance of this algorithm, we conduct a rolling-window analysis and predict the observed cases in weeks T 11 À T 17 . Table 4 shows the estimated numbers in the local authorities by applying Algorithm 7 and the method introduced by Koyama et al. [10], assuming that the reproduction number remains at the value obtained for the last day. Table 5 shows the metrics MAPE and AAE of the estimated cases via posterior median. The empirical coverage probability of our 80% CIs is about 86%. Our estimates are similar to those given by Koyama et al. [10] most of the time.

Discussion
In this paper, we introduce a novel epidemic model using a latent Hawkes process with temporal covariates. Unlike other Hawkes models, we model the infections via a Hawkes process and the aggregated reported cases via a probability distribution G with a mean driven by the underlying Hawkes process. The usual options of G are Negative Binomial and Poisson distribution. We propose a KDPF for inferring the latent cases and the instantaneous reproduction number and for predicting the new observed cases over short time horizons. We demonstrate the performance of the proposed algorithm on COVID-19.
The analysis of synthetic data shows that KDPF compares well with PMMH, having the advantage that it is a more computationally efficient algorithm than PMMH. We also demonstrate that our predicted new cases, and our inference for the latent intensity, the daily and weekly hidden cases are consistent with the observed cases in various local authorities in the UK. The simulation analysis shows that the proposed algorithm provides comparable estimates of observed case fluctuations compared with those of Koyama et al. [10]. The method of Koyama et al. [10] and EpiEstim provide similar estimates of the reproduction number to the proposed algorithm.
The simulation analysis shows that working with daily reported infections leads to better Effective Sample Sizes using a smaller number of particles, as the data spikes are reduced.
According to Cori et al. [18], the estimates of the instantaneous reproduction number are expected to be affected by the selection of the time window size. Large sizes result in more smoothing and reductions in statistical noise, whereas small sizes result in faster detection of transmission changes and more statistical noise. They suggest an appropriate way of choosing the time window size. We have selected a weekly time window to analyse the real data in line with Cori et al. [18].
Uncovering disease dynamics and tracing how and by whom an infected individual was infected is challenging due to unobservable transmission routes [39,40]. Modelling the infections via a Hawkes process allows us to model infection dynamics.
Isham and Medley [41]; Wallinga et al. [42] contend that it is necessary to account for individual heterogeneities while modelling the transmission of an infectious disease. Individuals vary in their tendency to interact with others; personal hygiene is a key factor in the propagation of diseases; individuals' community structure and location might be significant in spreading epidemics. The proposed epidemic model can be viewed as a turning point in deriving epidemic models that consider individual heterogeneities and provide insight into underlying dynamics that is the subject of our future work. Future work also considers the inference of ascertainment rate (β), using various transition kernels for modelling the latent and reported infection cases, as well as more sophisticated ways for initializing the set of infectious triggering the epidemic process, H 0 .