Fig 1.
EpiFusion particle filter structure, with the particle states per unit time (green outlined boxes) driven by the parameters of the process model, evaluated at resampling steps by epidemiological and phylodynamic observation models against case incidence and phylogenetic tree segments respectively per unit time (orange and purple circles). All models in this manuscript use daily time units.
Table 1.
A. Explanation of the data points used by EpiFusion; B. the key parameters of the EpiFusion particle filter.
Gamma, phi and psi are fit by MCMC, either as constant values over time or in epochs, with change times and interval values either fixed or fitted. Beta must vary over time and can be fit in one of four ways: (i) a random walk within the particle filter, (ii) linear splines within the particle filter, (iii) MCMC fitting in epochs, with change times and interval values either fixed or fitted, or (iv) MCMC fitting of the parameters of a logistic function that defines beta over time; C. Other key terms in the EpiFusion particle filtering algorithm, in order of appearance in the text.
Fig 2.
Example ReMASTER epidemic simulation and resulting data used for the EpiFusion program (specifically, according to the “Baseline Scenario” described in the section “Scenario Testing”).
(a) True number of people infected over time, from which (b) weekly reported case incidence counts and (c) a phylogenetic tree of simulated samples were derived based on given sampling rates. Plots of the other simulated datasets are provided in S2–S4 Figs.
Table 2.
Statistics used to evaluate model performance under scenarios 1, 2, and 3 for analyses using case incidence only (epi), phylogenetic tree only (phylo), and both data sources combined (combo).
The best or joint-best result for each statistic in each scenario is highlighted in bold. Trajectory RMSE: root-mean-squared error. Calibrated Trajectory Coverage: proportion of the true trajectory that falls within the 95% HPD interval, scaled by 0.95. Scaled HPD Width: mean width of the 95% highest posterior density interval, scaled by the true value. Continuous Ranked Probability Score: mean CRPS across the trajectory time series. Brier Score: classification accuracy for the transmission phase (Rt above or below 1). Further details on the calculation of these statistics are included in the Supplementary Information (S5 Table).
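The statistics defined above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the exact calculations are given in S5 Table) and rests on several assumptions: posterior draws are held in a hypothetical array `post` of shape (samples, time points), the RMSE is taken between the posterior mean and the truth, the HPD width is scaled pointwise by a strictly positive true value, and the CRPS uses the standard sample-based estimator. The function names are hypothetical.

```python
import numpy as np

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the sorted 1-D samples."""
    s = np.sort(np.asarray(samples, dtype=float))
    n = len(s)
    k = int(np.ceil(mass * n))           # samples that must lie inside
    widths = s[k - 1:] - s[:n - k + 1]   # width of each candidate window
    i = int(np.argmin(widths))
    return s[i], s[i + k - 1]

def trajectory_metrics(post, truth, mass=0.95):
    """post: (n_samples, n_times) posterior draws; truth: (n_times,), assumed > 0."""
    post = np.asarray(post, dtype=float)
    truth = np.asarray(truth, dtype=float)
    bounds = np.array([hpd_interval(post[:, t], mass) for t in range(post.shape[1])])
    lo, hi = bounds[:, 0], bounds[:, 1]
    # Assumption: RMSE compares the posterior mean trajectory with the truth.
    rmse = np.sqrt(np.mean((post.mean(axis=0) - truth) ** 2))
    # Proportion of time points covered by the HPD interval, scaled by its mass.
    coverage = np.mean((truth >= lo) & (truth <= hi)) / mass
    # Assumption: width is scaled pointwise by the true value, then averaged.
    scaled_width = np.mean((hi - lo) / truth)
    # Sample-based CRPS: E|X - y| - 0.5 * E|X - X'| (O(n_samples^2) memory).
    term1 = np.mean(np.abs(post - truth), axis=0)
    term2 = 0.5 * np.mean(np.abs(post[:, None, :] - post[None, :, :]), axis=(0, 1))
    crps = np.mean(term1 - term2)
    return {"rmse": rmse, "coverage": coverage,
            "scaled_hpd_width": scaled_width, "crps": crps}

def brier_score_rt(rt_post, rt_truth):
    """Brier score for the event Rt > 1, using posterior mass above 1 as the forecast."""
    p = np.mean(np.asarray(rt_post, dtype=float) > 1.0, axis=0)
    outcome = (np.asarray(rt_truth, dtype=float) > 1.0).astype(float)
    return np.mean((p - outcome) ** 2)
```

Under these assumptions, a posterior that collapses onto the true trajectory gives an RMSE, scaled width and CRPS of 0 and a calibrated coverage of 1/0.95.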
Fig 3.
Comparison of median log likelihoods generated by EpiFusion (green) and a birth-death skyline model implemented in the BEAST2 (50) package BDSky (14) for the parameters β, γ and ψ. The true value of the parameter is marked by the blue vertical line.
Fig 4.
(a) Proportion of replicates that capture the true value of the parameter within their HPD intervals (y-axis) for increasing credible mass alpha (x-axis), for the parameters: β infection parameter (green), γ recovery parameter (blue), φ case sampling rate (yellow) and ψ sequence sampling rate (orange). For the infection rate parameter β (which varies over time), the y-axis reflects the average proportion of the β trajectory captured in the HPD interval across all replicates. (b) Mean inferred value and 95% HPD interval of the parameter (y-axis) plotted against the true value of the parameter (x-axis). Because β varied over time in the simulations and models, each replicate resulted in the inference of many βt values, so for clarity only a subset of 1000 values of βt is shown. For both graphs, the grey dotted line indicates the ‘perfect’ result: perfect calibration for (a) and perfect agreement between true and inferred parameters for (b).
Fig 5.
Inferred mean infection count trajectories from EpiFusion using only case incidence (orange), only the phylogenetic tree (purple) and both data types combined (green) (columns) for the three scenarios tested (rows). The true number infected over time is represented by the black line. 95%, 80% and 66% highest posterior density intervals are represented by increasingly dark shaded regions. Times of step-changes are marked by the vertical dotted lines for the step-change in sampling and transmission scenarios: a 10-fold increase in case and genomic sequence sampling rates on day 35 for the ‘Sampling’ step-change scenario, and a 3-fold increase in transmission rate on day 100 for the ‘Transmission’ step-change scenario.
Fig 6.
Inferred Rt from EpiFusion using only case incidence (orange), only the phylogenetic tree (purple) and both data types combined (green) for the three scenarios tested (rows). True Rt is represented by the solid black line. 95%, 80% and 66% highest posterior density intervals are represented by increasingly dark shaded regions. Times of step-changes are marked by vertical dotted lines: a 10-fold increase in case and genomic sequence sampling rates on day 35 for the ‘Sampling’ step-change scenario, and a 3-fold increase in transmission rate on day 100 for the ‘Transmission’ step-change scenario. An Rt of 1 is marked by the dashed horizontal line. The true Rt fluctuates at the end of the sampling step-change scenario due to very low prevalence as the outbreak ends.
Fig 7.
Rt trajectory RMSE, CRPS and Brier Score (y-axes) for case incidence only (orange), tree only (purple) and combined (green) EpiFusion approaches on scenarios with increasing noise (x-axes). For each of these metrics, a value closer to 0 reflects a better score. Noise is quantified as the standard deviation divided by the mean of the distribution from which the transmission or sampling rates were drawn. The general trend is shown by linear regression lines of the corresponding colour.
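The noise measure and trend lines described in this caption can be sketched as follows. This is an illustrative example only: the function names are hypothetical, the coefficient of variation is computed here from sampled rates rather than from the parameters of the generating distribution, and ordinary least squares via `numpy.polyfit` is assumed for the regression lines.

```python
import numpy as np

def coefficient_of_variation(rates):
    """Noise level: standard deviation divided by the mean of the rates."""
    rates = np.asarray(rates, dtype=float)
    return rates.std() / rates.mean()

def linear_trend(noise, score):
    """Least-squares line score = slope * noise + intercept."""
    slope, intercept = np.polyfit(np.asarray(noise, dtype=float),
                                  np.asarray(score, dtype=float), deg=1)
    return slope, intercept
```

A positive slope for RMSE, CRPS or Brier Score would indicate that performance degrades as noise increases, since lower values of all three metrics are better.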
Fig 8.
Estimated mean Rt and 95% HPD intervals for the three validation scenarios from EpiFusion (green), EpiNow2 (blue), BDSky (red) and TimTam (yellow).
Table 3.
Model Benchmarking.
Fig 9.
(a) Phylogenetic tree of Ebola virus sequences in Sierra Leone, consisting of a subclade of the MCC tree obtained from Dellicour et al. [50], with tips coloured by region at the first administrative unit level. (b) Weekly case incidence of Ebola virus disease in Sierra Leone obtained from Fang et al. [49], stratified by region. (c) Inferred median effective reproduction number (solid line) of Ebola virus disease in Sierra Leone from an EpiFusion combined model. 95%, 80% and 66% highest posterior density intervals are represented by increasingly dark shaded regions. Two key dates in the epidemic are labelled: (i) declaration of a national state of emergency on August 6th 2014, and (ii) a national three-day quarantine beginning on September 19th 2014.
Table 4.
Comparison of the key characteristics of EpiFusion with those of the tools and literature referred to in this manuscript.
Rasmussen’11 denotes Rasmussen et al. (2011), which was referenced in the introduction. However, the model is not distributed as software, so we were unable to assess its computational efficiency (*). (BD: birth-death).