Conceived and designed the experiments: JL NMF. Performed the experiments: JL JRE IMH SC SL NMF. Analyzed the data: JL JRE IMH SC NMF. Contributed reagents/materials/analysis tools: JL SC NMF. Wrote the paper: JL JRE IMH SC SL NMF.
The authors have declared that no competing interests exist.
Rapidly identifying the features of a covert release of an agent such as anthrax could help to inform the planning of public health mitigation strategies. Previous studies have sought to estimate the time and size of a bioterror attack based on the symptomatic onset dates of early cases. We extend the scope of these methods by proposing a method for characterizing the time, strength, and also the location of an aerosolized pathogen release. A back-calculation method is developed allowing the characterization of the release based on the data on the first few observed cases of the subsequent outbreak, meteorological data, population densities, and data on population travel patterns. We evaluate this method on small simulated anthrax outbreaks (about 25–35 cases) and show that it could date and localize a release after a few cases have been observed, although misspecifications of the spore dispersion model, or the within-host dynamics model, on which the method relies can bias the estimates. Our method could also provide an estimate of the outbreak's geographical extent and, as a consequence, could help to identify populations at risk and, therefore, requiring prophylactic treatment. Our analysis demonstrates that while estimates based on the first ten or 15 observed cases were more accurate and less sensitive to model misspecifications than those based on five cases, overall mortality is minimized by targeting prophylactic treatment early on the basis of estimates made using data on the first five cases. The method we propose could provide early estimates of the time, strength, and location of an aerosolized anthrax release and the geographical extent of the subsequent outbreak. In addition, estimates of release features could be used to parameterize more detailed models allowing the simulation of control strategies and intervention logistics.
Releasing highly pathogenic organisms into an urban population is a form of bioterrorism that could result in a large number of casualties. The first indication that a covert open-air release has occurred is quite likely to be individuals reporting for medical attention. If such an attack is suspected, then public health authorities would attempt to identify those individuals who have been infected in order to provide rapid treatment with the aim of reducing the risk of disease and death. Aiming treatment at too small an area might miss individuals infected further downwind and/or upwind, whereas limited treatment resources and serious side effects may rule out mass treatment campaigns covering large sections of the population. Our work provides a statistically rigorous way, first, to estimate where and when an aerosolized release has occurred and, second, to identify the most critically affected geographic areas. In order to use this statistical tool during an outbreak, public health workers would only need to collect the time of symptomatic onset and the home and work locations of early cases; recent weather information would also be required. Although the accuracy of the estimates is likely to improve as more cases appear, treating individuals based on early estimates might prove more beneficial since time would be of the essence.
If clinical cases of anthrax were detected, public health decision makers would want to estimate as soon as possible the features of the exposure event leading to the outbreak in order to determine who has potentially been exposed and should receive prophylaxis.
While the 2001 anthrax cases had been exposed through the US postal service
In this paper, we develop and evaluate the performance of a back-calculation method to characterize a release from the observation of the first few cases, population densities, meteorological conditions and population movements such as commuting data. We considered that the causative agent would have been identified from the first few cases and that the incubation period distribution of the disease would be known. We also explore the potential of our tool to inform the planning of mitigation strategies.
As a case study, we investigate a simulated release of
We developed a probabilistic model for an inhalational anthrax outbreak following an instantaneous point source release. This model has three components: 1) the dispersion of anthrax spores in the atmosphere; 2) the within-host dynamics of anthrax spores; 3) the spatio-temporal population dynamics. We did not take into account cutaneous or gastrointestinal forms of anthrax.
The airborne dispersion of anthrax spores following an instantaneous point source release was modeled using a puff model weighted by the viability of spore concentration
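To make the idea of a viability-weighted puff concrete, the following is a minimal sketch of a generic Gaussian puff with ground reflection and exponential loss of viability. It is not the dispersion model used in this study: the linear growth of the spread with travel distance and the coefficients `a_h` and `a_z` are hypothetical choices made for illustration, and exposure is assumed to occur at ground level. Only the default values of S, u, H, γ and b are taken from the reference scenario.

```python
import math

def puff_concentration(x, y, z, t, S=1e10, u=5.0, H=100.0, gamma=1.67e-4,
                       a_h=0.08, a_z=0.06):
    """Viable spore concentration (spores/m^3) at position (x, y, z) in metres,
    t seconds after an instantaneous point release at (0, 0, H).
    The wind blows along +x. a_h and a_z are HYPOTHETICAL spread coefficients."""
    if t <= 0:
        return 0.0
    travel = u * t                      # distance covered by the puff centre
    sigma_h = a_h * travel              # horizontal spread (assumed linear in travel)
    sigma_z = a_z * travel              # vertical spread (assumed linear in travel)
    viable = S * math.exp(-gamma * t)   # spores still viable after decay at rate gamma
    norm = viable / ((2.0 * math.pi) ** 1.5 * sigma_h ** 2 * sigma_z)
    along = math.exp(-((x - travel) ** 2) / (2.0 * sigma_h ** 2))
    cross = math.exp(-(y ** 2) / (2.0 * sigma_h ** 2))
    # Second term mirrors the source below ground to model reflection at z = 0.
    vertical = (math.exp(-((z - H) ** 2) / (2.0 * sigma_z ** 2))
                + math.exp(-((z + H) ** 2) / (2.0 * sigma_z ** 2)))
    return norm * along * cross * vertical

def inhaled_dose(x, y, b_per_min=0.03, dt=1.0, t_max=3600.0):
    """Expected number of viable spores inhaled at ground level (z = 0),
    integrating breathing rate times concentration with a midpoint rule."""
    b = b_per_min / 60.0                # breathing rate in m^3 per second
    steps = int(t_max / dt)
    return sum(b * puff_concentration(x, y, 0.0, (i + 0.5) * dt) * dt
               for i in range(steps))
```

Under these assumptions, the dose received at a given ward centroid is the quantity that would feed the within-host component of the model.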
The within-host dynamics model describes the biological processes of clearance, germination and growth of anthrax spores within a host and was adapted from published models
Finally, the dispersion model and the within-host dynamics model are integrated with population density and movement data to model the spatio-temporal dynamics of the outbreak. Full details of the model are provided in
We used a Markov Chain Monte Carlo sampling algorithm
To study the performance of our back-calculation method, we simulated 40 anthrax outbreaks due to a release at time
Param. | Description | Units | Value in the ref. scenario | Ref. |
γ | Decay rate | /sec | 1.67×10^{−4} | |
λ | Germination rate | /day | 1×10^{−5} | |
θ | Clearance rate | /day | 0.109 | |
r | Growth rate | /day | 11.7 | |
b | Breathing rate | m^{3}/min | 0.03 | |
k | Threshold for the number of bacilli before symptoms | bacilli | 10^{10} | |
- | Median period between germination and symptoms | days | 2 | |
- | Wind direction | BNG | (1,0) | |
u | Wind speed | m/s | 5.0 | |
T | Date of the release | days | 0 | |
S | Number of released spores | spores | 10^{10} | |
H | Height of the release | m | 100 | |
W | Source | - | W_{0} | |
BNG: British National Grid System.
The growth rate was calibrated in order to have a median period between first germination and symptoms of 2 days according to equation (1.5) in
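As a rough plausibility check on this calibration, one can ask how long a population of bacilli growing exponentially at rate r would take to reach the symptom threshold k. The sketch below assumes deterministic exponential growth starting from a single germinated bacillus; this is a simplification introduced here for illustration and is not necessarily the form of the equation referred to above.

```python
import math

def days_to_threshold(r=11.7, k=1e10, n0=1.0):
    """Time (days) for n0 bacilli growing as n0 * exp(r * t) to reach k bacilli.
    r is the growth rate per day; n0 = 1 is an illustrative assumption."""
    return math.log(k / n0) / r

print(round(days_to_threshold(), 2))  # close to the 2-day target
```

With r = 11.7 per day and k = 10^{10} bacilli, ln(10^{10})/11.7 is just under 2 days, which is consistent with the 2-day median used in the reference scenario.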
We used medians of posterior distributions for height and strength estimates. The posterior distribution of the time was sometimes multimodal with local minima for night periods (we simulated a release during the day) and the median could fall into one of those local minima. Hence, to conserve the day/night information provided by the posterior distribution, instead of the time median, we discretized its posterior distribution into day/night classes and chose the middle time of the mode class as the point estimate. To estimate the release location, we also used the mode of the posterior distribution. Root mean square errors (RMSE) were used to summarize the quality of estimates (see definitions in
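The day/night point estimate for the release time can be sketched as follows. The class boundaries (9 AM and 7 PM) are borrowed from the breaks used to display the date estimates and are an assumption here; the function names and the tie-breaking rule (earliest class wins) are likewise illustrative.

```python
import math
from collections import Counter

DAY_START, DAY_END = 9.0, 19.0  # assumed class breaks, in hours after midnight

def class_bounds(t_hours):
    """Return the (start, end) of the day or night class containing t_hours,
    where t_hours is measured from midnight of an arbitrary reference day."""
    day = math.floor(t_hours / 24.0)
    h = t_hours - 24.0 * day
    if h < DAY_START:   # early morning: night class that began the previous evening
        return (24.0 * (day - 1) + DAY_END, 24.0 * day + DAY_START)
    if h < DAY_END:     # daytime class
        return (24.0 * day + DAY_START, 24.0 * day + DAY_END)
    return (24.0 * day + DAY_END, 24.0 * (day + 1) + DAY_START)  # evening

def time_point_estimate(samples_hours):
    """Middle time of the modal day/night class of the posterior samples."""
    counts = Counter(class_bounds(t) for t in samples_hours)
    (lo, hi), _ = max(counts.items(), key=lambda kv: (kv[1], -kv[0][0]))
    return 0.5 * (lo + hi)
```

For example, `time_point_estimate([10, 12, 15, 18, 2, 30])` returns 14.0, the middle of the 9 AM to 7 PM class that holds four of the six samples.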
In order to understand how misspecification of aspects of the model would impact estimation accuracy, we reproduced the estimation procedure but deliberately misspecified either parameter values, data or the model structure. We examined 5 scenarios (see
Scenario | Modified model/data | Misspecification type | Description |
A | Symptoms onset dates of the simulated sample | Uncertainty on data | Onset date precision = 0.5 days rather than 1 hour. For cases developing symptoms between 9 AM and 9 PM, the registered time is 9 AM. For other cases, the registered time is 9 PM. |
B | Estimation | Parameter value of the within-host dynamics model | Median delay between germination and symptoms = 5 days rather than 2 days |
C | Simulation | Within-host dynamics model | Incubation period for low doses given by |
D | Simulation | Spore diffusion model | Spore concentration given by HPAC model |
E | Simulation | Population movements | Occasional movements added to daily commuting data |
Reference scenario dataset used, but the precision of the symptom onset date was 0.5 days rather than one hour. We assumed that the onset time of patients developing symptoms between 9 AM and 9 PM would be registered as 9 AM, and that of patients developing symptoms between 9 PM and 9 AM as 9 PM.
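The coarsening of onset times in this scenario can be sketched as below. Two details are assumptions made for the example rather than statements from the study: an onset at exactly 9 AM is assigned to the daytime class, and an onset after midnight is registered as 9 PM of the previous evening.

```python
from datetime import datetime, timedelta

def register_onset(onset: datetime) -> datetime:
    """Coarsen an onset time to half-day precision (9 AM or 9 PM)."""
    nine_am = onset.replace(hour=9, minute=0, second=0, microsecond=0)
    nine_pm = onset.replace(hour=21, minute=0, second=0, microsecond=0)
    if nine_am <= onset < nine_pm:
        return nine_am                    # daytime onset
    if onset >= nine_pm:
        return nine_pm                    # evening onset, same calendar day
    return nine_pm - timedelta(days=1)    # after midnight: previous evening (assumed)
```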
Reference scenario dataset used, but median delay between germination and symptoms of 5 days assumed in the back-calculation model rather than the 2 days used to generate the data.
In scenarios C to E, we simulated 40 outbreaks with three modified versions of our model and then used the reference scenario back-calculation model to fit these data:
Modification of the within-host dynamics model. Datasets were generated using the reference scenario model but with the within-host dynamics component replaced by the model proposed by Brookmeyer
Modification of the spore dispersion model. Datasets were generated using the reference scenario model but with the puff model of airborne dispersion replaced by the Hazard Prediction and Assessment Capability (HPAC) model
Modification of population movement assumptions. Instead of considering only commuting data, we considered that due to non-commuter travel, 10% of individuals could be exposed during the day in wards different from the ward where they would otherwise work. We considered that the pattern of these occasional movements was similar to the pattern of commuting movements. Hence, for 10% of cases, we considered that the original workplace was actually an occasional destination. The workplace of each of these cases was then drawn from the distribution of workplaces of people living in the individual's ward of residence. With this dataset, we also tested a modified version of the reference scenario back-calculation model by considering that people would have a small probability per day (set at 0.1) to travel away from their working place, with destinations being chosen based on ward sizes and distance to usual workplace (see
Past studies
In terms of helping to plan mitigation strategies, the first issue we examined was whether our estimates would allow the prediction of the outbreak extent from data on the first few cases. We also examined whether the model could accurately infer the geographical extent of the outbreak,
Finally, we compared the efficiency of the strategy described above with a “ring strategy” not requiring sophisticated analytical and computational methods. For this “ring strategy”, the wards considered at risk were those located in the neighborhood of the wards where the greatest number of cases had been detected (both workplaces and residences were included). We selected as neighbors all wards whose centroid lay within a given distance of the centroid of at least one of the J most affected wards.
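The ward selection rule of the ring strategy can be sketched in a few lines. The data structures (dictionaries keyed by ward identifier, planar coordinates in kilometres) and the alphabetical tie-breaking among equally affected wards are illustrative assumptions; `case_counts` is taken to already combine cases counted at their ward of residence and at their ward of work.

```python
import math

def ring_wards(centroids, case_counts, J, radius_km):
    """Return the set of wards whose centroid lies within radius_km of the
    centroid of at least one of the J wards with the most detected cases.

    centroids   : dict ward_id -> (x_km, y_km)
    case_counts : dict ward_id -> number of detected cases (home + work)
    """
    # J most affected wards; ties broken by ward identifier (an assumption).
    top = sorted(case_counts, key=lambda w: (-case_counts[w], w))[:J]
    selected = set()
    for ward, (x, y) in centroids.items():
        if any(math.hypot(x - centroids[s][0], y - centroids[s][1]) <= radius_km
               for s in top):
            selected.add(ward)
    return selected

centroids = {"A": (0, 0), "B": (1, 0), "C": (5, 0), "D": (10, 0)}
cases = {"A": 3, "C": 1}
print(sorted(ring_wards(centroids, cases, J=1, radius_km=2)))  # ['A', 'B']
```

Increasing J or the radius enlarges the targeted area, which is the trade-off between missed and unnecessarily treated individuals that the comparison explores.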
Although we simulated outbreaks following a release in a populated area, the set of parameters we used led to relatively small simulated outbreaks (average size = 27, range = 19–39, see the risk map in
The cross on the main map represents the location of all simulated releases. The inset map represents population-weighted ward centroids (crosses) and their Voronoi diagram (polygons).
The release location is represented by the distance to the real source. For the date estimates, breaks were set at 9 AM and 7 PM and counts are represented by bar heights rather than bar surfaces. For two outbreaks of scenario D, the source location estimated with 5 observed cases was further than 12 kilometers (14.3 and 18.2 km). For scenario E, the source location estimated with 5 observed cases was further than 35 kilometers for two outbreaks (57 and 68 km), the source location estimated with 10 observed cases was further than 35 kilometers for one outbreak (57 km), the source location estimated with 15 observed cases was further than 35 kilometers for two outbreaks (45 km and 117 km).
Scenario | Coverage (%) | | | RMSE1 (%) | | | Error range (%) | | |
# observed cases | 5 | 10 | 15 | 5 | 10 | 15 | 5 | 10 | 15 |
Reference | 95 | 95 | 95 | 71 | 45 | 32 | 0–215 | 1–137 | 3–83 |
A | 80 | 95 | 95 | 103 | 50 | 33 | 2–237 | 0–143 | 2–102 |
B | 95 | 95 | 95 | 71 | 45 | 32 | 0–212 | 1–141 | 2–83 |
C | 90 | 97.5 | 100 | 89 | 32 | 22 | 4–478 | 1–82 | 1–72 |
D | 80 | 87.5 | 85 | 141 | 72 | 54 | 2–427 | 1–232 | 3–201 |
E | 87.5 | 90 | 90 | 90 | 52 | 35 | 2–412 | 1–189 | 1–119 |
RMSE1 = Relative root mean square error (see definition in
The coverage is defined as the probability that the real value falls in the (2.5^{th}, 97.5^{th}) percentiles interval of the posterior distribution.
Range of the absolute relative error (%).
See
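The two summary statistics reported in the table can be computed along the following lines. Coverage follows the definition given above, estimated as the fraction of simulated outbreaks whose 95% posterior interval contains the true value. The formula used for the relative RMSE (root mean square of the relative errors) is an assumption, since its exact definition is given in the supplementary material; the linear-interpolation percentile is also an illustrative choice.

```python
import math

def percentile(sorted_vals, q):
    """Percentile q (0-100) of an ascending list, with linear interpolation."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def coverage(posteriors, true_value):
    """Fraction of posterior samples sets whose (2.5th, 97.5th) percentile
    interval contains the true value; one list of samples per outbreak."""
    hits = 0
    for samples in posteriors:
        s = sorted(samples)
        if percentile(s, 2.5) <= true_value <= percentile(s, 97.5):
            hits += 1
    return hits / len(posteriors)

def relative_rmse(estimates, true_value):
    """Root mean square of (estimate - truth) / truth (ASSUMED definition)."""
    return math.sqrt(sum(((e - true_value) / true_value) ** 2 for e in estimates)
                     / len(estimates))
```

With 40 simulated outbreaks per scenario, coverage moves in steps of 2.5 percentage points, which matches the granularity of the values in the table.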
With scenario A, although estimates of the timing of release were slightly modified (for estimates based on 5 cases, difference ranged 0–2 days), the bias was below 10 hours for 80% of the simulated outbreaks. The accuracy of the source location estimates was not affected (see
Similarly, increasing the median delay between spore germination and symptoms from 2 to 5 days in the estimation algorithm (scenario B) modified estimates of the time of release by 71 hours on average (compared to the reference scenario estimates) but it did not modify the performance of the method to characterize the release location. Misspecifying further the within-host dynamics model (scenario C) by simulating symptomatic onset dates with the incubation period distribution for low doses proposed by Brookmeyer and colleagues
When we used a different spore dispersion model (HPAC) to simulate outbreaks (scenario D), the source location estimates based on 5, 10 and 15 observed cases were somewhat (though not catastrophically) impaired (RMSE = 4.6, 1.2, 0.6 km respectively versus 2.0, 0.9, 0.7 km with the reference scenario). Release height and strength estimates were also biased (see
Finally, if some of the observed cases had been exposed during an occasional stay in a ward different from their home (for night release) or workplace (for day release) as in scenario E, our back-calculation method could fail to identify the actual source location. The release date estimates remained accurate (RMSE was about 9 hours for
Estimates of the height (A), strength (B) and location (C) of the source for outbreaks simulated with Scenario E, based on the first 5 (blue), 10 (red), 15 (green) observed cases. (D) Ratio of the number of individuals inaccurately targeted (IC) by the mitigation strategy for a risk threshold of 1/100,000 relative to the theoretical number of individuals at risk (%). Triangles indicate estimates for simulations in which there is no observed case infected during an occasional movement. Rectangles indicate estimates for simulations in which there is at least one observed case infected during an occasional movement. The horizontal and vertical lines indicate the true values. The third line is the bisector.
The comparison of the release date and outbreak size estimates provided by previously published methods with our results shows that the performance of the three methods was similar (see
Each box-plot represents the distribution (minimum, maximum, percentiles 2.5, 25, 50, 75, 97.5) of the relative bias of the predicted outbreak size, based on the first 5, 10 and 15 cases of the 40 simulated outbreaks per scenario.
Regarding mitigation policies, the key questions are how many people might be missed by a risk-targeted strategy guided by the model estimates, and how many would be inaccurately considered at risk. Both of these numbers varied substantially from one simulated outbreak to another (see
(A) Ratio of the number of individuals missed by the targeting mitigation strategy for a risk threshold of 1/100,000 relative to the theoretical number of individuals at risk. (B) Ratio of the number of individuals inaccurately targeted by the mitigation strategy for a risk threshold of 1/100,000 relative to the theoretical number of individuals at risk. (C) Number of individuals at risk according to the model used to generate the data. (D) Impact of administering treatment to individuals living or working in a ward exposed to a risk of at least 1/100,000 inhabitants: outbreak size when there is no treatment and when prophylactic treatment, with 100% compliance and efficacy prior to the onset of symptoms, is administered 4 days after the first 5, 10 or 15 cases occurred. Each box-plot represents the distribution (minimum, maximum, percentiles 2.5, 25, 50, 75, 97.5) of the total number of cases.
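The two targeting ratios shown in panels (A) and (B) can be illustrated with a simplified calculation. This sketch treats each ward as a single population block and ignores the distinction between people living and working in a ward, which is a simplification introduced here; the dictionaries and their values are hypothetical.

```python
def targeting_errors(true_risk, est_risk, population, threshold=1e-5):
    """Return (missed ratio, inaccurately targeted ratio), both relative to the
    population truly at risk.

    true_risk, est_risk : dict ward_id -> per-capita infection risk
    population          : dict ward_id -> number of individuals
    """
    true_set = {w for w, r in true_risk.items() if r >= threshold}
    est_set = {w for w, r in est_risk.items() if r >= threshold}
    at_risk = sum(population[w] for w in true_set)
    missed = sum(population[w] for w in true_set - est_set)    # at risk, not targeted
    wrongly = sum(population[w] for w in est_set - true_set)   # targeted, not at risk
    return missed / at_risk, wrongly / at_risk

# Hypothetical three-ward example.
true_risk = {"A": 1e-4, "B": 2e-5, "C": 1e-6}
est_risk = {"A": 1e-4, "B": 1e-6, "C": 5e-5}
population = {"A": 1000, "B": 500, "C": 2000}
missed_ratio, wrong_ratio = targeting_errors(true_risk, est_risk, population)
```

Because the denominator is the population truly at risk, the inaccurately targeted ratio can exceed 100% when a large ward is wrongly included.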
On average, the impact of the targeting strategy on outbreak size was greater when applied after the first 5 cases had occurred (see
As shown in
Here we have developed and tested a back-calculation model to characterize an airborne release of anthrax spores from data on the first observed cases, meteorological conditions, population density and movement data. Our simulation study shows that this method could provide accurate results even after only a few cases of a small outbreak have been observed.
Overall, in the event of an outdoor airborne release, the source location could accurately be identified although misspecifications of the spore dispersion model (scenario D) might slightly affect the quality of the estimates. Indeed, for a given dose, the HPAC model gave a larger geographical extent of the release than the puff model (see
In the spore dispersion model we used, we set the wind direction and speed at a fixed value both in the outbreak simulations and the back-calculation algorithms. However, our method could be refined to integrate more sophisticated datasets allowing the meteorological conditions to vary with time and to be imperfectly recorded.
The source location estimate would also probably be affected by misspecifications of population movements (scenario E). Indeed, if one or more observed cases had been exposed during occasional local or long-distance movements, then the quality of the estimates would be impaired. We therefore developed a modified model that allowed for exposure due to occasional movements. Including this model in the back-calculation algorithm improved the location estimates when occasional movements were included in the simulated data, although the computational time required for estimation increased markedly. Hence, the standard model could provide a first set of estimates, which could then be refined using the more elaborate model with occasional movements included.
The release date estimate might be biased if the within-host dynamics, and consequently the incubation period, were misspecified (scenarios B and C): different incubation period distributions could also be tested in further uncertainty analyses. Also, the within-host model used here could be extended to deal with continuous, rather than instantaneous releases, though this would require further development of the incubation period models which have been proposed for inhalation anthrax
Our analysis shows that characterizing an outbreak would help to predict its final size and to assist in targeting the exposed population requiring prophylactic treatment. Although the exposed population cannot be precisely estimated (both the number of missed individuals and inaccurately targeted individuals could be substantial), treating the population estimated to be at risk using our back-calculation method could substantially reduce the number of symptomatic cases, and therefore deaths. However, our estimates of the number of cases which might be prevented represent a best case: we assumed that both compliance with treatment and its efficacy were 100% prior to the onset of symptoms. Further analysis should be carried out to take into account the impact of sub-optimal compliance and lower treatment efficacy
A limitation of our method is the assumption of a common single source outbreak. If the outbreak was due to multiple releases, the spore dispersion component of our model could be modified to account for several sources. However, this would increase the number of parameters to estimate (four for each source) and could make the estimation based on a small number of observed cases less accurate or impossible. In addition, determining the number of sources could also be challenging. This problem might depend on the spatial separation of the sources; very widely spaced and more discrete “clusters” of cases might be quite obvious allowing their independent analysis. Some epidemiological oversight would obviously be key in such circumstances.
Our estimates could be used to parameterize models which have been developed to estimate the optimum duration of antibiotic treatments
Comparing our model with others in the literature, previous models provided estimates of the release date and the outbreak size but not the location
We have focused on evaluating our spatial back-calculation model for small outbreaks. In the case of a large outbreak, the rapid accumulation of cases and their locations would probably allow localization of the exposure event without the need for sophisticated methods. Nonetheless, the methods we developed here could be used for large outbreaks if statistical rigor was a key requirement for any analysis and to help with the early identification of the spatial extent of the release and the geographical targeting of antibiotic therapy. Application of this type of model to the airborne release of an agent capable of being transmitted from person-to-person (e.g. smallpox or pneumonic plague) would be feasible at the very beginning of an epidemic (before any transmission is likely to have occurred). But if secondary cases were suspected, our method would need further development to take into account the transmission process.
SUPPLEMENTARY MATERIAL
We thank Matt Keeling, Thomas House, Leon Danon, Philip Sansom and three anonymous reviewers for their useful comments and suggestions.