Electronic event-based biosurveillance systems (EEBS’s) that use near real-time information from the internet are an increasingly important source of epidemiologic intelligence. However, there has not been a systematic assessment of EEBS evaluations, which could identify key uncertainties about current systems and guide EEBS development to most effectively exploit web-based information for biosurveillance. To conduct this assessment, we searched PubMed and Google Scholar to identify peer-reviewed evaluations of EEBS’s. We included EEBS’s that use publicly available internet information sources, cover events that are relevant to human health, and have global scope. To assess the publications using a common framework, we constructed a list of 17 EEBS attributes from published guidelines for evaluating health surveillance systems. We identified 11 EEBS’s and 20 evaluations of these EEBS’s. The number of published evaluations per EEBS ranged from 1 (Geni-Db, GODsN, MiTAP) to 8 (GPHIN, HealthMap). The median number of evaluation variables assessed per EEBS was 8 (range, 3–15). Ten published evaluations contained quantitative assessments of at least one key variable. No evaluations examined usefulness by identifying specific public health decisions, actions, or outcomes resulting from EEBS outputs. Future EEBS assessments should identify and discuss critical indicators of public health utility, especially the impact of EEBS’s on public health response.
Citation: Gajewski KN, Peterson AE, Chitale RA, Pavlin JA, Russell KL, Chretien J-P (2014) A Review of Evaluations of Electronic Event-Based Biosurveillance Systems. PLoS ONE 9(10): e111222. https://doi.org/10.1371/journal.pone.0111222
Editor: Joel M. Montgomery, Global Disease Detection-Kenya, United States of America
Received: February 20, 2014; Accepted: September 12, 2014; Published: October 20, 2014
This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Funding: This project was funded using internal operational funds of the Armed Forces Health Surveillance Center. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Approximately 65% of the world’s first news about infectious disease events comes from informal sources, such as the internet, and almost all major outbreaks investigated by the World Health Organization (WHO) are first identified through these informal sources. Electronic event-based biosurveillance uses information on events impacting human health or the economy from internet sources, simultaneously incorporating diverse streams of data. Electronic event-based biosurveillance systems (EEBS’s) are an increasingly important source of epidemiologic intelligence.
Rapidly expanding worldwide access to the internet has fueled an increase in the number, and popularity, of EEBS’s. There are several benefits to these new forms of surveillance. Many EEBS’s allow citizens to report public health events via social media platforms or electronic communication channels independently of governments. Governments are no longer in sole control of their public health information, making it substantially harder to hide or delay outbreak or event reports. Additionally, since protocols and confirmatory testing requirements do not delay the reports, they are considerably timelier than traditional surveillance sources. Another beneficial aspect of EEBS’s is that many are publicly accessible. All subscribers have equally timely access to breaking reports regardless of their public health affiliations.
However, the same aspects of EEBS’s that make them important new surveillance tools may also make them less reliable. Because many sources of data are not verified by public health professionals, these systems are prone to noise and false alarms. Several researchers have commented that EEBS’s especially tend to lack specificity in their alerts and reports. Many EEBS’s also face challenges in interoperability, scalability, population coverage, and interface customizability.
There have been evaluations of individual EEBS’s, but there have not been structured evaluations of multiple EEBS’s, or a comprehensive assessment of all EEBS evaluations. Our objective was to assess evaluations of EEBS’s, and to recommend criteria for future evaluations. Our findings may help guide future EEBS development to most effectively exploit web-based information for biosurveillance.
We consulted an EEBS inventory and biosurveillance experts (via informal queries to staff within our organization) to identify EEBS’s that use publicly available internet information sources, include events that impact human health, and have global scope. We excluded systems that did not include infectious disease events.
To construct an evaluation framework for EEBS’s, we reviewed the Centers for Disease Control and Prevention (CDC) surveillance system evaluation guidelines and CDC evaluation guidelines for outbreak detection systems. From those guidelines we selected evaluation variables highlighted as of primary importance in one of the guidelines, or mentioned as of secondary importance in both guidelines. We combined variables that were highly similar, narrowing the list to 17 variables: acceptability, accessibility, cost, data quality, flexibility, population coverage, predictive value positive, purpose, portability, representativeness, resources needed, sensitivity, simplicity, stability, timeliness, usefulness, and validity.
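The selection rule described above (primary importance in one guideline, or secondary importance in both) can be expressed as simple set logic. The following is only an illustrative sketch: the function name `select_variables` and the example inputs are hypothetical placeholders and do not reproduce the actual contents of either CDC guideline or the merging of similar variables.

```python
def select_variables(primary_a, secondary_a, primary_b, secondary_b):
    """Keep variables that are primary in either guideline, or secondary in both."""
    primary_in_one = primary_a | primary_b          # primary importance in at least one guideline
    secondary_in_both = secondary_a & secondary_b   # secondary importance in both guidelines
    return sorted(primary_in_one | secondary_in_both)


# Hypothetical placeholder inputs, NOT the real guideline contents.
guideline_a_primary = {"timeliness", "sensitivity"}
guideline_a_secondary = {"cost", "portability"}
guideline_b_primary = {"usefulness"}
guideline_b_secondary = {"cost", "stability"}

selected = select_variables(
    guideline_a_primary,
    guideline_a_secondary,
    guideline_b_primary,
    guideline_b_secondary,
)
print(selected)
```

In this toy input, "portability" and "stability" are dropped because each is of secondary importance in only one of the two placeholder guidelines.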
We searched PubMed and Google Scholar for publications with the name of one or more of the included EEBS’s in any search field. We included structured evaluations of the systems as well as system descriptions if they discussed the system’s performance with respect to one of the evaluation variables, even if that discussion was not a structured evaluation. For each evaluation, we recorded the evaluation variables discussed for each EEBS, and determined whether the evaluation assessed the variable quantitatively or qualitatively.
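As a rough illustration of this recording step, each (evaluation, system, variable, assessment type) observation can be stored as one row and then summarized per system. This is a minimal sketch under assumptions: the record layout, the function names, and the example rows are hypothetical and are not the review's data.

```python
from collections import defaultdict
from statistics import median

# Hypothetical example rows: (evaluation id, system, variable, assessment type).
records = [
    ("eval-1", "System A", "timeliness", "quantitative"),
    ("eval-1", "System A", "sensitivity", "quantitative"),
    ("eval-1", "System B", "timeliness", "qualitative"),
    ("eval-2", "System A", "timeliness", "qualitative"),
    ("eval-2", "System C", "purpose", "qualitative"),
    ("eval-2", "System C", "cost", "qualitative"),
    ("eval-2", "System C", "usefulness", "qualitative"),
]


def variables_per_system(rows):
    """Count the distinct variables assessed for each system across all evaluations."""
    seen = defaultdict(set)
    for _eval_id, system, variable, _mode in rows:
        seen[system].add(variable)
    return {system: len(variables) for system, variables in seen.items()}


def quantitative_evaluations(rows):
    """Return the evaluations that assessed at least one variable quantitatively."""
    return {eval_id for eval_id, _system, _variable, mode in rows if mode == "quantitative"}


counts = variables_per_system(records)
print(counts)
print(median(counts.values()))
print(quantitative_evaluations(records))
```

Storing one row per observation keeps the per-system and per-evaluation summaries independent, so the same table can yield both the number of variables assessed per system and the number of evaluations with any quantitative assessment.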
We identified 11 EEBS’s meeting the inclusion criteria (Table 1). The oldest system was ProMED, founded in 1994, and the newest system was Geni-Db, founded in 2012. The systems used automation to varying degrees in extracting information from the internet, processing it, and producing reports or alerts. For example, some EEBS’s relied heavily on subject matter experts to assess reports from various sources (e.g., ProMED) or on manual translation by linguists with regional expertise (e.g., Argus); others used automated procedures for posting and mapping (e.g., HealthMap) or translation (e.g., GPHIN).
Older systems had more evaluations than the newer systems, with the exception of HealthMap, which ranked second for the most evaluations despite being founded in 2006. The median number of key variables assessed per EEBS was 8 (range, 3–15), with 6 evaluations assessing 7 or more key variables. Older systems were more likely to be reviewed in parallel with each other. There were two or fewer published evaluations on the GODsN, EpiSpider, MiTAP and Geni-Db systems.
Ten of 20 published evaluations contained quantitative assessments of at least one key variable, while the others mentioned evaluation variables but did not provide results reflective of a systematic assessment of those variables. Timeliness and purpose were assessed for 10 of 11 EEBS’s, while data quality and validity were assessed for 4 of 11 EEBS’s (Figure 1).
Nine evaluations assessed usefulness for 7 EEBS’s by citing instances where the EEBS detected an outbreak earlier than other surveillance systems, or by eliciting user feedback, but none identified specific public health decisions, actions, or outcomes resulting from EEBS outputs. No evaluations examined system stability, and only two systems were evaluated on cost.
Because of the lack of detail provided for evaluations on key variables, it was not possible to determine which EEBS’s have been most useful or which EEBS approaches are most promising.
We found a paucity of evaluation results for EEBS’s on key evaluation variables, with only half of published evaluations reporting quantitative assessments of at least one key variable. Many evaluations mentioned key variables only in passing, and did not present results suggesting that a systematic quantitative or qualitative assessment was performed.
Timeliness, a possible advantage of EEBS’s compared to traditional surveillance systems, was assessed for 10 of 11 EEBS’s, but data quality and validity, for which EEBS’s may face more challenges, were infrequently assessed (4 out of the 11 EEBS’s). Perhaps most importantly, no evaluations cited specific examples of public health decisions, actions, or outcomes resulting from EEBS alerts. To our knowledge, this is the first comprehensive assessment of the evaluation literature for EEBS’s, and provides a snapshot of the current knowledge of EEBS performance characteristics and overall usefulness.
We note two important limitations of this study. First, we focused on global-scale EEBS’s. While these may be of broad interest to the public health community, we cannot comment on the extent of local or regional-scale EEBS evaluation. Evaluations of these systems may provide useful lessons for global-scale EEBS’s. Second, we limited the assessment to peer-reviewed, published evaluations. This approach likely does not capture all EEBS evaluations, though some excluded evaluations may be difficult to access or of less-certain quality.
Future EEBS evaluations should identify and discuss critical indicators of public health utility, using quantitative or qualitative approaches to assess the usefulness of EEBS’s in guiding public health action. They should also assess the novel aspects of EEBS’s compared to traditional surveillance approaches, and include variables of special interest to potential EEBS users such as policy readiness, number and geographic profiles of users, number of sources, system redundancy, and input/output geography; explore benefits of participatory biosurveillance and analytical tools, which some systems offer; and consider ways of integrating outputs of various EEBS’s to combine their respective strengths. Initial investigations into the effects of combining systems by Barboza et al. have concluded that significant value can be added and synergistic effects can be observed. Further investigation into the value added by combining systems is needed, particularly into any improvements in sensitivity, predictive value positive, and usefulness.
We urge developers and users to conduct and publish evaluations of EEBS’s. While they clearly offer powerful biosurveillance capabilities complementing traditional surveillance approaches, further indications of how and under what circumstances they are most useful, based on real-world experience, could advance EEBS development and effective integration into public health programs.
Disclaimer: The views are those of the authors, and do not necessarily reflect those of the Department of Defense or the Federal Emergency Management Agency.
Conceived and designed the experiments: KNG AEP JPC. Performed the experiments: KNG JPC. Analyzed the data: KNG JAP JPC. Contributed reagents/materials/analysis tools: RAC JAP KLR. Wrote the paper: KNG AEP RAC JAP KLR JPC.
- 1. Heymann DL, Rodier GR (2001) Hot spots in a wired world: WHO surveillance of emerging and re-emerging infectious diseases. Lancet Infect Dis 1: 345.
- 2. Keller M, Blench M, Tolentino H, Freifeld CC, Mandl KD, et al. (2009) Use of unstructured event-based reports for global infectious disease surveillance. Emerg Infect Dis 15: 689.
- 3. Tsai FJ, Tseng E, Chan CC, Tamashiro H, Motamed S, et al. (2013) Is the reporting timeliness gap for avian flu and H1N1 outbreaks in global health surveillance systems associated with country transparency? Global Health 9: 14.
- 4. Hartley D, Nelson N, Walters R, Arthur R, Yangarber R, et al. (2010) Landscape of international event-based biosurveillance. Emerg Health Threats J 3: 19.
- 5. Chan EH, Keller M, Sonricker AL, Freifeld CC, Brownstein JS (2011) Large-Scale Evaluation of Informal Online Reporting for Outbreak Detection. Children’s Hospital Informatics Program, Children’s Hospital Boston, Boston, United States.
- 6. Woodall JP (2001) Global surveillance of emerging diseases: the ProMED-mail perspective. Cad Saude Publica 17: 1037.
- 7. Deshpande A, Brown MG, Castro LA, Daniel WB, Generous EN, et al. (2013) A systematic evaluation of traditional and non-traditional data systems for integrated global biosurveillance – final report. Los Alamos National Laboratory.
- 8. German RR, Lee LM, Horan JM, Milstein RL, Pertowski CA, et al. (2001) Updated guidelines for evaluating public health surveillance systems: Recommendations from the Guidelines Working Group. MMWR Recomm Rep 50 (RR-13): 1.
- 9. Buehler JW, Hopkins RS, Overhage JM, Sosin DM, Tong V (2004) Framework for evaluating public health surveillance systems for early detection of outbreaks: Recommendations from the CDC working group. MMWR Recomm Rep 53 (RR-5): 1.
- 10. Morse SS (2012) Public health surveillance and infectious disease detection. Biosecur Bioterror 10: 6.
- 11. Khan SA, Patel CO, Kukafka R (2006) GODSN: Global News Driven Disease Outbreak and Surveillance. AMIA Annu Symp Proc 2006: 983.
- 12. Torii M, Yin L, Nguyen T, Mazumdar CT, Liu H, et al. (2011) An exploratory study of a text classification framework for Internet-based surveillance of emerging epidemics. Int J Med Inform 80: 56.
- 13. Collier N (2012) Uncovering text mining: a survey of current work on web-based epidemic intelligence. Glob Public Health 7: 731.
- 14. Collier N, Doan S, Kawazoe A, Goodwin RM, Conway M, et al. (2008) BioCaster: detecting public health rumors with a Web-based text mining system. Bioinformatics 24: 2940.
- 15. Lyon A, Nunn M, Grossel G, Burgman M (2012) Comparison of web-based biosecurity intelligence systems: BioCaster, EpiSpider and HealthMap. Transbound Emerg Dis 59: 223.
- 16. Collier N, Doan S (2012) GENI-DB: a database of global events for epidemic intelligence. Bioinformatics 28: 1186.
- 17. Mykhalovskiy E, Weir L (2006) The Global Public Health Intelligence Network and early warning outbreak detection: a Canadian contribution to global public health. Can J Public Health 97: 42.
- 18. Wilson K, Brownstein JS (2009) Early detection of disease outbreaks using the Internet. CMAJ 180: 829.
- 19. Brownstein JS, Freifeld CC, Reis BY, Mandl KD (2008) Surveillance Sans Frontieres: Internet-based emerging infectious disease intelligence and the HealthMap project. PLoS Med 5: e151.
- 20. Freifeld CC, Mandl KD, Reis BY, Brownstein JS (2008) HealthMap: global infectious disease monitoring through automated classification and visualization of Internet media reports. J Am Med Inform Assoc 15: 150.
- 21. Yangarber R, Steinberger R (2009) Automatic epidemiological surveillance from on-line news in MedISys and PULS. Proceedings of IMED-2009: International Meeting on Emerging Diseases and Surveillance.
- 22. Damianos L, Ponte J, Wohlever S, Reeder F, Day D, et al. (2002) MiTAP for biosecurity: a case study. AI Magazine 23: 13.
- 23. Corley CD, Lancaster MJ, Brigantic RT, Chung JS, Walters RA, et al. (2012) Assessing the continuum of event-based biosurveillance through an operational lens. Biosecur Bioterror 10: 131.
- 24. Hartley DM, Nelson NP, Arthur RR, Barboza P, Collier N, et al. (2013) An overview of internet biosurveillance. Clin Microbiol Infect 19: 1006.
- 25. Barboza P, Vaillant L, Mawudeku A, Nelson NP, Hartley DM, et al. (2013) Evaluation of epidemic intelligence systems integrated in the early alerting and reporting project for the detection of A/H5N1 influenza events. PLoS ONE 8: e57252.