The Availability and Consistency of Dengue Surveillance Data Provided Online by the World Health Organization

Background
The use of high quality disease surveillance data has become increasingly important for public health action against new threats. In response, countries have developed a wide range of disease surveillance systems enabled by technological advancements. The heterogeneity and complexity of country data systems have caused a growing need for international organizations such as the World Health Organization (WHO) to coordinate the standardization, integration, and dissemination of country disease data at the global level for research and policy. The availability and consistency of currently available disease surveillance data at the global level are unclear. We investigated this for dengue surveillance data provided online by the WHO.

Methods and Findings
We extracted all dengue surveillance data provided online by WHO Headquarters and Regional Offices (RO’s). We assessed the availability and consistency of these data by comparing indicators within and between sources. We also assessed the consistency of dengue data provided online by two example countries (Brazil and Indonesia). Data were available from WHO for 100 countries since 1955, representing a total of 23 million dengue cases and 82 thousand deaths ever reported to WHO. The availability of data on DengueNet and some RO’s declined dramatically after 2005. Consistency was lacking between sources (84% agreement across all indicators, representing a discrepancy of almost half a million cases). Within sources, data at high spatial resolution were often incomplete.

Conclusions
The decline of publicly available, integrated dengue surveillance data at the global level will limit opportunities for research, policy, and advocacy. A new financial and operational framework will be necessary for innovation and for the continued availability of integrated country disease data at the global level.


Introduction
Threats to public health around the world have become increasingly complex and the importance of high quality disease surveillance for preparedness and disease control will continue to grow [1]. Scientific progress and global cooperation against emerging threats will depend on the availability and sharing of disease surveillance data between countries. Global health and funding agencies emphasized this in an appeal for greater availability and use of data for global health [2,3]. Formally, the 2005 International Health Regulations require the use and sharing of data in response to new threats [4,5]. The central role of the World Health Organization (WHO) in global disease surveillance and data dissemination has been stated in World Health Assembly resolutions for specific diseases [6].
The WHO has developed various data systems to integrate and disseminate country surveillance data such as the Global Health Observatory [7], the Global TB Database [8], DengueNet [9], RabNet [10] and FluNet [11]. In addition to these global databases, WHO Regional Offices (RO's) also provide disease surveillance data through their websites to inform member countries on disease patterns and trends in their region. Increasingly, country Ministries of Health post their own disease surveillance data online for their constituency, mostly in the form of epidemiological bulletins but sometimes using sophisticated online data repositories such as those developed by Brazil and Indonesia [12,13]. The public availability of disease surveillance data from various heterogeneous sources provides new opportunities for research, training, and policy making but can also lead to confusion about data discrepancies between sources. Limited information on the methodology used at various steps along the data trail from within countries to the global level has further complicated this data landscape. Although it is generally known that surveillance data reported by different agencies may not be identical due to differences in reporting methods and definitions, few studies have quantified the availability and consistency of publicly available disease surveillance data across sources. This information can guide policy makers, scientists, students, and others to use available data more effectively. We used the example of dengue to assess the availability and consistency of surveillance data provided online by WHO. We also provided examples of online data provided by the Ministries of Health of Brazil and Indonesia.

Methods
We extracted all online dengue surveillance data provided by WHO (DengueNet [9] and the websites of the Pan American Health Organization (PAHO) [14], the WHO Southeast Asia Regional Office (SEARO) [15] and the WHO Western Pacific Regional Office (WPRO) [16,17]) and by the Ministries of Health (MOH) of Brazil and Indonesia [12,13]. Brazil and Indonesia were selected as examples because they provided open access to detailed dengue surveillance data online in computer readable format. All available data up to April 12th, 2013 were extracted at the highest possible spatiotemporal resolution. To obtain standardized data across these sources, we extracted data for all ages and for both genders combined. We did not extract serotype specific data because these were minimally available. We standardized indicators reported by different sources across spatial and temporal scales and also harmonized country names using the United Nations ISO country name standard (ISO 3166) [18]. We assessed the availability of dengue data from each source and also measured data consistency between and within sources. We defined consistency between sources as the percent agreement of data reported for overlapping countries and time periods. We defined consistency within a source as the percent agreement between indicators that we recomputed from data within the source and the corresponding indicators provided by the same source.
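The between-source consistency measure defined above can be illustrated with a minimal Python sketch. It assumes that each source is held as a dictionary keyed by country code, year, and indicator; this key layout, the function name `compare_sources`, and all values are hypothetical and are not taken from the study data. Following the Table 2 footnotes, pairs with a missing value are excluded and differences are computed as the first source minus the second.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key layout: (ISO 3166 country code, year, indicator name).
Key = Tuple[str, int, str]
Counts = Dict[Key, Optional[int]]


def compare_sources(source_a: Counts, source_b: Counts):
    """Compare two sources on overlapping, non-missing keys.

    Returns (percent of matched pairs with identical values,
    number of matched pairs, sum of differences as source_a minus source_b).
    """
    matched = 0
    identical = 0
    diff_sum = 0
    for key, value_a in source_a.items():
        value_b = source_b.get(key)
        # Skip pairs where either source has no value (missing data).
        if value_a is None or value_b is None:
            continue
        matched += 1
        if value_a == value_b:
            identical += 1
        diff_sum += value_a - value_b
    percent = 100.0 * identical / matched if matched else None
    return percent, matched, diff_sum


# Made-up values for illustration only.
denguenet = {
    ("BRA", 2001, "all_cases"): 100,
    ("BRA", 2002, "all_cases"): 200,
    ("IDN", 2001, "all_cases"): 50,
    ("IDN", 2002, "all_cases"): None,  # missing in this source
}
regional_office = {
    ("BRA", 2001, "all_cases"): 100,
    ("BRA", 2002, "all_cases"): 250,
    ("IDN", 2001, "all_cases"): 50,
    ("IDN", 2002, "all_cases"): 70,
}

percent, matched, diff_sum = compare_sources(denguenet, regional_office)
print(f"{percent:.1f}% agreement over {matched} matched pairs; "
      f"sum of differences {diff_sum}")
```

In this toy example two of the three matched pairs are identical, and the negative sum of differences indicates that the first source reported fewer cases overall.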
All data used in this study are made publicly available through the University of Pittsburgh Project Tycho online data system (www.tycho.pitt.edu).
Each source provided counts for a range of different indicators (Fig 2). Data for "all" dengue cases (dengue fever and dengue hemorrhagic fever (DHF) combined) and "all" dengue deaths were available from DengueNet and all RO's. Data for DHF cases were predominantly from PAHO; few counts were from WPRO and none from SEARO. Across time, DengueNet provided counts for the longest time period (1955-2011), compared to 1985-2006 for SEARO.

Consistency across data sources
We compared data from DengueNet and RO's to assess consistency across sources (Table 2 and S2 Fig). The overall percent agreement was 83.8% across all indicators. Data from SEARO were the most consistent with a percent agreement of 92.2% and data from WPRO were the least consistent at 72.3%. Data for DHF cases were more consistent than the other indicators, at 92.4% compared to 76.1% for "all" cases and 89.4% for "all" deaths. DengueNet values for all indicators were generally lower than values from RO's. In total, DengueNet reported 426,808 fewer "all" cases, 17,854 fewer DHF cases, and 245 fewer deaths compared to RO's (Table 2).

Internal consistency of DengueNet data
We recomputed the number of "all" cases for DengueNet from separately reported dengue fever (DF) and DHF cases. We also recomputed the case fatality rate (CFR) for DengueNet from reported cases and deaths. Our recomputed data for "all" cases corresponded with 98.9% of original values and for CFR with 99.5% (Table 3). We also recomputed the annual number of dengue cases at country level from monthly cases at the provincial level (in DengueNet, data were either reported at the country level by year or at the provincial level by month). Our recomputed annual country level data for "all" cases were more than 3 million cases lower than reported data at that level. The recomputed values for "all" deaths were about 2000 deaths lower than reported mortality at country level by year. This discrepancy was likely due to missing data at the lower administrative levels. We found that provincial level data were not available for all calendar months in years before 1997 and after 2004 (S3 Fig). In addition, provincial level data for countries were only available for a median of 3.5% of provinces before 1996 and for 85.7% of provinces after 1996 (using the Second Administrative Level Boundaries data set project (SALB) [19] for the expected number of provinces per country).
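The recomputations described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study: the CFR is assumed to be expressed as deaths per 100 cases, the record layout (country, province, year, month) is hypothetical, and all numbers are made up. The difference is taken as recomputed minus reported, as in the Table 3 footnote.

```python
from collections import defaultdict


def recompute_all_cases(df_cases, dhf_cases):
    """'All' cases recomputed as dengue fever plus DHF cases."""
    return df_cases + dhf_cases


def case_fatality_rate(deaths, cases):
    """CFR in percent (assumed definition); None when no cases are reported."""
    return 100.0 * deaths / cases if cases else None


def annual_country_totals(provincial_monthly):
    """Sum monthly provincial counts to annual country totals.

    Keys are (country, province, year, month); missing values (None)
    are skipped, which is how incomplete subnational data lead to
    recomputed totals below the reported country-level values.
    """
    totals = defaultdict(int)
    for (country, _province, year, _month), cases in provincial_monthly.items():
        if cases is not None:
            totals[(country, year)] += cases
    return dict(totals)


# Made-up example: country "XX" with two provinces and incomplete months.
provincial_monthly = {
    ("XX", "P1", 2001, 1): 10,
    ("XX", "P1", 2001, 2): 15,
    ("XX", "P2", 2001, 1): 5,
    ("XX", "P2", 2001, 2): None,  # month not reported for this province
}
reported_annual = {("XX", 2001): 40}

recomputed_annual = annual_country_totals(provincial_monthly)
difference = recomputed_annual[("XX", 2001)] - reported_annual[("XX", 2001)]
print(recomputed_annual, difference)
```

A negative difference, as in this toy example, is the pattern reported above for DengueNet: values aggregated from subnational data fall below the values reported at country level.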

Data provided online by countries
We also assessed dengue surveillance data provided online by the Ministries of Health of Brazil and Indonesia (Fig 4). Both these countries are dengue endemic and have developed online databases that provide publicly available dengue surveillance data. The annual number of "all" cases reported by the Brazil and Indonesia MOH corresponded to WHO data for most years except 2008 (Brazil) and 2000/2004 (Indonesia). No WHO data were available for Indonesia after 2005. We found discrepancies within the data provided by the MOH of Indonesia for years after 2007. Our recomputed number of cases per year at the country level from reported provincial data (1st administrative level) was higher than country level values recomputed from district data (2nd administrative level). This suggested that data from lower administrative levels were incomplete.

Discussion
We integrated publicly available online dengue surveillance data from various WHO and country sources to describe the availability and consistency of globally available dengue surveillance data. We found that consistency of overlapping data between DengueNet and WHO Regional Offices was lacking and that data at subnational levels were often incomplete. This incompleteness was difficult to recognize since the absence of data for provinces or districts was not indicated explicitly. DengueNet systematically reported lower values compared to the RO's. This may be due to a difference in timing of data reports made by countries and a lack of updating DengueNet as countries updated their figures.
DengueNet was created by the WHO Headquarters in 2002 as part of the Global Health Atlas [9,20]. Focal points were appointed and trained in every country to upload standardized reports into the DengueNet repository [21]. This has successfully led to public sharing of dengue data across countries through a central global repository. In addition to DengueNet, RO's also routinely release dengue surveillance data from member countries through their websites.
PAHO and SEARO provide links to surveillance data sheets in PDF format and WPRO has developed an online Health Information and Intelligence Platform (HIIP). The WHO is the only source of integrated disease surveillance data across countries. Numerous studies have used WHO dengue surveillance data to describe trends and patterns of this disease at the global [22,23,24] and regional level [25,26,27]. Despite their role as a core resource for international dengue surveillance data, DengueNet and some RO data have not been regularly updated over the past decade, most likely due to capacity and funding constraints. With the decline of WHO as a central global resource for dengue surveillance data, the data landscape will become increasingly scattered and difficult to navigate. Other agencies or institutes can contribute additional capacity, or alternative frameworks for global disease surveillance data may be needed, such as a distributed network instead of centralized databases.
Increasingly, individual countries disseminate their own disease surveillance data online in various formats ranging from epidemiological bulletins to sophisticated databases. This has greatly advanced the availability of disease data at the global level. In 2010 the 63rd World Health Assembly stated that "the WHO urges member states to improve the collection of reliable health information and data and to maximize, where appropriate, their free and unrestricted availability in the public domain" [28]. Country data systems, however, use a large diversity of surveillance methodology and definitions that often lack detailed documentation. The potential biases and lack of comparability of data across countries are limiting the efficient use of these data. The reporting process of dengue surveillance data from countries to WHO also lacks detailed documentation and may vary across countries. Future research should formally compare country data systems and country vs. WHO data to gain more insight into potential biases of the various sources.
A standardized and curated global data system can maximize opportunities for the efficient use of country disease data for science and policy. Data standardization and curation are essential for a global data system. For example, we found that ~16% of country names in DengueNet were different from country names used by the RO's (S1 Table). Across all WHO sources, ~19% of country names were different from the UN ISO standard for country names [19]. In the absence of up-to-date global platforms for disease surveillance data, alternative data systems have emerged, such as Google Dengue and Flu Trends and the HealthMap project, which automatically integrate data from search queries or online news items, respectively [29,30,31]. Innovative technological solutions and capacity used by these projects should also be applied to integrate country disease surveillance data and establish a state-of-the-art 21st century global data system. This system can be coordinated by WHO but can be implemented by external institutes that have already created large scale public health data systems, such as the Institute for Health Metrics and Evaluation, the Malaria Atlas Project, or Project Tycho.
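As an illustration of the country-name standardization discussed above, the following Python sketch maps name variants to ISO 3166-1 alpha-3 codes and computes the share of names that differ from the ISO short name. The lookup tables are small, hand-written, hypothetical examples; they are not the mapping used in the study, and a real harmonization would need a complete and curated table.

```python
# ISO 3166-1 alpha-3 codes and English short names for a few countries.
ISO_SHORT_NAMES = {
    "BRA": "Brazil",
    "IDN": "Indonesia",
    "VNM": "Viet Nam",
    "LAO": "Lao People's Democratic Republic",
}

# Hypothetical name variants as they might appear in different sources.
VARIANTS = {
    "vietnam": "VNM",
    "laos": "LAO",
    "lao pdr": "LAO",
}


def normalize(name):
    """Collapse whitespace and ignore case before comparing names."""
    return " ".join(name.split()).casefold()


def to_iso3(name):
    """Return the alpha-3 code for a country name, or None if unknown."""
    key = normalize(name)
    for code, short_name in ISO_SHORT_NAMES.items():
        if normalize(short_name) == key:
            return code
    return VARIANTS.get(key)


def percent_nonstandard(names):
    """Percent of names that do not match an ISO short name."""
    standard = {normalize(s) for s in ISO_SHORT_NAMES.values()}
    if not names:
        return None
    differing = sum(1 for n in names if normalize(n) not in standard)
    return 100.0 * differing / len(names)


source_names = ["Brazil", "Vietnam", "Laos", "Indonesia"]
codes = [to_iso3(n) for n in source_names]
print(codes, percent_nonstandard(source_names))
```

Names that resolve to `None` would need manual review rather than automatic matching, so that an unrecognized spelling is not silently assigned to the wrong country.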
A new and sustainable framework will be required to ensure that integrated and curated disease surveillance data from countries around the world will continue to be available to stakeholders at all levels. Innovative technology should be used for data integration that minimizes the burden on countries but maximizes data availability and use. Academic and private sector partners should step up to support international agencies with this increasingly complex mandate.
Counts from WPRO covered 2000-2011 and counts from PAHO covered 1995-2012 (Fig 3 and S1 Fig). Across sources, data for "all" cases were provided for the longest time periods, followed by mortality data. Data for DHF counts were available for the shortest time periods (Fig 3 and S1B Fig). In general, many counts were missing across years and countries.

Fig 1. Number of counts per country available from online WHO sources: 1955-2012. A count was defined as a reported value for an indicator, e.g. the reported number of cases for one month and location would be one count. The number of counts available per country (indicated in different colors) was determined by the spatiotemporal resolution of data, the number of indicators reported, and the length of the time period reported. doi:10.1371/journal.pntd.0003511.g001

Table 1. The cumulative number (in thousands) of "all" dengue cases, DHF cases and "all" deaths per WHO Region.

Table 2. Consistency between DengueNet and WHO Regional Office data.
* Percent of pairs with identical values between DengueNet and Regional Office. † Number of matched pairs excluding missing values. ‡ Sum of differences between pairs (DengueNet minus RO). doi:10.1371/journal.pntd.0003511.t002

Table 3. Consistency of data within DengueNet, measured by comparing recomputed with reported indicators.
‡ Recomputed data minus reported data. doi:10.1371/journal.pntd.0003511.t003

Fig 4. The total number of dengue cases reported for Brazil (A) and Indonesia (B) by different sources. The annual number of "all" cases reported for the entire country from DengueNet, the RO, and the Ministry of Health (MOH). Country level data reported by the MOH were derived from provincial (admin1) and district (admin2) level data provided online. doi:10.1371/journal.pntd.0003511.g004