Guidelines for Accurate and Transparent Health Estimates Reporting: the GATHER statement

Gretchen Stevens and colleagues present the GATHER statement, which seeks to promote good practice in the reporting of global health estimates.


Introduction
Global, regional, national, and subnational data for population health indicators are needed to monitor health and to guide resource allocation. However, health data are rarely available for every population and year, and in some cases there are discrepancies in available measurements. Additionally, differences in measurement methods mean that data might not be comparable over time or across populations.
Because of these data gaps and measurement challenges, incomplete data together with statistical or mathematical models are often used to calculate estimates of health indicators. These estimates are used by government officials, non-governmental organizations, and funding agencies to make comparisons among populations, to track changes over time (e.g., to monitor progress toward targets such as the Sustainable Development Goals), and to obtain a comprehensive picture of causes of death, burden of disease, or risks to health [1,2]. The available data and analysis methods used to produce estimates often have features or assumptions that affect their interpretation. In recent years, diverse data sources and statistical models of increasing flexibility and sophistication have been used to calculate health estimates. Some have raised questions about whether the data search, access, and inclusion process is sufficiently rigorous [3], and whether users understand the complex methods often used to derive estimates [4]. Others have argued that discrepancies in estimates can lead to confusion (e.g., about whether changes were a result of true epidemiological change or of a new method of analysis) and might lead to rejection of estimates [1].
Accurate interpretation and responsible use of health estimates require an understanding of the input data on which the estimates were based, including their quality, and of the methods used to derive the estimates from the input data [4][5][6][7]. The need for guidelines for reporting of health estimates was a key conclusion of World Health Organization (WHO) expert meetings in February and December, 2013, which were the impetus for the present set of guidelines [8].

Development of the Guidelines for Accurate and Transparent Health Estimates Reporting
To meet this challenge, the GATHER working group was convened by WHO in 2014, with the aim to define and promote good practice in reporting of global health estimates. The working group's approach was based on published guidance for developing reporting guidelines [9]. All members of WHO's Reference Group on Global Health Statistics were invited to join the working group; other experts and journal editors with complementary expertise were sought and invited to join. The working group consists of practitioners, including statisticians, from academia and WHO, journal editors, representatives of the EQUATOR network [10], and members of existing guideline steering groups. The working group reviewed existing reporting guidelines for relevance to global health estimates and sought guidance from experts who had previously developed reporting guidelines. The group determined that existing reporting guidelines would not ensure adequate reporting of global health estimates.
On the basis of the review of existing guidance and reporting guidelines [11][12][13][14][15][16] and of input from working group members, we generated a comprehensive list of potential reporting items. We subsequently sought feedback from a broader community of researchers and users of estimates through an online survey between January and February, 2015. Working group members distributed the survey to their respective networks. We received 118 responses (further details are available on the GATHER website: gather-statement.org). The responses were compiled, summarized, and presented at a 2-day consensus meeting held in London, UK, in February, 2015.
The primary objective of the working group consensus meeting was to agree on the list of items that should be reported whenever health estimates are published. During the meeting, reporting items were evaluated in light of the responses to the online survey, and working group members agreed to retain, omit, or combine items to generate the checklist in Table 1.
The GATHER working group and the responses to the online survey, both drawn from our networks of collaborators, were dominated by residents of high-income countries. We therefore sought additional feedback from a geographically diverse group of stakeholders, including 130 country focal points for WHO mortality estimates, by sharing an earlier version of this statement before publication. We revised this statement based on the feedback received.

Aim of GATHER
GATHER aims to define and promote good practice in reporting of global health estimates. Reporting of estimates should serve the needs of their two primary audiences: decision makers and researchers. Decision makers include planners, policy makers, and monitoring staff in governments, as well as global, regional, and national public health experts, funding agencies, and civil society organizations. These users need information about data sources and analysis methods, including key assumptions and limitations, in a way that is accessible without advanced training in statistics. They also need an explanation of how new estimates compare to previously published estimates, including why they differ. Researchers require a higher degree of detail about methods, so that they can fully understand and potentially reproduce studies and advance methods. The GATHER checklist includes only the minimum essential reporting items to serve these audiences; other good practices in reporting are recommended in the accompanying explanation and elaboration document available on the GATHER website.
Compliance with GATHER is not an indicator of a study's quality [12,17,18]. Rather, it ensures that key information is available so that an informed researcher can judge the study's quality and increases the chance that the study results will be used appropriately by decision makers. Improvements in reporting may incidentally improve quality, because the reporting required for compliance with GATHER could assist analysts in identifying errors or improving methods.

Table 1. GATHER checklist of information that should be included in reports of global health estimates.

# Checklist item

Objectives and funding
1 Define the indicator(s), populations (including age, sex, and geographic entities), and time period(s) for which estimates were made.
2 List the funding sources for the work.

Data inputs
For all data inputs from multiple sources that are synthesized as part of the study:
3 Describe how the data were identified and how the data were accessed.
4 Specify the inclusion and exclusion criteria. Identify all ad-hoc exclusions.
5 Provide information about all included data sources and their main characteristics. For each data source used, report reference information or contact name/institution, population represented, data collection method, year(s) of data collection, sex and age range, diagnostic criteria or measurement method, and sample size, as relevant.
6 Identify and describe any categories of input data that have potentially important biases (e.g., based on characteristics listed in item 5).
For data inputs that contribute to the analysis but were not synthesized as part of the study:
7 Describe and give sources for any other data inputs.
For all data inputs:
8 Provide all data inputs in a file format from which data can be efficiently extracted (e.g., a spreadsheet rather than a PDF), including all relevant meta-data listed in item 5. For any data inputs that cannot be shared because of ethical or legal reasons, such as third-party ownership, provide a contact name or the name of the institution that retains the right to the data.

Data analysis
9 Provide a conceptual overview of the data analysis method. A diagram may be helpful.
10 Provide a detailed description of all steps of the analysis, including mathematical formulae. This description should cover, as relevant, data cleaning, data pre-processing, data adjustments and weighting of data sources, and mathematical or statistical model(s).
11 Describe how candidate models were evaluated and how the final model(s) were selected.
12 Provide the results of an evaluation of model performance, if done, as well as the results of any relevant sensitivity analysis.
13 Describe methods of calculating uncertainty of the estimates. State which sources of uncertainty were, and were not, accounted for in the uncertainty analysis.
14 State how analytic or statistical source code used to generate estimates can be accessed.

Results and discussion
15 Provide published estimates in a file format from which data can be efficiently extracted.
16 Report a quantitative measure of the uncertainty of the estimates (e.g., uncertainty intervals).
17 Interpret results in light of existing evidence. If updating a previous set of estimates, describe the reasons for changes in estimates.
18 Discuss limitations of the estimates. Include a discussion of any modelling assumptions or data limitations that affect interpretation of the estimates.

doi:10.1371/journal.pmed.1002056.t001
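Checklist items 8 and 15 ask for data inputs and published estimates in a file format from which data can be efficiently extracted. As a minimal sketch of what this could look like in practice, the Python snippet below writes a small table of estimates, with their uncertainty bounds, to CSV and reads it back. The column names, the indicator label, and all numbers are hypothetical placeholders chosen for illustration; GATHER does not prescribe a particular schema or software.

```python
import csv
import io

# Hypothetical column names; GATHER does not prescribe a schema.
FIELDS = ["indicator", "location", "year", "sex",
          "estimate", "lower_95", "upper_95"]

# Made-up illustrative values, not real estimates.
ROWS = [
    {"indicator": "under5_mortality_per_1000", "location": "Country A",
     "year": 2015, "sex": "both",
     "estimate": 42.0, "lower_95": 36.5, "upper_95": 48.1},
    {"indicator": "under5_mortality_per_1000", "location": "Country B",
     "year": 2015, "sex": "both",
     "estimate": 17.3, "lower_95": 15.0, "upper_95": 19.9},
]


def estimates_to_csv(rows, fields=FIELDS):
    """Serialize estimate rows as CSV text, one row per population and year."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_estimates(text):
    """Parse the CSV text back into a list of dictionaries (values as strings)."""
    return list(csv.DictReader(io.StringIO(text)))


if __name__ == "__main__":
    print(estimates_to_csv(ROWS))
```

A flat, one-row-per-estimate layout of this kind keeps the uncertainty bounds next to each point estimate and can be opened in a spreadsheet or read by any statistical package.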

Scope of GATHER
GATHER defines best practices for documenting studies that report global health estimates. Global health estimates include all quantitative population-level estimates (including global, regional, national, or subnational estimates) of health indicators, including indicators of health status such as estimates of total and cause-specific mortality, incidence and prevalence of diseases, injuries, and disability and functioning; and indicators of health determinants, including health behaviours and health exposures (Box 1).
GATHER aims to define best practices for reporting of studies that synthesize information from multiple sources to quantitatively describe past and current population health and its determinants. These studies include comparisons among multiple populations, over time or by place of residence. GATHER covers reporting of studies that disaggregate disease and injuries by underlying cause as defined by a classification system such as the International Classification of Diseases (ICD), as well as those that attribute disease and injury to their determinants, e.g., the number of deaths attributable to tobacco smoking. These reporting guidelines were not designed for reports of a health indicator from a single study or data source, such as a health survey or health service records for a single period.
Health determinants can range from proximal determinants of health, such as behaviours like tobacco smoking that have a direct effect on incidence of disease and mortality, to intermediate determinants of health, such as availability of essential medicines, to distal determinants of population health, such as wealth inequality. Of the universe of health determinants, these reporting guidelines were developed for estimates of health behaviours and health exposures [19]. They were not designed for service coverage indicators, nor were they designed for health systems indicators, such as those related to health financing or health workforce. The guidelines were also not designed for estimates of distal determinants of health, such as average educational attainment or wealth inequality. Nevertheless, researchers preparing health estimates that do not fall in the scope of GATHER might find GATHER useful when documenting their study. In particular, a commitment to documenting all data inputs and analysis methods should be a universal feature of published reports providing estimates designed for policy planning.

Overview of GATHER
GATHER comprises a checklist of 18 items that are essential for best reporting practice (Table 1). An electronic version of the checklist and a more detailed explanation and elaboration document, describing the interpretation and rationale of each reporting item along with examples of good reporting, are available on the GATHER website.
Global health estimates are regularly published in scientific journals and in reports of intergovernmental agencies and non-governmental organizations. The GATHER checklist is designed to be flexible enough to be used for both types of publication. The items in the checklist are organized into four sections: (1) objectives and funding, (2) data inputs, (3) data analysis, and (4) results and discussion. Data inputs are further disaggregated into two groups: data inputs that were synthesized as part of the study (usually the health indicator being estimated), and data inputs from another source or study that contributed to the analysis, but were used without modification (if any; common data inputs of this type are population data or covariates such as average educational attainment or per capita gross domestic product). Methods of data analysis range from a simple averaging of available data to computationally intensive multistep processes that cannot be run on a standard desktop computer. The reporting items described here are appropriate for all data analysis methods, regardless of their complexity. Importantly, any method of synthesizing available data to make estimates for a population relies on a model and should be reported accordingly.

Box 1: Definitions of technical terms.
Health indicator: A measurable quantity that can be used to describe a population's health or its determinants. Indicators can be categorized into four domains: health status (e.g., life expectancy, HIV prevalence), risk factors (e.g., childhood stunting, prevalence of smoking), service coverage (e.g., immunization coverage rate), or health systems (e.g., hospital bed density, death registration coverage) [19].
Health estimates: Quantitative population-level estimates (including global, regional, national, or subnational estimates) of health indicators, including indicators of health status such as estimates of total and cause-specific mortality, incidence and prevalence of diseases, injuries, and disability and functioning; and indicators of health determinants, including health behaviours and health exposures. Examples of health indicators that fall within the scope of GATHER include life expectancy, disability-adjusted life-years by cause, under-five mortality rate, maternal mortality ratio, mortality rate from road traffic injuries, HIV incidence, prevalence of stunting in children younger than 5 years, prevalence of current tobacco use, prevalence of obesity in adults, and condom use among sex workers.
Data inputs: All numerical inputs to mathematical or statistical models that are used to generate global health estimates. Model inputs may include raw health data, processed health data, covariates, and other parameters. Raw health data are measures derived from primary data collection with no adjustments or corrections. Processed health data are health statistics that have been calculated from raw health data, but which are not the result of synthesizing multiple data sources. Examples of processing raw health data include cleaning data by removing implausible values, calculating an indicator with an algorithm, or adjusting a statistic for bias.
Covariates: Data, including non-health data, which are used in a statistical model to improve the estimation of the health indicator of interest. These data are population-specific and are available for every population included in the analysis. A common covariate is gross domestic product per capita.
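To illustrate the point that even a simple averaging of available data relies on a model, the Python sketch below pools several survey estimates with an inverse-variance weighted mean. This is one common textbook approach, used here only as an example and not a method prescribed by GATHER. The function name and inputs are hypothetical, and the calculation assumes that every source measures the same underlying quantity with independent errors of known standard error, which is exactly the kind of assumption the checklist asks authors to state.

```python
import math


def pooled_estimate(values, standard_errors):
    """Inverse-variance weighted mean of several estimates and its standard error.

    Assumes (hypothetically) that all sources measure the same quantity
    and that their errors are independent with known standard errors.
    """
    if not values or len(values) != len(standard_errors):
        raise ValueError("values and standard_errors must be non-empty and equal in length")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")

    # More precise sources (smaller standard error) receive larger weights.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return mean, pooled_se


if __name__ == "__main__":
    # Made-up prevalence estimates from three hypothetical surveys.
    mean, se = pooled_estimate([0.21, 0.25, 0.19], [0.02, 0.04, 0.03])
    print(f"pooled estimate: {mean:.3f} (standard error {se:.3f})")
```

If the sources in fact differ systematically, for example because of different measurement methods, this simple model is misspecified and its standard error will be too small, which is why the weighting scheme and its assumptions belong in the reported methods.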
In most cases, full reporting of a new set of estimates will not be possible in the main text of an article or a report. Rather, authors will have to make use of online appendices to ensure complete reporting as prescribed by the GATHER Statement. Whether the required materials appear in the main text or in an appendix will depend on the purpose and audience of the report, and we therefore leave this decision to the authors' and editors' discretion.

Implications and limitations
We propose the GATHER checklist as a tool to be used by authors, reviewers, and journal editors, in order to promote best practices in reporting global health estimates. In this statement, we have presented the development, aim, and an overview of the guidelines. Users of the guidelines should refer to the GATHER website for further explanation and examples of good reporting for specific items.
GATHER considers open access to data inputs and access to analytical or statistical source code to be best practice in reporting. Recent reports on waste in research have highlighted that full documentation of research, including protocols for sharing data and code, increases the value of research that is undertaken [20,21]. Funding agencies [22,23] and journals [24,25] are increasingly requiring that researchers make input data and, in some cases, source code available. In line with these requirements, GATHER considers that data underlying health estimates should be accessible online, except in situations, such as third-party ownership, when this is not possible. We nonetheless acknowledge that requiring open access to data inputs might require additional resources for documentation and archiving of data resources.
Sharing source code also involves an investment of resources, especially if the code is fully documented and available online for off-the-shelf use. Sharing code often leads to requests for technical assistance from users, which are time-consuming and typically unfunded. Despite these challenges, in view of the use of global health estimates for policy prioritization and funding allocation, we consider availability of code to be essential. Given that researchers are not necessarily resourced for sharing code, we consider that a minimum would be for researchers to share key segments of code and that they should not be held responsible for providing user support. Moving forwards, we hope that funding agencies and researchers will consider open access to data and code to be an integral part of any project, and that future studies will be planned and funded accordingly.
GATHER also requires that authors report a quantitative measure of the uncertainty associated with global health estimates, such as uncertainty intervals. Global health estimates are usually affected by multiple sources of error, such as measurement error during data collection, inability to register all cases or obtain a truly random sample, errors in adjusting input data for sources of bias, and the use of a model to calculate estimates [26,27]. Users of these estimates should be informed about their overall uncertainty. Best practices for calculation of uncertainty intervals, and especially for combining multiple sources of uncertainty, are an area of active research. By requiring that researchers report a quantitative measure of uncertainty, and that they state which sources of uncertainty are accounted for, we aim to advance science in this area.
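As a hedged illustration of how an uncertainty interval might combine more than one source of error, the Python sketch below propagates two hypothetical sources (sampling error in a survey estimate and uncertainty in a bias-adjustment factor) by simulation, then reports a percentile interval of the resulting draws. All distributions, parameter values, and function names are assumptions made for the example. Simulation is only one of several possible approaches, and the sketch deliberately ignores other sources, such as model specification, that a real analysis would need to declare as accounted for or not.

```python
import random


def uncertainty_interval(draws, level=0.95):
    """Percentile interval of simulated draws, using linear interpolation."""
    if not draws:
        raise ValueError("draws must not be empty")
    ordered = sorted(draws)

    def quantile(p):
        position = p * (len(ordered) - 1)
        lower = int(position)
        upper = min(lower + 1, len(ordered) - 1)
        fraction = position - lower
        return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

    tail = (1.0 - level) / 2.0
    return quantile(tail), quantile(1.0 - tail)


def simulate_adjusted_prevalence(n_draws=10000, seed=1):
    """Draws of a bias-adjusted prevalence under two hypothetical error sources."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    draws = []
    for _ in range(n_draws):
        measured = rng.gauss(0.20, 0.02)     # assumed sampling error of the survey
        bias_factor = rng.gauss(1.10, 0.05)  # assumed uncertainty of the adjustment
        draws.append(measured * bias_factor)
    return draws


if __name__ == "__main__":
    low, high = uncertainty_interval(simulate_adjusted_prevalence())
    print(f"95% uncertainty interval: {low:.3f} to {high:.3f}")
```

Reporting such an interval together with a plain statement of which error sources were simulated, and which were not, addresses checklist items 13 and 16 at once.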
The field of global health estimates is rapidly evolving because of increasing availability of health data and innovation in statistical methods. The reporting guidelines presented here are designed to be flexible enough to guide reporting of estimates regardless of the underlying data availability and the complexity of the statistical methods. We anticipate that, as experience with these guidelines accumulates, methods and data evolve, and suggestions for improvements are made, GATHER will evolve as well. The explanation and elaboration document, available on the GATHER website, will be a living document that will be updated and clarified as needed, based on accumulated experience using the guidelines. We encourage submission of users' comments via our website.