Are cause of death data fit for purpose? evidence from 20 countries at different levels of socio-economic development

Background and objective Many countries have used the new ANACONDA (Analysis of Causes of National Death for Action) tool to assess the quality of their cause of death data (COD), but no cross-country analysis has been done to verify how different or similar patterns of diagnostic errors and data quality are in countries or how they are related to the local cultural or epidemiological environment or to levels of development. Our objective is to measure whether the usability of COD data and the patterns of unusable codes are related to a country’s level of socio-economic development. Methods We have assessed the quality of 20 national COD datasets from the WHO Mortality Database by assessing their completeness of COD reporting and the extent, pattern and severity of garbage codes, i.e. codes that provide little or no information about the true underlying COD. Garbage codes were classified into four groups based on the severity of the error in the code. The Vital Statistics Performance Index for Quality (VSPI(Q)) was used to measure the overall quality of each country’s mortality surveillance system. Findings The proportion of ‘garbage codes’ varied from 7 to 66% across the 20 countries. Countries with a high SDI generally had a lower proportion of high impact (i.e. more severe) garbage codes than countries with low SDI. While the magnitude and pattern of garbage codes differed among countries, the specific codes commonly used did not. Conclusions There is an inverse relationship between a country’s socio-demographic development and the overall quality of its cause of death data, but with important exceptions. In particular, some low SDI countries have vital statistics systems that are as reliable as more developed countries. However, in low-income countries, where most people die at home, the proportion of unusable codes often exceeds 50%, implying that half of all cause-specific mortality data collected is of little or no use in guiding public policy. Moreover, the cause of death pattern identified from the data is likely to seriously under-represent the true extent of the leading causes of death in the population, with very significant consequences for health priority setting. Garbage codes are prevalent at all ages, contrary to expectations. Further research into effective strategies deployed in these countries to improve data quality can inform efforts elsewhere to improve COD reporting systems.


Introduction
A key source of evidence for targeting health interventions to improve population health is high quality cause of death (COD) data that reliably document trends in mortality for different diseases and injuries [1]. However, several studies have demonstrated that policy and practice in many countries are based on data that are far from accurate [2][3][4][5][6]. In order to target efforts to improve the utility of COD data for policy, it is important to first understand what the key diagnostic errors are.
A major problem with COD data is poor cause of death certification practices that result in 'garbage codes', i.e. codes that provide little or no information about the true underlying cause of death [7]. Garbage codes include what are often called 'Ill-defined" causes, but encompass a larger universe of uninformative diagnoses. The major consequence of garbage codes is that they obscure the true mortality pattern in a population. For example, if a death certificate only states septicemia as the cause of death, there is no way of knowing whether this resulted, for example, from an infected wound following an accident, from a post-operative amputation due to diabetes, or from meningitis or pneumonia, each of which would require different preventive strategies. If the underlying cause that led to septicemia is not indicated on the death certificate, public policy to prevent these deaths would be misinformed, potentially leading to inefficient resource allocation to prevent them.
COD data provide the essential health intelligence for health policies across countries at various levels of socio-economic development. Our premise is that a better understanding of garbage codes, i.e. their levels and patterns in countries at different stages of socio-economic development, will help to target improvements in COD reporting systems. In this study, we investigate whether the usability of COD data and the patterns of garbage codes are related to a country's socio-economic development using the ANACONDA software tool [8,9,S1 File] to assess the quality of 20 national COD datasets. Several countries have used this tool to assess how fit for purpose their data are [10][11][12], but there has not been any cross-country analysis of data quality across a range of socio-economic development levels and COD reporting systems using the common ANACONDA framework.
The implication of our findings for public policy to improve population health is that if the relationship is found to be very weak, or non-existent, then efforts to improve national Civil Registration and Vital Statistics systems can expect to make significant progress towards improving the evidence base for policy without depending on further socio-economic development.

Data and methods
We carried out a cross-sectional study using publicly available data from the WHO Mortality Database [13], which contains COD data reported by its Member States. The 20 countries were selected on the basis that they used ICD-10 [14], had provided data to WHO for a relatively recent year (2012- 16), were located in all major regions of the world and differed in levels of socio-economic development. Population data were taken from the UN World Population Prospects 2017 [15], with the youngest age group of 0-4 years divided into 0-and 1-4-years age groups using Sprague's interpolation [16].
We classified a country's level of development, using the Socio Demographic Index (SDI) score from the Global Burden of Disease Study (GBD), into three levels: High, Middle and Low. The SDI is a summary measure of development expressed on a scale from 0 to 1 taking into account the total fertility rate, years of schooling, and gross national income [17]. For the lowest SDI level, the WHO Mortality Database only contained the few countries we selected; for the Middle and High SDI levels we selected countries with recent data from different regions of the world. Our study included 4 Low SDI countries, 10 Middle SDI, and 6 High SDI countries.
On the basis of the country specific ICD-10 codes used in GBD 2017 [18], the most severe certification and coding errors that can mislead policy and public health planning were identified and categorized into four groups. The four-tier garbage code typology used in ANA-CONDA is based on the premise that some garbage codes are worse than others depending on how serious their impact is for guiding or misguiding policy debates and will thus likely impact disease and injury control strategies differently [19]: • Level 1 (very high)-codes with serious policy implications. These are causes for which the true underlying COD could in fact belong to any of three broad cause group (i.e. it is impossible to establish whether the true cause was a communicable disease, a non-communicable disease or because of an injury, a good example being 'septicaemia' reported as the underlying cause of death). These are the most serious mis-diagnoses among the universe of unusable codes, since they could potentially grossly misinform understanding of the extent of epidemiological transition in the population.
• Level 2 (high)-codes with substantial implications for policy. These are causes for which the true underlying COD is likely to belong to one or two of the three broad cause groups; for example, 'essential (primary) hypertension'.
• Level 3 (medium)-codes with important implications for policy. These are causes for which the true underlying COD is likely to be within the same ICD chapter, for example, 'unspecified cancer', and thus are of some policy value.
• Level 4 (low)-codes with limited implications for policy. These are diagnoses for which the true underlying COD is likely to be confined to a single disease or injury category (e.g. unspecified stroke would still be assigned as a stroke death, and not to some other disease category). The implications of unusable causes classified at this level will therefore be much less important for public policy, but a more specific code would have increased their utility for specific public health analyses.
A full list of the composition of specific ICD-10 garbage codes for each of the four severity levels is given in S2 File.
Given the considerable differences in population age structure between countries at high and low levels of socio-economic development, we age standardised the pattern of garbage codes. The point of age-standardising was to investigate whether countries with a comparatively old age structure, and hence relatively high average age at death, might expect to have a greater fraction of garbage codes simply because of the higher likelihood of multiple co-morbidity in the elderly. We used the global age distribution of deaths from the latest GBD Study as the standard [20].
In addition to diagnostic accuracy, the ability of any dataset to describe the true mortality pattern in a population also depends on how complete it is, both in terms of capturing all deaths that occur, and in assigning each a COD. Completeness of the COD reporting (i.e. the percentage of actual deaths in a population that are assigned a COD) for each of the 20 countries was calculated using the Adair-Lopez empirical method incorporated into ANACONDA [21]. The empirical method models the relationship between the Crude Death Rate (CDR) and its principal determinants, namely the age structure of the population and the overall level of mortality, as reflected by the level of child mortality. The predicted CDR based on these input variables for a population is then compared with the observed CDR to estimate death registration completeness. Given that the model was built largely from historical data where the levels of adult mortality and child mortality are closely correlated, the predictions of completeness for populations where this assumption is not valid, such as those severely affected by HIV, should be interpreted cautiously.
Datasets that are both incomplete and have a high proportion of garbage codes provide limited insight into the true health status of a population. We combined the proportion of unrecorded deaths with the amount of garbage codes to provide a summary measure of the utility of the data for policy. This indicator is particularly important when investigating data quality in countries with low completeness where the data available may only come from hospitals and other health facilities where diagnostic facilities and physician availability is greater, potentially over-stating the policy utility of the data.
A key output of any mortality surveillance system is a table showing the leading causes of death for the population. In countries where garbage codes are commonly assigned, they frequently appear among the 10 or 20 leading causes and can seriously impact the overall utility of the COD data. This is particularly the case when they permeate the top 10 leading causes and are "high impact", providing little or no useful information for policy.
The ANACONDA software tool specifically developed for assessing quality of mortality and COD data, was used to investigate each dataset (S1 File) to identify the pattern and extent of garbage codes in the data, their frequency among the leading causes, the completeness of the dataset and to provide an overall summary index of the quality of the output of the mortality data system, namely the Vital Statistics Performance Index for Quality (VSPI(Q)) [22].

Results
The proportion of garbage codes in the 20 country datasets varied substantially by country, ranging from 7% to 66% (see Fig 1). While the relationship between SDI and amount of garbage codes in the data is broadly apparent, the relatively low R 2 (0.17) arises from the presence of outliers, particularly Uzbekistan, Kyrgyzstan, Nicaragua and Colombia. It is quite possible that specific certification and coding procedures have been introduced in these countries to avoid the use of garbage codes. If these countries are omitted, the strength of the inverse relationship is much more apparent. Further insights into the general characteristics of population and mortality of the selected countries can be found in S2 File.
The mean values for each SDI group indicate that, on average, countries of High SDI status had a lower (23.8%) proportion of garbage codes in their data than Middle (39.7%) and Low (40.0%) SDI countries (Table 1). Interestingly, there was large inter-country variation in the use of garbage codes within each SDI level. For example, of the High SDI countries, Finland had the lowest amount of garbage codes (7%) in their data, whereas in Japan (36%) and France (34%), the level was five times higher, affecting about one in three deaths. For the Middle SDI group, Argentina, Thailand and Tunisia had more than half of all CODs coded to a garbage code (in Thailand, close to 80% of these were high impact errors), while for other countries in this group, notably Uzbekistan, Turkey and Colombia, the use of garbage codes was much less prominent. Surprisingly, among the Low SDI countries, Kyrgyzstan and Nicaragua had comparatively low levels of garbage, (18% and 25%, respectively), whereas in Egypt, at a similar SDI level, two-thirds of all deaths are coded to garbage codes and of these, over 80% were of high impact (Table 1).
Importantly, the fraction of high impact garbage codes ranged from a low of 6% (Finland) to 57% (Egypt), and for low impact codes, from 1% (Finland) to 18% (Brazil). Countries with high socio-demographic development had a lower proportion of high impact garbage codes (mean 16%) compared with Middle (26%) and Low (29%) SDI countries (Table 1). Ranking countries according to their percent of high impact garbage codes reveals a very substantial gap between the best and worst performing COD information systems and a surprising mixture of SDI levels (Fig 2). Kyrgyzstan has the second-best performing system, after Finland, with COD data in Colombia of almost equal quality to the UK, and that in Nicaragua falling between Australia and Canada. Uzbekistan, Turkey, Brazil and Jordan all assign less causes to high impact garbage codes than both Japan and France.
The impact of population age structure on the overall level of garbage codes varied across countries but was generally small, contrary to what might have been expected (Table 1). In Japan and Argentina, the age-adjusted fraction of garbage codes was 4-6% points lower than the un-adjusted fraction, while in France and South Africa it increased by a similar amount (column a). Age-standardisation, therefore, had no impact on the mean level of garbage codes in each development category.
Age-standardisation, however, masks the age pattern of garbage coding, particularly its relative importance at younger adult ages where accurate and specific diagnoses are critical for guiding policies designed to prevent premature deaths. The perception that garbage codes are largely confined to deaths among the elderly due to the presence of co-morbidities at or around the time of death is not confirmed by the age-specific fractions of garbage codes shown

PLOS ONE
Are cause of death data fit for purpose?
in Jordan and Kyrgyzstan for both sexes, this pattern is already evident from age 5; in Uzbekistan, garbage coding is more common for deaths of children and adolescents than at ages 70 and above. Given that the utility of a country's mortality information system is reduced not only by the amount of garbage codes but also by how complete it is in terms of capturing all deaths, we measured their combined impact. For two countries, namely Jordan, and Tajikistan, the consolidated indicator of registration completeness and fraction of garbage codes was close to 75%, suggesting that useful information on only about one quarter of all deaths that occur in Jordan and Tajikistan is available for policy purposes (Col. C Table 1). Overall, the combined indicator showed the proportion of deaths for which information was either missing or of little or no value for guiding health policy was much higher in Low (49%) and Middle (48%) SDI countries compared to High (24%) SDI countries. Table 2 shows the strong influence of garbage codes on the leading cause distribution when these are ranked, with high impact garbage codes marked in red and low impact in orange. The presence of High impact garbage codes among the 10 leading causes of death will substantially distort the true picture of what are the common COD that most people die from. Among High SDI countries only Japan and France have high impact garbage codes (for males) among the ten leading causes of death. Canada, Australia and UK had only low impact causes in this category, and Finland had neither. For the Middle and Low SDI countries there were many more high impact garbage codes listed among the leading COD, with Egypt having seven, Tajikistan five, and Thailand and Tunisia each with four. Colombia and Nicaragua had none. For females, Egypt had seven, Iran, Tunisia and Tajikistan five, with the remaining countries having between one and three.
Notwithstanding some variation in patterns and volume of garbage coding across countries, the most common garbage codes were remarkably similar. In particular, Other ill-defined and unspecified causes of death, Senility, Heart failure, Unspecified neoplasm, Septicemia, Respiratory failure, Unknown cause of death, Hypertension and Unspecified diabetes were observed across all SDI levels. In addition, Low SDI countries tended to report Atherosclerosis, Hepatic failure, Intracerebral hemorrhage and Unattended deaths, all garbage codes, as leading causes.
Fig 4 ranks countries according to a single consolidated summary measure of system performance, namely the Vital Statistics Performance Index for Quality (VSPI(Q)). Eleven countries scored 70 and above, a level where they could be considered as having well-functioning systems. Six of the remaining countries achieved scores that would classify them as having medium performing systems, with lower scores mostly arising from the high proportion of garbage codes that bias their COD distributions. Of the 20 countries, only Jordan, Tajikistan, and Tunisia were classified as having poorly functioning systems.

Discussion
In general, one would expect that in High SDI countries, where all deaths are medically certified with good quality clinical and diagnostic services, comparatively few deaths would be assigned unusable (garbage) codes, and certainly much less than in countries where such services are less common. Our findings based on an analysis of 20 COD datasets from across the world produced evidence both for and against this hypothesis. France and Japan do not seem to have more reliable COD data to guide policy than Turkey, Colombia, Kyrgyzstan and Nicaragua. In France, 1 in 3 deaths is assigned a garbage code, with 9% of all deaths being certified as due to "Other ill-defined and unspecified causes of death" (R99), Heart failure (I50.9) or Respiratory arrest (R09.2). Similarly, in Japan, "old age" (Senility R54) is a commonly assigned cause of death accounting for one fifth of all garbage codes. Other research has found that

PLOS ONE
Are cause of death data fit for purpose?
dementia death rates at 85 years and above in reported COD data are, improbably, six times higher in Australia than in Japan, most likely resulting from different certification practices favoring the use of garbage coding in Japan [23]. Although High SDI countries on average had a lower proportion of garbage codes than Middle and Low SDI countries, the correlation was not very strong because of a number of outlier countries. For example, among the Middle and Low SDI countries Colombia, Kyrgyzstan and Nicaragua did better or as well as some of the High SDI countries. Furthermore, when we distinguished the garbage codes into high and low impact codes, surprisingly some low SDI countries (Kyrgyzstan and Nicaragua) had much lower (9-13%) high impact codes than several of the developed countries. The implication of this important finding is that targeted efforts to improve death registration completeness and minimize the use of garbage codes in COD data are possible to implement at comparatively low or medium levels of development, making the mortality data system much more useful for guiding public policy.
Both the pattern and volume of garbage codes among the 10 Middle SDI countries varied substantially. In Argentina, Tunisia, Thailand and South Africa, half of all deaths are being assigned to garbage codes, compared with less than one quarter in Colombia and Uzbekistan. A similar variation was found between the four Low SDI countries where Kyrgyzstan and Nicaragua had less than 25% of their deaths being assigned a garbage code compared to two thirds (65%) in Egypt and half (51%) for Tajikistan. Importantly, when garbage codes commonly appear among the top 10-20 leading causes of deaths, they diminish the policy value of the data by underestimating the true impact of other leading causes. Our findings reveal that in some countries up to 7 out of 10 leading causes are in fact high impact garbage codes, providing no useful information for guiding policy.
These findings are at the same time surprising and alarming. Countries spend considerable resources on maintaining their routine mortality surveillance systems. Improving completeness of death registration to ensure accurate all-cause mortality data is important to reliably monitor trends in mortality by age and sex. At the same time, accurate cause-specific mortality data are fundamental for guiding the formulation and evaluation of interventions to reduce mortality and premature deaths. Only by investigating the specific diagnostic practices of individual countries, along with knowledge of the proportions of hospital and community deaths, would it be possible to comment on what leads to poor diagnostic practices in many countries. As our analysis demonstrates, many countries do not derive maximum policy benefit from these data due to the high, and in some cases, very high, prevalence of severe garbage codes. This, in part, could be due to lack of understanding and appreciation of the importance among certifying doctors of the public health value of correctly certified cause of death data, reflecting in turn the inadequate training many of them are receiving in how to certify correctly causes of deaths. To decrease the amount of garbage codes, effective strategies are required to train doctors in correct medical certification and in understanding why doing so is critically important for improving the population's health, as well as using automated verbal autopsy for those community deaths that cannot be medically certified. Another contributing factor to the higher proportions of garbage codes observed in some countries is likely to be the higher proportion of community deaths occurring without medical assistance, particularly in Low and Middle SDI countries. This need not automatically be the case, however. In Greenland, where about 10% of all deaths occur in remote small settlements with no physician present, these deaths are certified by a nurse, a health worker or another official and reported to the Chief Medical Officer who assigns the final ICD code. As a result, the VSPI for Greenland was higher than one might have expected, with a medium quality performance score of 66% [24]. However globally, the proportion of community deaths is estimated to be about 2/3rds of all deaths, most of which in low-income countries are not medically certified and therefore more likely to end up with a garbage code [25].
Although our results confirmed that there is an inverse relationship between the SDI level and amount of garbage codes, they also showed that some countries, despite relatively low socio-economic development, have managed to develop their vital statistics systems sufficiently to provide data that are fit for purpose. Five countries classified as being of middle or low SDI (Turkey, Brazil, Colombia, Kyrgyzstan and Nicaragua) had VSPI(Q) scores high enough to be considered to have well performing reporting systems. These five countries have invested in improving the quality of their mortality reporting systems [4]. Much could be learned from their experiences about what strategies were used to ensure that their systems provide policy relevant data to improve population health and survival. Conversely, Jordan, Tunisia and Tajikistan returned the lowest VSPI(Q) scores suggesting that their systems will need considerable improvement, both in COD quality and in completeness.
Interestingly, although there was some variation across countries in the number and ranking of garbage codes, they were remarkable similar. For instance, all were misdiagnosed noncommunicable diseases, suggesting that the countries in our sample were all reasonably well advanced in their epidemiological transition. Heart failure, Senility and Other ill-defined causes were commonly used garbage codes in all countries, irrespective of their SDI level, with the only difference being where they appeared in the ranking among the leading causes of death. For example, Heart Failure was often ranked as the top leading cause in Low SDI countries, while in the Middle and High SDI countries it appeared at the 3 rd or 4 th rank. But the most notable difference was that Low and Middle SDI countries had many more garbage codes among the leading causes, and particularly those having the greatest impact for policy such as Senility, Hypertension and Other ill-defined.
A limitation of this study comes from the fact that it only investigates the output of the mortality system and not the amalgam of procedures and practices that collectively produce the data; hence conclusions about what underlies the observed differences are mostly speculative.
Another limitation was the small number of Low SDI countries included, which was due to lack of publicly available COD data. Further, data sets for most countries are at least 5 years old (2012-16) and may not adequately reflect improvements in the interim in both the mortality and the socio-economic situation of the country, which was assessed based on 2017 data.
An unexpected finding was that garbage codes were common, not only at the older ages, but worryingly constituted sizeable proportions in most age groups including children and adolescents under 20 years. The implication is that the evidence base for correctly understanding which are the leading causes of deaths in different age groups and for guiding health interventions designed to prevent premature deaths is likely to be significantly distorted by poor diagnostic practices. More reliable cause of death data will better inform debates about health sector priorities and strengthened health system responses, which can be expected to lead to better health and survival.