Response to Murray and King

Posted by plosmedicine on 31 Mar 2009 at 00:28 GMT

Author: David Stuckler
Position: Response to Murray and King
Institution: University of Cambridge
E-mail: ds450@cam.ac.uk
Additional Authors: Lawrence P. King, Sanjay Basu
Submitted Date: July 22, 2008
Published Date: July 22, 2008
This comment was originally posted as a “Reader Response” on the publication date indicated above. All Reader Responses are now available as comments.

We thank Megan Murray and Gary King for their Perspective on our article and their endorsement of the need for serious evaluation of the health effects of IMF loans (1). As they note, such research can only be undertaken using observational data, which poses challenges that differ from those posed by randomised controlled trials. Although Murray and King state that our study “went very far” to address these well-known limitations and that “the study and its conclusion should be taken seriously”, their commentary seems to have overlooked or misread some of the study's most important features. As a result, their comments on our design, methodology and findings are often incorrect or misleading, particularly when discussing the study's limitations, and the issues of randomization and controls.

First, Murray and King appear to have misunderstood the hypothesized pathway we are testing. They state that our estimates “did not change after adjusting for factors expected to mediate the impact of the loans, such as HIV prevalence, incarceration rates, and variables reflecting macroeconomic changes.” This is incorrect. We used data on HIV and financial health to check that our findings were not biased by an unaccounted-for third factor (confounding), not as factors expected to mediate the IMF program’s effect. The hypothesis we tested was the commonly proposed pathway whereby IMF programs reduce government spending, which translates into reduced health system capacity and consequently worse tuberculosis outcomes (2).

Second, Murray and King misreport some of our study’s basic findings. They state: “Although the IMF loans were associated with a fall in DOTS, controlling for this variable had no effect on the strength of the association between loans and TB deaths. This result emphasizes the complex and confusing pathways by which macroeconomic policies lead to specific health effects.” This is also incorrect. As shown in the manuscript's Table 1 and detailed in SI Text 5, the effect size of IMF programs dropped by close to 25% when controlling for the DOTS pathway. We also replicated the findings of Christopher Murray and the WHO, who independently note that DOTS does not yet explain significant cross-country variations in tuberculosis rates (3,4). This is why we undertook the further mechanistic investigation detailed in the study. It is perfectly sensible for IMF programs to affect DOTS, but also to affect TB outcomes independent of DOTS, via the mechanisms we tested.

Third, Murray and King suggest a major limitation of the study is that “IMF loans are not randomly assigned”, but overlook the past half-century of methodological developments for addressing this traditional issue as well as some of our more novel epidemiological approaches. Because we cannot randomize countries’ participation in economic policy experimentation, for practical and ethical reasons, there is of course always a possibility that an observational study may be biased by an unaccounted-for third factor (confounding). However, in labelling the inability to randomize as a major study limitation, they greatly mislead the reader. Numerous statistical models replicate a random-intervention-assignment pattern by observing, and holding constant, the key differences between units of analysis. By holding constant all the fixed differences within countries and seeking to correct for all those that are changing over time, we can approximate the random treatment-assignment pattern between control and experimental groups needed to isolate the specific IMF effect. Experiments of this type are similar in nature to those employed by John Snow in his legendary cholera study (5), and are in fact the basis for the assessments that the IMF itself carries out, as well as for nearly all of the empirical economics literature (see, for example, IMF staff papers, http://www.imf.org/Extern...). We dedicated a lengthy discussion in supporting information text S7 to the issues associated with observational studies, and some of the statistical approaches available for achieving the benefits of randomization.

Fourth, Murray and King cite an obvious potential confounder to the study but neglect how we have explicitly considered and evaluated it in the study using multiple approaches. They state that “countries receive IMF loans because of pressing financial problems that may affect both short- and long-term health status, quite apart from the conditionality imposed by the IMF” (“confounding by indication”). As explained in the paper, we considered the prospect that the IMF might have been an innocent bystander or trying to offset economic breakdown in some countries. We included a number of methods to avoid falsely linking the IMF to tuberculosis rises caused by macroeconomic or other societal problems (6, p. 3). If the story about variations in tuberculosis rates were simply economic collapse, then the biggest determinant would be changes in GDP. Instead we found that changes in GDP, while significant, could not account for the TB trend differences seen across countries. Many of the countries in the region did not face economic collapse and were not part of the former Soviet Union. In some countries, like Estonia, tuberculosis was rising even as real GDP was rising (6, see SI Text 1). Therefore, other mechanisms must have been at play. Nonetheless, we corrected for over ten different indicators of economic downturn, including high inflation, bank crisis, plummeting GDP, and more, and found a robust effect of IMF loans independent of these measures of financial health. Table 1 further summarizes the multiple ways in which our study addressed the possibility of an unaccounted-for confounding factor beyond these traditional approaches.

We also exploited the temporal dimension of the data to evaluate which came first, the tuberculosis rises or the IMF program. If the IMF were an innocent bystander, then tuberculosis rates would have been on the rise prior to its arrival. Instead, we found evidence that tuberculosis rates took off either when or after the IMF came onto the scene, not before (6, see Granger- and Sims-causality tests, SI Text 4). When countries left the IMF program, tuberculosis rates fell by close to the same amount as they had risen.
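
To make the temporal-ordering logic concrete, here is a minimal Python sketch of a one-lag, Granger-style test on a single simulated series. It is an illustration only, not the analysis reported in SI Text 4: the series names `program` and `tb`, the simulated coefficients, and the helper functions are all hypothetical, and the actual study used panel data with additional controls.

```python
import random

def residual_sum_of_squares(columns, y):
    """Fit y on the given regressor columns by least squares; return the RSS."""
    k, n = len(columns), len(y)
    # Normal equations (X'X) b = X'y, stored as an augmented k x (k+1) matrix.
    a = [[sum(columns[i][t] * columns[j][t] for t in range(n)) for j in range(k)]
         + [sum(columns[i][t] * y[t] for t in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    b = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(b[i] * columns[i][t] for i in range(k)) for t in range(n)]
    return sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))

def granger_f(cause, effect):
    """F statistic for the null 'lagged cause adds nothing beyond lagged effect'."""
    y = effect[1:]
    ones = [1.0] * len(y)
    lag_effect, lag_cause = effect[:-1], cause[:-1]
    rss_restricted = residual_sum_of_squares([ones, lag_effect], y)
    rss_full = residual_sum_of_squares([ones, lag_effect, lag_cause], y)
    dof = len(y) - 3  # observations minus parameters in the full model
    return (rss_restricted - rss_full) / (rss_full / dof)

# Simulated data (an assumption for illustration): the program moves first
# and the outcome follows one period later.
rng = random.Random(0)
program = [rng.gauss(0, 1) for _ in range(200)]
tb = [0.0] + [0.8 * program[t - 1] + rng.gauss(0, 0.1) for t in range(1, 200)]

f_forward = granger_f(program, tb)  # does the program help predict later TB?
f_reverse = granger_f(tb, program)  # does TB help predict the later program?
```

In this simulated setting the forward statistic is very large and the reverse one is small, which is the pattern one would expect if the outcome follows the program rather than precedes it.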

Fifth, Murray and King misrepresent our measurement of IMF programs. They state that the “measure of the size of loans for conditionality is an imperfect proxy”, as though it were the only measure we used. We explicitly agreed with Murray and King’s point in the paper, which is why we tested our hypothesis using multiple measures of IMF programs: an indicator for IMF participation, a measure of the size of the IMF loan, and a measure for the number of years exposed to the IMF loans.

Sixth, Murray and King suggest that another study limitation is that we “have many fewer independent pieces of information than the raw number of observations reported.” We reported our observations in the same way as any longitudinal study. If our analysis of two decades of data from all countries in the region truly had too few observations, as Murray and King appear to propose, we would not have had sufficient power for the results to be statistically significant, which they were. We corrected our statistical tests for the fact that the observations were not independent of each other by clustering the standard errors by country, as described in our methods and tables.
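
The effect of clustering can be illustrated with a small Python sketch. It assumes a single regressor, data that have already been demeaned, and no small-sample correction, so it is a simplified stand-in for the models in the paper rather than a reproduction of them. The function name and the toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def slope_and_standard_errors(x, y, clusters):
    """Regression through the origin (x and y assumed already demeaned).

    Returns the slope, the conventional standard error that treats every
    observation as independent, and a cluster-robust ("sandwich") standard
    error that allows errors to be correlated within each cluster.
    """
    sxx = sum(a * a for a in x)
    beta = sum(a * b for a, b in zip(x, y)) / sxx
    resid = [b - beta * a for a, b in zip(x, y)]

    n = len(x)
    se_naive = sqrt(sum(e * e for e in resid) / (n - 1) / sxx)

    # Sum x * residual within each cluster before squaring, so that
    # correlated errors inside a cluster are not counted as independent.
    score = defaultdict(float)
    for a, e, g in zip(x, resid, clusters):
        score[g] += a * e
    se_cluster = sqrt(sum(s * s for s in score.values())) / sxx
    return beta, se_naive, se_cluster

# Toy panel (hypothetical): four countries, three yearly observations each,
# with an error term that is identical across years within a country.
clusters = ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3
x = [1.0] * 3 + [-1.0] * 3 + [2.0] * 3 + [-2.0] * 3
shock = [1.0] * 3 + [1.0] * 3 + [-1.0] * 3 + [-1.0] * 3
y = [2.0 * a + u for a, u in zip(x, shock)]

beta, se_naive, se_cluster = slope_and_standard_errors(x, y, clusters)
```

Because the twelve observations here carry only four independent pieces of error information, the cluster-robust standard error comes out larger than the conventional one, while the slope estimate itself is unchanged.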

Seventh, we were surprised to see Murray and King confuse important statistical techniques with limitations of the study. They write that “in addition to this lack of randomization, the investigators included all of the Eastern European and former Soviet Union countries in their study, rather than comparing countries that received an IMF loan with, say, otherwise similar countries that just missed the threshold to qualify for a loan.” This is not a limitation, but a standard feature of cross-national observational studies. Such studies include all countries in a region, and through the statistical analysis separate control from experimental groups in order to identify the ‘average causal effects’. Virtually all empirical public policy analyses proceed in this manner (see for example Christopher Murray et al. (7)).

The selection of the EE/FSU countries as a quasi-natural experimental setting was also carefully thought through. The EE/FSU countries started with low tuberculosis levels and a set of similar health infrastructures and legacies, and took on a uniquely similar set of IMF loans and textbook reform programs. This makes the experimental study more credible than, say, comparing an eastern European country to an African, Asian, or Latin American one. We went further, using a conservative modelling technique called ‘fixed effects’, which holds constant any historical differences between countries, including historical legacies, membership in the former Soviet Union and national surveillance systems, to develop a before-and-after comparison of the IMF program’s effect within countries.
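
A minimal Python sketch of the ‘fixed effects’ idea follows. It assumes one regressor and uses invented toy numbers; the country labels, variable names, and helper functions are hypothetical and are not taken from the study.

```python
from collections import defaultdict

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each group's own mean, removing anything fixed within the group."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

def fixed_effects_slope(x, y, groups):
    """Within-country ('fixed effects') slope: OLS on country-demeaned data."""
    return ols_slope(demean_by_group(x, groups), demean_by_group(y, groups))

# Toy panel (hypothetical): two countries with different fixed baselines.
# Within each country the outcome rises by 2 per unit of exposure.
country = ["A", "A", "A", "B", "B", "B"]
exposure = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
baseline = {"A": 10.0, "B": 1.0}
outcome = [baseline[c] + 2.0 * e for c, e in zip(country, exposure)]

pooled = ols_slope(exposure, outcome)                      # mixes in baselines
within = fixed_effects_slope(exposure, outcome, country)   # baselines removed
```

In this toy example the pooled slope is negative, because the more exposed country happens to start from a lower baseline, whereas the within-country slope recovers the value of 2 built into the data. That is the sense in which fixed effects hold constant time-invariant differences between countries.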

Another type of statistical control group, or ‘counterfactual’, we developed was a ‘hazard of IMF participation’ statistic. We modelled what factors caused countries to borrow from the IMF (checking whether financial health really was the reason). This offers a statistical equivalent to the call that Murray and King make to include similar countries that just miss the threshold for loans, which would be impossible to do in practice given that the IMF's loan evaluations are kept secret. Our models revealed that the countries that took on IMF programs actually had characteristics that would have predisposed them to tuberculosis declines, had they not participated in the program.

A further ‘control group’ we used was to assess the effect of non-IMF loans of similar size, duration, timing, and locale. If the IMF were an innocent bystander, we would have observed a similar effect with non-IMF loans after holding constant financial health. Instead, we found that these ‘control loans’, which differed only insofar as they did not require the strict IMF conditionalities, had the opposite impact to that of the IMF loans (6, Figures 2-4).

Eighth, Murray and King also confuse other aspects of the study in their ‘study limitations’ section. They state that “IMF loans are highly heterogeneous, and each type of loan may have massively different effects across countries and time periods.” Actually, as we discuss in the manuscript, the IMF programs were, according to considerable social science evidence, surprisingly homogeneous in the post-Communist context (8-10). This relates to Murray and King's misleading statement that while we use “special ‘robust standard errors’” ... “if this approach makes a difference, it also indicates that an aspect of their model was misspecified.” In fact, use of robust standard errors is a powerful way to make sure the significance tests for the average estimated effect of the intervention are robust to potential heterogeneity and misspecification. As King himself has written, “to avoid inconsistent standard errors, I report only robust standard errors, an alternative standard error estimator that is consistent in the presence of many types of misspecification” [italics added] (11, p. 168). Murray and King seem to think we applied this technique in an ad hoc manner, when in fact all of our models use robust standard errors. Moreover, we reported that none of the models’ results changed when this method was added or removed. Throughout the paper, we reported the confidence intervals and effect sizes that reflected the most conservative results, making it more difficult to reject the null hypothesis of no IMF effect. This is simply good practice and good science.

Lastly, Murray and King offer a misleading comment that, “The scientific status of the authors’ conclusions necessarily remains uncertain.” Of course, any study results have uncertainty (hence, we report confidence intervals, and perform multiple checks and models), but the "scientific status" is not uncertain. Rather, our study follows the best practices in epidemiologic and econometric methods. Our study was reviewed by six economists and statisticians during the peer review process, who all agreed with the robustness of the study's methods and results. As summarized by an anonymous statistics reviewer: "The statistical analyses are thoughtfully done and reported - possibly to a fault. Some of the statistical methods used are extremely complex, but the authors have done about the best job possible in explaining and justifying them."

In an environment that should encourage evidence-based public health policy, observational studies of actual policy implementation are crucial. Economic policies are experiments too, and their impacts on health (potentially positive or negative) are very real for millions of people. Not to perform these studies would, at best, be unscientific. At worst, it would risk unintended public health disasters. Our study is, we hope, just the beginning.

Table 1: Study methods and findings

Please access Table 1 via the following link: http://www.plos.org/press...

References

1. Murray M, King G (2008) The effects of international monetary fund loans on health outcomes. PLoS Med 5(7): e162.

2. Kim JY, Millen J (2000) Dying for Growth: Global inequality and the health of the poor. Common Courage Press.

3. Obermeyer Z, Abbott-Klafter J, Murray CJL (2008) Has the DOTS Strategy Improved Case Finding or Treatment Success? An Empirical Assessment. PLoS ONE 3(3): e1721. doi:10.1371/journal.pone.0001721

4. World Health Organization (2008) Global tuberculosis control: surveillance, planning, financing. Available: http://www.who.int/tb/publications/global_report/2008/en/index.html. Accessed 31 March 2008.
From the report: “this ecological analysis provides no evidence that the standard, direct measures of DOTS implementation - case detection and treatment success in various combinations - can yet explain the variation in incidence trends among countries, despite the wide variation in DOTS implementation among countries. This observation suggests - subject to further investigation - that DOTS programmes have not yet had a major impact on TB transmission and incidence around the world.”

5. Snow J (2004) On the mode of communication of cholera. Delta Omega Society. Available: http://www.deltaomega.org... Accessed 15 July 2008. See also http://www.ph.ucla.edu/ep...

6. Stuckler D, King LP, Basu S (2008) International Monetary Fund programs and tuberculosis outcomes in post-communist countries. PLoS Med 5(7): e143. doi:10.1371/journal.pmed.0050143.

7. Lu C, Michaud CM, Gakidou E, Khan K, Murray CJ (2006) Effect of the Global Alliance for Vaccines and Immunisation on diphtheria, tetanus, and pertussis vaccine coverage: an independent assessment. Lancet 368: 1088–1095.

8. Wedel JR (2003) Collision and collusion: the strange case of Western aid to Eastern Europe, 1989–1998. London: MacMillan.

9. Stefanov SF (2004) The neoliberal platform of the transition to market economy: specifics and consequences (in Russian).

10. Stiglitz J (2003) Globalization and its discontents. New York: WW Norton & Company.

11. King G (1990) Electoral responsiveness and partisan bias in multiparty democracies. Legislative Studies Quarterly. 15(2):159-81.

No competing interests declared.