Verbal Autopsy: Evaluation of Methods to Certify Causes of Death in Uganda

  • Arthur Mpimbaza,

    Affiliations Child Health & Development Centre, College of Health Sciences, Makerere University, Kampala, Uganda, Infectious Diseases Research Collaboration, Kampala, Uganda

  • Scott Filler,

    Affiliation Global Fund to Fight AIDS, Tuberculosis and Malaria, Geneva, Switzerland

  • Agaba Katureebe,

    Affiliation Infectious Diseases Research Collaboration, Kampala, Uganda

  • Linda Quick,

    Affiliation Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America

  • Daniel Chandramohan,

    Affiliation London School of Hygiene & Tropical Medicine, London, United Kingdom

  • Sarah G. Staedke

    Affiliations Infectious Diseases Research Collaboration, Kampala, Uganda, London School of Hygiene & Tropical Medicine, London, United Kingdom


To assess different methods for determining cause of death from verbal autopsy (VA) questionnaire data, the intra-rater reliability of Physician-Certified Verbal Autopsy (PCVA) and the accuracy of PCVA, expert-derived (non-hierarchical) and data-driven (hierarchical) algorithms were assessed for determining common causes of death in Ugandan children. A verbal autopsy validation study was conducted from 2008–2009 in three different sites in Uganda. The dataset included 104 neonatal deaths (0–27 days) and 615 childhood deaths (1–59 months) with the cause(s) of death classified by PCVA and physician review of hospital medical records (the ‘reference standard’). Of the original 719 questionnaires, 141 (20%) were selected for a second review by the same physicians; the repeat cause(s) of death were compared to the original, and agreement was assessed using the Kappa statistic. Physician reviewers refined non-hierarchical algorithms for common causes of death from existing expert algorithms, from which hierarchical algorithms were developed. The accuracy of PCVA, non-hierarchical, and hierarchical algorithms for determining cause(s) of death from all 719 VA questionnaires was determined using the reference standard. Overall, intra-rater repeatability was high (83% agreement, Kappa 0.79 [95% CI 0.76–0.82]). PCVA performed well, with high specificity for determining cause of neonatal (>67%) and childhood (>83%) deaths, resulting in fairly accurate cause-specific mortality fraction (CSMF) estimates. For most causes of death in children, non-hierarchical algorithms had higher sensitivity, but correspondingly lower specificity, than PCVA and hierarchical algorithms, resulting in inaccurate CSMF estimates. Hierarchical algorithms were specific for most causes of death, and CSMF estimates were comparable to the reference standard and PCVA. Intra-rater reliability of PCVA was high, and overall PCVA performed well. Hierarchical algorithms performed better than non-hierarchical algorithms due to higher specificity and more accurate CSMF estimates. Use of PCVA to determine cause of death from VA questionnaire data is reasonable while automated data-driven algorithms are improved.


Verbal autopsy (VA) is an indirect method of determining cause of death based on an interview with the caretakers of a deceased individual. It has been widely used to collect information on cause-specific mortality where vital registration systems are lacking and medical information on deaths is incomplete [1]. Different approaches to determining cause of death from VA interview information exist, including physician review, algorithms, and more recently, computerized coding of VA (CCVA), which can be either algorithmic or probabilistic in approach [2–4]. However, the optimal approach for determining causes of death from VA data is unclear, and has been the subject of debate [1,3,5].

The most widely used method is physician review, known as physician-certified VA (PCVA), in which physicians are trained to review questionnaire data and determine cause of death. Although the validity of PCVA has been evaluated [6], concerns about the repeatability of PCVA have been raised [4,7]. The level of agreement on causes of death certified by two independent physicians from VA (inter-rater repeatability) has been extensively studied [5,8–15]. However, very few published studies have assessed the repeatability of causes of death certified from VA by the same physicians at different time points (intra-rater reliability) [16].

An alternative to PCVA for determining cause(s) of death from VA data is the use of algorithms. Algorithms can be expert-derived or data-driven. Expert algorithms include a set of pre-defined diagnostic criteria developed by a panel of physicians, based on experience or review of existing literature [2]. Alternatively, data-driven algorithms are derived from existing data using standard statistical techniques, including logistic regression, decision tree algorithms, and Bayesian classification, which identify discriminatory functions of indicators to be included in an algorithm [2]. Algorithms can be used to guide physicians as they review VA questionnaires and classify cause(s) of death; alternatively, algorithms may be computerized to automate the process [3,17]. Several algorithms based on expert opinion or derived from data have been developed, but their accuracy has been shown to vary widely, and may be lower than that of PCVA [18–22]. In addition to algorithms, probabilistic approaches have been developed [3]. Unlike algorithmic approaches, which assess the presence or absence of a single cause of death based on positive or negative responses to symptom-related questions, automated methods apply probabilistic reasoning, adjusting the probability of a range of multiple possible outcomes simultaneously [2]. Like algorithms, probabilistic methods can be expert driven or data driven [23]. Recent reports suggest that automated probabilistic approaches outperform or are equivalent to PCVA [24,25], but these results have been disputed [23,26].

Data from a VA validation study conducted in three epidemiological settings in Uganda were used to investigate the performance of different methods for determining causes of death from VA data. We evaluated the intra-rater reliability of PCVA, and also compared the accuracy of PCVA to that of two algorithms: one developed with the input of expert physicians (non-hierarchical) and another that was data-driven (hierarchical).

Materials and Methods

The VA dataset used to investigate the performance of different methods for determining causes of death was obtained from a VA validation study that was approved by the Ugandan National Council for Science and Technology, the Centers for Disease Control and Prevention, and the ethics committees of Makerere University Faculty of Medicine and the London School of Hygiene and Tropical Medicine. Details of the VA validation study methods are published elsewhere [27]. Briefly, the study was conducted from 2008–2009 in selected public hospitals located in three districts: Tororo (high malaria transmission), Kampala (medium transmission), and Kisoro (low transmission). Deaths among hospitalized children aged less than five years, including neonatal deaths, were registered over a period of one year, and VA interviews were conducted with an appropriate caretaker of each child. PCVA was used for determining cause of death following World Health Organization (WHO) standards at the time [28]. The reference standard for assessing the accuracy of PCVA was the cause of death determined by physician review of hospital medical records at each site. The sensitivity, specificity, and positive predictive value of the PCVA method, and the accuracy of its cause-specific mortality fraction (CSMF) estimates, were computed for a select group of common causes of childhood death for each site. Analysis and presentation of results were stratified by two age groups: 1) neonatal deaths (0–28 days), and 2) childhood deaths (1–59 months).

Intra-rater reliability of PCVA

Twenty percent of VA questionnaires were systematically sampled for assessment of intra-rater reliability. Using a list of sequentially ordered identification numbers for each site, we systematically selected every fifth VA questionnaire, together with the corresponding cause of death (COD) originally determined by physician review of the data. Each selected VA questionnaire was re-evaluated by the original physician. Re-determination of causes of death from VA questionnaires occurred 3–9 months after the original assessment, and physicians were blinded to the causes of death recorded in the original VA death certificate.
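The systematic selection described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the `start` offset, and the example identification numbers are assumptions, since the text does not state which questionnaire in each block of five was taken.

```python
def systematic_sample(ids, step=5, start=0):
    """Return every `step`-th ID from a sequentially ordered list.

    `start` is the zero-based position of the first selected ID
    (an assumption; the source does not specify the starting point).
    """
    ordered = sorted(ids)
    return ordered[start::step]


# Hypothetical identification numbers for one site.
questionnaire_ids = list(range(1, 21))
print(systematic_sample(questionnaire_ids, step=5, start=4))
```

With twenty sequential IDs and `start=4`, the sketch selects the 5th, 10th, 15th, and 20th questionnaires, that is, one in five.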

Development of non-hierarchical algorithms

The non-hierarchical algorithms were based on previously published expert algorithms [19,21,29–31]. Seven physicians who reviewed the original VA questionnaires were asked to review existing algorithms and develop a refined algorithm (including the criteria for diagnosis), taking into account the diagnostic criteria that they used to attribute malaria and other common childhood illnesses as cause of death when originally reviewing VA questionnaires. The non-hierarchical algorithms underwent a final round of review by a team of the investigators, including a pediatrician and three epidemiologists. Each algorithm consisted of a pre-determined set of diagnostic criteria to be applied to VA questionnaire data: specific combinations of the presence or absence of certain signs and symptoms experienced prior to death, indicating different causes of death. For neonatal deaths, non-hierarchical algorithms were developed for the following causes of death: 1) septicemia, 2) meningitis, 3) pneumonia, and 4) congenital malformation. Final non-hierarchical algorithms for childhood deaths were limited to the most common causes of death, including 1) malaria, 2) pneumonia, 3) meningitis, 4) diarrheal illnesses, 5) malnutrition, and 6) HIV/AIDS (Table 1).

Table 1. Algorithms used for determining cause(s) of death from verbal autopsy questionnaires.

Development of hierarchical algorithms

Hierarchical algorithms were developed by ranking the performance of the non-hierarchical algorithms for common causes of childhood deaths, including neonatal deaths. Ranking was based on the specificity of each cause of death as determined using the expert algorithms. The cause of death with the highest specificity was placed at the top of the hierarchy, while the least specific was placed at the bottom (Fig 1). Neonatal causes of death were ranked in the following order: (1) septicemia, (2) meningitis, (3) pneumonia, and (4) congenital malformations. Childhood causes of death were ranked as follows: (1) meningitis, (2) pneumonia, (3) malnutrition, (4) diarrhea, (5) HIV, and (6) malaria.
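One way to picture the hierarchy is as a rule that reduces the set of causes flagged by the non-hierarchical algorithms to a single cause. The sketch below assumes that the first flagged cause in rank order is assigned and that deaths matching no algorithm fall into an ‘others’ category; the function and variable names are hypothetical, and the exact decision rule used in the study may differ.

```python
# Ranking for childhood deaths as listed in the text (top of hierarchy first).
CHILD_HIERARCHY = ["meningitis", "pneumonia", "malnutrition",
                   "diarrhea", "HIV", "malaria"]


def assign_single_cause(flagged_causes, hierarchy=CHILD_HIERARCHY,
                        default="others"):
    """Pick the highest-ranked cause among those flagged for one death.

    `flagged_causes` is the collection of causes whose non-hierarchical
    algorithm criteria were met. Returning `default` when nothing is
    flagged is an assumption made for this sketch.
    """
    flagged = set(flagged_causes)
    for cause in hierarchy:
        if cause in flagged:
            return cause
    return default


# A death flagged as both malaria and pneumonia resolves to pneumonia,
# because pneumonia sits above malaria in the ranking.
print(assign_single_cause({"malaria", "pneumonia"}))
```

Because malaria sits at the bottom of the childhood ranking, it is assigned only when no higher-ranked cause is flagged, which is consistent with the low sensitivity and high specificity reported for hierarchical algorithms in malaria deaths.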

Fig 1. Ranking order for Hierarchical algorithms for childhood and neonatal deaths.

Data analysis

Intra-rater reliability of PCVA.

The cause of death determined by physicians upon repeat review of VA questionnaires was compared to the cause of death originally determined by the same physician. The percentage level of agreement and the Kappa statistic were calculated for each physician using Stata 12 (StataCorp, College Station, Texas, USA). Interpretation of Kappa values was based on the criteria of Landis and Koch [32], who recommended that a Kappa value greater than 0.8 be considered ‘almost perfect’, between 0.6 and 0.8 ‘substantial’, between 0.4 and 0.6 ‘moderate’, between 0.2 and 0.4 ‘fair’, between 0 and 0.2 ‘slight’, and between 0 and -1 ‘poor.’ Furthermore, to assess the impact of re-determination of cause of death on the CSMF attributable to malaria and other common illnesses at the population level, we compared the original CSMF (CSMF Original) to the re-determined CSMF (CSMF Repeat).
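For readers unfamiliar with the Kappa statistic, the following Python sketch computes Cohen's Kappa for two rounds of review by the same physician and maps the value to the Landis and Koch bands quoted above. It is an illustration, not the Stata routine used in the study; the function names and example data are hypothetical, and assigning boundary values (for example exactly 0.8) to the lower band is an assumption because the quoted ranges do not specify it.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's Kappa for two equal-length lists of assigned causes."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty lists of equal length")
    n = len(first)
    # Observed agreement: share of deaths given the same cause twice.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal cause frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[c] * c2[c] for c in c1) / (n * n)
    if expected == 1.0:
        return 1.0  # both reviews used a single identical category
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Label a Kappa value using the bands cited in the text."""
    if kappa > 0.8:
        return "almost perfect"
    if kappa > 0.6:
        return "substantial"
    if kappa > 0.4:
        return "moderate"
    if kappa > 0.2:
        return "fair"
    if kappa >= 0:
        return "slight"
    return "poor"


# Hypothetical original and repeat reviews of four questionnaires.
original = ["malaria", "malaria", "pneumonia", "pneumonia"]
repeat = ["malaria", "pneumonia", "pneumonia", "pneumonia"]
k = cohen_kappa(original, repeat)
print(k, landis_koch(k))
```

In this toy example, observed agreement is 0.75 and chance agreement is 0.5, giving a Kappa of 0.5 (‘moderate’).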

Validation of algorithms

A database comprising responses to the closed-ended sections of VA questionnaires and the reference causes of death derived from medical records was generated. Causes of death determined by non-hierarchical algorithms were derived by applying the non-hierarchical algorithms to the closed-ended sections of VA questionnaires. Non-hierarchical algorithms were capable of classifying more than one cause of death. Hierarchical algorithms were also applied to the same VA questionnaire database, generating a single cause of death for each questionnaire.

The sensitivity and specificity of each method for determining cause of death were calculated by comparing the cause of death assigned by each method to the ‘reference standard’ for causes of death derived from hospital medical records, including malaria, pneumonia, diarrhea, meningitis, malnutrition, and HIV. CSMF estimates of the leading causes of death were also calculated for PCVA (CSMFPCVA), non-hierarchical algorithms (CSMFNHA) and hierarchical algorithms (CSMFHA). The difference between the CSMF determined using each of the three methods and the ‘reference standard’ (CSMFMR) was calculated for the common causes of death. For neonatal and childhood deaths, where algorithms were developed for the four and six commonest causes of death respectively, causes of death that did not fit the commonest cause of death list were categorized as ‘others’ and were factored into all analyses.
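The calculations described above can be sketched as follows. This is a minimal illustration with hypothetical names and data. It assumes that each death carries a set of assigned causes (so that multiple causes per death, as produced by non-hierarchical algorithms, are allowed) and that the CSMF for a cause is the fraction of all deaths to which that cause is assigned; the study's exact handling of multiple causes is not stated here.

```python
import math


def evaluate_cause(cause, assigned, reference):
    """Compare one method against the reference standard for one cause.

    `assigned` and `reference` are equal-length lists with one set of
    causes per death (method output and medical-record review).
    """
    if len(assigned) != len(reference) or not reference:
        raise ValueError("need two non-empty lists of equal length")
    tp = fp = fn = tn = 0
    for method_causes, true_causes in zip(assigned, reference):
        predicted = cause in method_causes
        actual = cause in true_causes
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    n = len(reference)
    csmf_method = (tp + fp) / n       # fraction assigned by the method
    csmf_reference = (tp + fn) / n    # fraction in the reference standard
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else math.nan,
        "specificity": tn / (tn + fp) if tn + fp else math.nan,
        "csmf_method": csmf_method,
        "csmf_reference": csmf_reference,
        "csmf_difference": csmf_method - csmf_reference,
    }


# Hypothetical data: four deaths, with a method that over-assigns malaria.
reference = [{"malaria"}, {"malaria"}, {"pneumonia"}, {"diarrhea"}]
assigned = [{"malaria"}, {"malaria", "pneumonia"},
            {"malaria", "pneumonia"}, {"diarrhea"}]
print(evaluate_cause("malaria", assigned, reference))
```

In this toy example the method detects every malaria death (sensitivity 1.0) but also labels one of the two non-malaria deaths as malaria (specificity 0.5), so its malaria CSMF of 0.75 exceeds the reference CSMF of 0.5 by 25 percentage points.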


Intra-rater reliability of PCVA

A total of 149 VA questionnaires were selected for re-determining cause of death by four physician reviewers, each of whom reviewed a different number of VA questionnaires (Fig 2). The performance of individual physicians varied: intra-rater reliability was almost perfect for physician reviewer ‘2’ (Kappa statistic = 0.87), substantial for physician reviewers ‘1’ and ‘3’ (Kappa statistic = 0.77 each), and moderate for physician reviewer ‘4’ (Kappa statistic = 0.52). Overall, the level of agreement was substantial (Kappa statistic = 0.79) (Table 2). The repeat estimates of CSMF for the different causes of death did not differ substantially (< 10%) when compared to the original CSMF estimated by the same reviewer (Table 3).

Fig 2. Trial profile: selection of VA questionnaires for re-assigning cause of death by physician reviewers.

Table 3. Level of agreement in CSMF upon repeat determination of cause of death by physician reviewers.

Accuracy of PCVA, non-hierarchical algorithms and hierarchical algorithms for neonatal deaths

A total of 104 questionnaires representing neonatal deaths were evaluated using algorithms (Fig 3). Based on PCVA, common causes of death among neonates included septicemia (29%), meningitis (38%), pneumonia (8%), and congenital malformations (6%). Sensitivity of PCVA, non-hierarchical algorithms, and hierarchical algorithms was generally low (<50%) for the four major causes of neonatal deaths, with the exception of the sensitivity of non-hierarchical algorithms (76%) for septicemia deaths, and of PCVA (61%) for meningitis deaths. For congenital malformation, pneumonia, and septicemia deaths, specificity of PCVA was high (97%, 93%, and 78% respectively), and comparable to that of hierarchical algorithms (94%, 88%, and 52% respectively). With the exception of meningitis deaths, where the specificity of non-hierarchical algorithms (79%) was high, the specificity of non-hierarchical algorithms for the other causes of neonatal deaths was very low (<20%) (Table 4).

Fig 3. Trial profile: selection of VA questionnaires to be assigned cause of death using algorithms.

Table 4. Sensitivity and specificity of different methods of determining cause of death from VA questionnaires.

CSMF estimates for congenital malformation and pneumonia deaths were accurate and comparable for PCVA (0% and -3% difference respectively), non-hierarchical algorithms (1% and 2% difference respectively), and hierarchical algorithms (1% and 2% difference respectively). Non-hierarchical algorithms (50% difference) and hierarchical algorithms (16% difference) overestimated the CSMF for septicemia deaths, whereas PCVA (-3% difference) performed best. In contrast, non-hierarchical algorithms (5% difference) and hierarchical algorithms (-4% difference) had better CSMF estimates for meningitis deaths than PCVA (-16% difference; Table 5).

Table 5. CSMF and level of agreement of different methods of determining cause of death from VA questionnaires.

Accuracy of PCVA, non-hierarchical algorithms and hierarchical algorithms for causes of childhood deaths

A total of 615 questionnaires representing childhood deaths were evaluated using algorithms (Fig 3). The accuracy of PCVA, non-hierarchical algorithms and hierarchical algorithms ranged widely depending on the cause of death and the site (Table 4). For malaria deaths, the sensitivity of non-hierarchical algorithms (84%) was higher than that of PCVA (61%) and hierarchical algorithms (16%). This pattern was consistent in Kampala and Tororo. In contrast, the specificity of non-hierarchical algorithms for determining malaria deaths was low in Kampala (34%) and Tororo (39%), and much lower than the specificity of PCVA (84–88%) and hierarchical algorithms (93–94%) in determining malaria deaths (Table 4). Sensitivity and specificity of all methods for determining diarrheal deaths followed a pattern similar to that observed in determining malaria deaths. Sensitivity and specificity of non-hierarchical algorithms in determining pneumonia and meningitis deaths were comparable to those of hierarchical algorithms but lower than those of PCVA at all sites (Table 4).

CSMF estimates of non-hierarchical algorithms (CSMFNHA) deviated greatly from the reference standard (CSMFMR; difference > 10%), with a tendency to overestimate the CSMF for the leading causes of death across all sites. The CSMF estimated by PCVA (CSMFPCVA) and the hierarchical algorithms (CSMFHA) approximated that of the reference standard (CSMFMR) for all cause(s) of death, performing far better than non-hierarchical algorithms. However, overall CSMF estimates of malaria deaths were best approximated by hierarchical algorithms (0% difference), exceeding the performance of both PCVA (6% difference) and non-hierarchical algorithms (56% difference), which both overestimated the fraction of deaths attributable to malaria when compared to the reference standard (Table 5). This pattern was consistent across all sites with the exception of Tororo, where PCVA was more accurate.


To investigate the performance of different methods for determining causes of death from previously collected VA data, we evaluated the intra-rater reliability of PCVA, and compared the accuracy of PCVA and two algorithms, using physician review of hospital medical records as a reference standard. Contrary to prior reports, our findings suggest that the intra-rater reliability for classifying cause of death using PCVA is high [7,33]. Reliability of 3 out of 4 physicians was classified as ‘substantial’, and repeat CSMF estimates for common causes of death were similar to the original estimates. One physician’s score was sub-optimal, possibly due to the low number of records reviewed by that physician. Regardless, the overall performance was good, with a Kappa score indicating ‘substantial’ agreement between reviews. The physicians’ prior knowledge of local epidemiology likely contributed to the good performance by three physicians [2]. Although prior knowledge and subjective application of clinical judgment may be considered as ‘biases’, they are likely to have had a positive impact on the physicians’ ability to correctly identify cause of death [34]. However, the subjectivity of the PCVA method may limit the ability to make temporal and spatial comparisons of mortality data. Standardized training of physician reviewers addresses this concern to an extent [11].

Although use of algorithms has been advocated to overcome the issue of subjectivity, the accuracy of algorithms remains a concern [4]. For neonatal deaths, sensitivity of PCVA, non-hierarchical algorithms, and hierarchical algorithms was low (<50%) for all the causes of neonatal deaths, with the exception of meningitis with PCVA (61%). In contrast, specificity was high for PCVA and hierarchical algorithms compared to non-hierarchical algorithms, although it was relatively low for meningitis with PCVA (68%) and for septicemia with hierarchical algorithms (52%). In terms of estimating CSMF, all three methods were relatively accurate, with the exception of non-hierarchical algorithms and hierarchical algorithms, which overestimated the CSMF for septicemia deaths, a fact probably attributable to the low specificity of both algorithms in determining septicemia deaths.

For childhood deaths, compared to PCVA, sensitivity of non-hierarchical algorithms was impressive, particularly for classification of malaria, diarrheal and malnutrition deaths. However, sensitivity was gained at the expense of specificity. This imbalance between sensitivity and specificity undermined the performance of the non-hierarchical algorithms when estimating CSMF for common causes of death resulting in gross overestimation of the CSMF for respective causes of death. Importantly, we note that the degree of error in estimating the CSMF was inversely proportional to the specificity level attained, implying that error in estimating CSMF reduced as specificity increased. With exception of septicemia deaths, this phenomenon was not observed with neonatal deaths. Overlap of signs and symptoms of common illnesses used to develop diagnostic criteria for these diseases could have limited the ability of the algorithms to distinguish between illnesses resulting in assignment of multiple cause(s) of death and a marked decline in specificity.

Hierarchical algorithms assigning a single cause of death from each VA questionnaire resulted in an increase in specificity of the algorithm in determining causes of death, but at the expense of sensitivity, which declined. However, compared to the non-hierarchical algorithms, hierarchical algorithm estimates of the reference CSMF were accurate and as good as those of PCVA for all the common causes of death, a fact attributed to the high specificity levels of hierarchical algorithms. This finding, previously described by Anker et al [35], demonstrated that specificity is an important driver of the accuracy of CSMF estimates determined by these methods. However, superiority was apparent only when the reference CSMF level was low (~ < 10%) for a particular disease [35]. In Tororo and Kisoro, the reference CSMF levels for malaria and pneumonia deaths were very high and hierarchical algorithms, despite low specificity, greatly underestimated the CSMF attributable to malaria and pneumonia deaths at these sites, suggesting that benefits of increased specificity in estimating the CSMF are only applicable when the true CSMF is low. Indeed, this may explain why non-hierarchical algorithms and hierarchical algorithms overestimated septicemia deaths among neonates. The primary limitation of both algorithms is their inflexibility. Unlike physicians, algorithms lack ‘clinical acumen’ and are not capable of interpreting the potential contribution of multiple disease processes ultimately leading to death. This limitation of algorithms is well-recognized, and has been cited as the primary disadvantage of algorithms and other automated methods for determining cause(s) of death from VA data [4].
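The dependence of CSMF accuracy on specificity and on the true CSMF, discussed above, can be made explicit with a standard misclassification identity. The notation below is introduced here for illustration and is not taken from the study: p is the true CSMF for a cause, Se and Sp are the sensitivity and specificity of a method for that cause, and \hat{p} is the CSMF the method would report, assuming one yes/no classification per death for that cause.

```latex
\hat{p} = Se \cdot p + (1 - Sp)(1 - p)
\qquad\Longrightarrow\qquad
\hat{p} - p = \underbrace{(1 - Sp)(1 - p)}_{\text{false positives}}
            - \underbrace{(1 - Se)\,p}_{\text{missed deaths}}
```

When p is small, the false-positive term is weighted by (1 - p), which is close to 1, so specificity dominates the error. When p is large, the missed-deaths term dominates and a method with low sensitivity underestimates the CSMF even if its specificity is high. As a purely hypothetical example, with Se = 0.50 and Sp = 0.95 the error is 0.05 × 0.90 − 0.50 × 0.10 = −0.005 at p = 0.10, but 0.05 × 0.50 − 0.50 × 0.50 = −0.225 at p = 0.50.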

Several computerized methods have been developed as alternative methods of determining cause(s) of death from VA questionnaires, based either on algorithmic approaches (expert driven or data driven; Tariff, Artificial Neural Network, and Random Forest) or on probabilistic approaches (expert driven: InterVA; data driven: King-Lu and Simplified Symptom Pattern) [23,28,30,33,36–38]. The dataset used to validate the Tariff, Random Forest, King-Lu and Simplified Symptom Pattern methods comprised a randomly selected number of gold standard hospital deaths that formed part of a larger multi-country verbal autopsy validation study [39]. In these validation studies, all three methods were more accurate than PCVA for most of the causes of death [36,37,40]. However, these results have been disputed, with a systematic review of 19 studies finding that no single VA method outperformed the others across selected CODs for both individual and population-level COD assignment [23].

InterVA uses a probability matrix, which was derived from the clinical knowledge of a group of physicians [41], and, in addition to the Tariff method, has been recommended by the World Health Organization in their 2012 VA guidelines as one of the preferred methods for determining cause(s) of death [42]. However, two studies validating the performance of InterVA compared to PCVA against a gold standard based on rigorously defined clinical criteria yielded conflicting results. One study, conducted in Kilifi on the coast of Kenya, showed that InterVA performed as well as PCVA in determining the top five underlying causes of death in a rural community; the other, a multisite validation study, showed that InterVA performance was suboptimal compared to PCVA [5,43]. Although InterVA has been widely implemented [44–47], inconsistent reports of the performance of this method, as well as alternative CCVA approaches, should not be overlooked. Until CCVA methods are improved and evaluated, consistently yielding more accurate results than PCVA, it is likely that PCVA will continue to be used widely to determine causes of death from verbal autopsy questionnaires [23].

Our study is not without limitations. Internal evaluation of the performance of the hierarchical algorithms may have biased results in favor of the hierarchical algorithms. However, the results of our analysis are strengthened by the inclusion of three different study sites. Furthermore, the small number of deaths from some of the causes of death in both neonates and children, especially when stratified by site, may have undermined our ability to obtain representative estimates of measures of performance.


Our study provides insights into the performance of different methods for determining cause(s) of death from VA questionnaire data collected in three sites. Importantly, we demonstrate that repeatability of PCVA is high, contrary to expectation, and that overall PCVA performed well. Thus, based on our results and available evidence so far, PCVA remains a reliable method for determining cause of death from VA questionnaire data. Given the lack of consensus on the accuracy of recently developed CCVA methods, PCVA still has a place in determining cause of death in VA, while existing and newer automated data-driven algorithms, which undoubtedly would be more efficient, are further developed, refined, and evaluated.


This research was made possible through support provided by the President’s Malaria Initiative via the Office of Health, Infectious Diseases, and Nutrition, Bureau for Global Health, U.S. Agency for International Development, under the terms of an Interagency Agreement with the Centers for Disease Control and Prevention. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The opinions expressed herein are those of the author(s) and do not necessarily reflect the views of the Centers for Disease Control and Prevention or the U.S. Agency for International Development. The authors would also like to thank the clinical study team of Claire Katabazi, Jonathan Musinguzi, Steven Kyaligonza, Grace Nyabolo, Richard Male, Dickens Atwongire, Gladys Mbabazi, Francis Masereka, and Deus Bareke. We would also like to thank all the health workers in Mulago Hospital, Tororo Hospital, St Anthony’s Hospital, Kisoro Hospital, and St Francis Hospital for their efforts in improving the quality of medical records at the sites. We are indebted to the administrative support of Catherine Tugaineyo, Richard Oluga, Nicholas Wandera and the driver Tema Kizito, and to the data management team of Geoff Lavoy, Jacob Odeke, Dickens Mugwanya and David Masiga. Finally, we are grateful to the parents, guardians, and caretakers who agreed to take part in this study.

Author Contributions

Conceived and designed the experiments: AM SF LQ DC SS. Performed the experiments: AM DC SS. Analyzed the data: AM SS. Contributed reagents/materials/analysis tools: AM DC SS. Wrote the paper: AM SF AK LQ DC SS.


  1. 1. Murray CJ, Lopez AD, Shibuya K, Lozano R. Verbal autopsy: advancing science, facilitating application. Popul Health Metr. 2011;9:18. pmid:21794169
  2. 2. Fottrell E, Byass P. Verbal autopsy: methods in transition. EpidemiolRev.32(1):38–55.
  3. 3. Leitao J, Chandramohan D, Byass P, Jakob R, Bundhamcharoen K, Choprapawon C, et al. Revising the WHO verbal autopsy instrument to facilitate routine cause-of-death monitoring. Glob Health Action. 2013;6:21518. pmid:24041439
  4. 4. Soleman N, Chandramohan D, Shibuya K. Verbal autopsy: current practices and challenges. Bull World Health Organ. 2006;84(3):239–45. pmid:16583084
  5. 5. Lozano R, Freeman MK, James SL, Campbell B, Lopez AD, Flaxman AD, et al. Performance ofInterVA for assigning causes of death to verbal autopsies: multisite validation study using clinical diagnostic gold standards. Popul Health Metr. 2011;9:50. pmid:21819580
  6. 6. Setel PW, Whiting DR, Hemed Y, Chandramohan D, Wolfson LJ, Alberti KG, et al. Validity of verbal autopsy procedures for determining cause of death in Tanzania.Trop Med Int Health. 2006;11(5):681–96. pmid:16640621
  7. 7. Todd JE, De Francisco A, O'Dempsey TJ, Greenwood BM.The limitations of verbal autopsy in a malaria-endemic region. Ann Trop Paediatr. 1994;14(1):31–6. pmid:7516132
  8. 8. Byass P. Patterns of mortality in Bavi, Vietnam, 1999–2001. Scand J Public Health Suppl. 2003;62:8–11. pmid:14578074
  9. 9. Edmond KM, Quigley MA, Zandoh C, Danso S, Hurt C, OwusuAgyei S, et al. Diagnostic accuracy of verbal autopsies in ascertaining the causes of stillbirths and neonatal deaths in rural Ghana. PaediatrPerinatEpidemiol. 2008;22(5):417–29.
  10. 10. Fantahun M, Fottrell E, Berhane Y, Wall S, Hogberg U, Byass P. Assessing a new approach to verbal autopsy interpretation in a rural Ethiopian community: the InterVA model. Bull World Health Organ. 2006;84(3):204–10. pmid:16583079
  11. 11. Joshi R, Lopez AD, MacMahon S, Reddy S, Dandona R, Dandona L, et al. Verbal autopsy coding: are multiple coders better than one? Bull World Health Organ. 2009;87(1):51–7. pmid:19197404
  12. 12. Montgomery AL, Morris SK, Bassani DG, Kumar R, Jotkar R, Jha P. Factors associated with physician agreement and coding choices of cause of death using verbal autopsies for 1130 maternal deaths in India. PLoS One. 2012;7(3):e33075. pmid:22470436
  13. 13. Morris SK, Bassani DG, Kumar R, Awasthi S, Paul VK, Jha P. Factors associated with physician agreement on verbal autopsy of over 27000 childhood deaths in India. PLoSOne.5(3):e9583. pmid:20221398
  14. 14. Weldearegawi B, Ashebir Y, Gebeye E, Gebregziabiher T, Yohannes M, Mussa S, et al. Emerging chronic non-communicable diseases in rural communities of Northern Ethiopia: evidence using population-based verbal autopsy method in KiliteAwlaelo surveillance site. Health Policy Plan. 2013.
  15. 15. Ye M, Diboulo E, Niamba L, Sie A, Coulibaly B, Bagagnan C, et al. An improved method for physician-certified verbal autopsy reduces the rate of discrepancy: experiences in the Nouna Health and Demographic Surveillance Site (NHDSS), Burkina Faso. Popul Health Metr. 2011;9:34. pmid:21816102
  16. 16. Khademi H, Etemadi A, Kamangar F, Nouraie M, Shakeri R, Abaie B, et al. Verbal autopsy: reliability and validity estimates for causes of death in the Golestan Cohort Study in Iran. PLoSOne. 2010;5(6):e11183.
  17. 17. Chandramohan D, Maude GH, Rodrigues LC, Hayes RJ. Verbal autopsies for adult deaths: their development and validation in a multicentre study. Trop Med Int Health. 1998;3(6):436–46. pmid:9657505
  18. 18. Chandramohan D, Maude GH, Rodrigues LC, Hayes RJ. Verbal autopsies for adult deaths: issues in their development and validation. Int J Epidemiol. 1994;23(2):213–22. pmid:8082945
  19. 19. Freeman JV, Christian P, Khatry SK, Adhikari RK, LeClerq SC, Katz J, et al. Evaluation of neonatal verbal autopsy using physician review versus algorithm-based cause-of-death assignment in rural Nepal. Paediatr Perinat Epidemiol. 2005;19(4):323–31.
  20. 20. Quigley MA, Armstrong Schellenberg JR, Snow RW. Algorithms for verbal autopsies: a validation study in Kenyan children. Bull World Health Organ. 1996;74(2):147–54. pmid:8706229
  21. 21. Quigley MA, Chandramohan D, Rodrigues LC. Diagnostic accuracy of physician review, expert algorithms and data-derived algorithms in adult verbal autopsies. Int J Epidemiol. 1999;28(6):1081–7. pmid:10661651
  22. 22. Quigley MA, Chandramohan D, Setel P, Binka F, Rodrigues LC. Validity of data-derived algorithms for ascertaining causes of adult death in two African sites using verbal autopsy. Trop Med Int Health. 2000;5(1):33–9. pmid:10672203
  23. 23. Leitao J, Desai N, Aleksandrowicz L, Byass P, Miasnikof P, Tollman S, et al. Comparison of physician-certified verbal autopsy with computer-coded verbal autopsy for cause of death assignment in hospitalized patients in low- and middle-income countries: systematic review. BMC Med. 2014;12:22. pmid:24495312
  24. 24. Murray CJ, Lozano R, Flaxman AD, Serina P, Phillips D, Stewart A, et al. Using verbal autopsy to measure causes of death: the comparative performance of existing methods. BMC Med. 2014;12:5. pmid:24405531
  25. 25. Byass P, Herbst K, Fottrell E, Ali MM, Odhiambo F, Amek N, et al. Comparing verbal autopsy cause of death findings as determined by physician coding and probabilistic modelling: a public health analysis of 54 000 deaths in Africa and Asia. J Glob Health. 2015;5(1):010402. pmid:25734004
  26. 26. Desai N, Aleksandrowicz L, Miasnikof P, Lu Y, Leitao J, Byass P, et al. Performance of four computer-coded verbal autopsy methods for cause of death assignment compared with physician coding on 24,000 deaths in low- and middle-income countries. BMC Med. 2014;12:20. pmid:24495855
  27. 27. Mpimbaza A, Filler S, Katureebe A, Kinara SO, Nzabandora E, Quick L, et al. Validity of verbal autopsy procedures for determining malaria deaths in different epidemiological settings in Uganda. PLoS One. 2011;6(10):e26892. pmid:22046397
  28. 28. World Health Organization. Verbal autopsy standards: ascertaining and attributing cause of death. 2007. Available from:
  29. 29. Lee AC, Mullany LC, Tielsch JM, Katz J, Khatry SK, LeClerq SC, et al. Verbal autopsy methods to ascertain birth asphyxia deaths in a community-based setting in southern Nepal. Pediatrics. 2008;121(5):e1372–80. pmid:18450880
  30. 30. Baqui AH, Darmstadt GL, Williams EK, Kumar V, Kiran TU, Panwar D, et al. Rates, timing and causes of neonatal deaths in rural India: implications for neonatal health programmes. Bull World Health Organ. 2006;84(9):706–13. pmid:17128340
  31. 31. Lopman BA, Barnabas RV, Boerma JT, Chawira G, Gaitskell K, Harrop T, et al. Creating and validating an algorithm to measure AIDS mortality in the adult population using verbal autopsy. PLoS Med. 2006;3(8):e312. pmid:16881730
  32. 32. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–74. pmid:843571
  33. 33. Boulle A, Chandramohan D, Weller P. A case study of using artificial neural networks for classifying cause of death from verbal autopsy. Int J Epidemiol. 2001;30(3):515–20. pmid:11416074
  34. 34. Butler D. Verbal autopsy methods questioned. Nature. 2010;467(7319):1015. pmid:20981062
  35. 35. Anker M. The effect of misclassification error on reported cause-specific mortality fractions from verbal autopsy. Int J Epidemiol. 1997;26(5):1090–6. pmid:9363532
  36. 36. Murray CJ, Lopez AD, Feehan DM, Peter ST, Yang G. Validation of the symptom pattern method for analyzing verbal autopsy data. PLoS Med. 2007;4(11):e327. pmid:18031196
  37. 37. Flaxman AD, Vahdatpour A, Green S, James SL, Murray CJ. Random forests for verbal autopsy analysis: multisite validation study using clinical diagnostic gold standards. Popul Health Metr. 2011;9:29. pmid:21816105
  38. 38. Byass P, Chandramohan D, Clark SJ, D'Ambruoso L, Fottrell E, Graham WJ, et al. Strengthening standardised interpretation of verbal autopsy data: the new InterVA-4 tool. Glob Health Action. 2012;5:1–8. pmid:23331992
  39. 39. Murray CJ, Lopez AD, Black R, Ahuja R, Ali SM, Baqui A, et al. Population Health Metrics Research Consortium gold standard verbal autopsy validation study: design, implementation, and development of analysis datasets. Popul Health Metr. 2011;9:27. pmid:21816095
  40. 40. James SL, Flaxman AD, Murray CJ. Performance of the Tariff Method: validation of a simple additive algorithm for analysis of verbal autopsies. Popul Health Metr. 2011;9:31. pmid:21816107
  41. 41. Byass P, Fottrell E, Dao LH, Berhane Y, Corrah T, Kahn K, et al. Refining a probabilistic model for interpreting verbal autopsy data. Scand J Public Health. 2006;34(1):26–31. pmid:16449041
  42. 42. World Health Organization. Verbal Autopsy Standards: Verbal Autopsy Instrument. 2012. Available from:
  43. 43. Bauni E, Ndila C, Mochamah G, Nyutu G, Matata L, Ondieki C, et al. Validating physician-certified verbal autopsy and probabilistic modeling (InterVA) approaches to verbal autopsy interpretation using hospital causes of adult deaths. Popul Health Metr. 2011;9:49. pmid:21819603
  44. 44. Ndila C, Bauni E, Mochamah G, Nyirongo V, Makazi A, Kosgei P, et al. Causes of death among persons of all ages within the Kilifi Health and Demographic Surveillance System, Kenya, determined from verbal autopsies interpreted using the InterVA-4 model. Glob Health Action. 2014;7:25593. pmid:25377342
  45. 45. Amek NO, Odhiambo FO, Khagayi S, Moige H, Orwa G, Hamel MJ, et al. Childhood cause-specific mortality in rural Western Kenya: application of the InterVA-4 model. Glob Health Action. 2014;7:25581. pmid:25377340
  46. 46. Rai SK, Kant S, Misra P, Srivastava R, Pandav CS. Cause of death during 2009–2012, using a probabilistic model (InterVA-4): an experience from Ballabgarh Health and Demographic Surveillance System in India. Glob Health Action. 2014;7:25573. pmid:25377339
  47. 47. Weldearegawi B, Melaku YA, Spigt M, Dinant GJ. Applying the InterVA-4 model to determine causes of death in rural Ethiopia. Glob Health Action. 2014;7:25550. pmid:25377338