Measuring mortality due to HIV-associated tuberculosis among adults in South Africa: Comparing verbal autopsy, minimally-invasive autopsy, and research data

Background
The World Health Organization (WHO) aims to reduce tuberculosis (TB) deaths by 95% by 2035; tracking progress requires accurate measurement of TB mortality. International Classification of Diseases (ICD) codes do not differentiate between HIV-associated TB and HIV more generally. Verbal autopsy (VA) is used to estimate cause of death (CoD) patterns but has mostly been validated against a suboptimal gold standard for HIV and TB. This study, conducted among HIV-positive adults, aimed to estimate the accuracy of VA in ascertaining TB and HIV CoD when compared to a reference standard derived from a variety of clinical sources including, in some, minimally-invasive autopsy (MIA).
Methods and findings
Decedents were enrolled into a trial of empirical TB treatment or a cohort exploring diagnostic algorithms for TB in South Africa. The WHO 2012 instrument was used; VA CoD were assigned using physician-certified VA (PCVA), InterVA-4, and SmartVA-Analyze. Reference CoD were assigned using MIA, research, and health facility data, as available. 259 VAs were completed: 147 (57%) decedents were female; median age was 39 (interquartile range [IQR] 33–47) years and CD4 count 51 (IQR 22–102) cells/μL. Compared to reference CoD that included MIA (n = 34), VA underestimated mortality due to HIV/AIDS (94% reference, 74% PCVA, 47% InterVA-4, and 41% SmartVA-Analyze; chance-corrected concordance [CCC] 0.71, 0.42, and 0.31, respectively) and HIV-associated TB (41% reference, 32% PCVA; CCC 0.23). For individual decedents, all VA methods agreed poorly with reference CoD that did not include MIA (n = 259; overall CCC 0.14, 0.06, and 0.15 for PCVA, InterVA-4, and SmartVA-Analyze); agreement was better at population level (cause-specific mortality fraction accuracy 0.78, 0.61, and 0.57, for the three methods, respectively).
Conclusions
Current VA methods underestimate mortality due to HIV-associated TB. ICD and VA methods need modifications that allow for more specific evaluation of HIV-related deaths and direct estimation of mortality due to HIV-associated TB.

Introduction
hospital or clinic records, a gold standard of variable quality and consistency [21][22][23][24]. There have been few attempts to compare VA CoD to CoD derived from research-quality data or from pathological autopsy, which remains the highest standard for assigning CoD.
More accurate estimates of HIV-associated mortality are particularly needed in areas of high HIV prevalence, where civil registration systems are often weak [14]. This study, nested within two large studies of HIV-positive adults in South Africa ('TB Fast Track' [25] and 'XPHACTOR'), aimed to compare VA CoD to reference-standard CoD derived from clinical, research, and minimally-invasive autopsy (MIA) data.

Setting
HIV-positive adults (aged ≥18 years) were recruited to one of two studies of TB/HIV conducted in South Africa. The first, 'TB Fast Track', was a pragmatic trial of empirical TB treatment in ambulant HIV-positive adults enrolled in primary care with CD4 count ≤150 cells/μL. Participants were eligible if not on antiretroviral therapy (ART) or TB treatment at the point of enrolment and were followed up for a minimum of six months [25]; MIA and VAs were conducted for as many decedents as possible. The second, 'XPHACTOR', was an interventional cohort study investigating the use of Xpert® MTB/RIF in a systematic sample of HIV-positive adults attending out-patient clinics for HIV care; participants were followed up for at least three months and VAs were conducted for as many decedents as possible, beginning about half-way through the study's follow-up.

Data collection
Verbal autopsy. All VA interviews were conducted by trained lay researchers at the home of the family/carers, or at another location of their choosing, one to twelve months after the death of the study participant, as recommended by WHO. Written informed consent for VA was obtained from respondents; all interviews were conducted using the WHO 2012 VA instrument, with additional questions around treatment for TB and HIV, health beliefs, and health service use added by the study team.
Clinical. Data used to inform reference-standard CoD were separated into three categories: those collected from routine clinic and/or hospital records by a trained lay researcher, research nurse, or physician using standardised paper forms, labelled operational; those collected by members of a clinical research team due to involvement in a parent study, including results of investigations retrieved from the national health laboratory service (NHLS) database, labelled research; and those collected through MIA, carried out on TB Fast Track decedents as soon as possible after death, labelled autopsy (S1 Fig). Detailed MIA methods and a description of the consent process have previously been described [26]. The procedure involved core biopsies of liver, spleen, and lungs; aspiration of cerebrospinal fluid (CSF), blood, and urine; sampling of naso- and oro-pharyngeal secretions; and broncho-alveolar lavage (BAL) by insertion of a nasogastric tube into the trachea directed toward the lungs through a cricothyroid incision. Laboratory testing of post-mortem specimens included liquid culture for mycobacteria; Xpert® MTB/RIF; microscopy and aerobic culture; molecular testing for a range of bacteria and viruses; and histological examination.
health service use) were interpreted using both physician-certified verbal autopsy (PCVA) and computer-coded verbal autopsy (CCVA) methods. Using a PCVA method based on WHO recommendations, similar to that used at the Medical Research Council/Wits-Agincourt HDSS site, South Africa [21], two physicians, blinded to all other clinical information, independently reviewed all available VA data, including the free narratives, and separately assigned CoD using ICD-10 and study-defined codes (detailed below). Assigned CoD were compared and, where there were discrepancies in either immediate, underlying, or study-defined CoD, the cases were discussed by the two physicians, aiming for consensus.
If a consensus could not be reached, the data were provided to a third physician who reviewed them independently. If the CoD assigned by physician 3 matched that assigned by physicians 1 or 2, it was considered the final CoD; if no consensus was reached after review by three physicians, the individual was assigned an 'indeterminate' CoD.
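The review steps described above can be summarised as a simple decision rule. The sketch below is illustrative only and is not the study's own tooling: the function name, its arguments, and the cause labels are hypothetical, and it assumes each reviewer's assignment has been reduced to a single code per decedent, whereas the study compared immediate, underlying, and study-defined CoD.

```python
from typing import Optional

INDETERMINATE = "indeterminate"


def final_pcva_cod(p1: str, p2: str,
                   consensus: Optional[str] = None,
                   p3: Optional[str] = None) -> str:
    """Return a final CoD for one decedent (hypothetical helper).

    p1, p2    -- codes assigned independently by physicians 1 and 2
    consensus -- code agreed after discussion, or None if none was agreed
    p3        -- code assigned independently by physician 3, if consulted
    """
    # Step 1: the two independent reviews already agree.
    if p1 == p2:
        return p1
    # Step 2: the discrepancy was discussed and consensus was reached.
    if consensus is not None:
        return consensus
    # Step 3: the third review is final only if it matches physician 1 or 2.
    if p3 is not None and p3 in (p1, p2):
        return p3
    # No agreement after three reviews.
    return INDETERMINATE
```

For example, `final_pcva_cod("TB", "HIV/AIDS", p3="TB")` resolves to "TB", while a third review that matches neither earlier assignment yields an indeterminate CoD.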
VA data were also processed by two CCVA methods. The first, InterVA-4 (www.interva.net), uses Bayesian probabilities to assign each decedent up to three CoD, each with an associated 'likelihood' expressed as a percentage probability; cause-specific mortality fractions (CSMFs) can be generated that combine all individual causes and likelihoods [27,28]. The model allows for user modification of two baseline variables: prevalence of malaria and HIV, each of which can be set to 'very low', 'low', or 'high'. The second method, SmartVA-Analyze (http://www.healthdata.org/verbal-autopsy/tools), uses the Tariff 2.0 system to assign each decedent one of 34 CoD [29,30]. SmartVA-Analyze also allows for the inclusion of data from the narrative section of the VA instrument and from healthcare records examined during the interview. Data were mapped from the WHO 2012 instrument to the 2014 framework for InterVA-4 and to the Population Health Metrics Research Consortium (PHMRC) full instrument for SmartVA-Analyze; free narrative and healthcare data were not provided to either software. Both CCVA methods assign CoD to lists of grouped ICD-10 codes [31,32]; these were further grouped into seven major categories for analysis (S1 Table).
Clinical. In order to meaningfully compare results with other VA validation studies, at least two sets of reference CoD were assigned to each participant. Operational data were used to generate level one (L1) CoD, comparable to the gold standard used in the majority of VA validation studies (S1 Fig). Both operational and research data were used to generate level two (L2) CoD for all decedents, representing a higher gold standard than would normally be available for comparison to VA. Finally, operational, research, and autopsy data were used to generate level three (L3) CoD for decedents for whom these data were available. All parties involved in the assignment of reference CoD were blinded to VA data, parent study arm, and any narratives around death obtained from family members as part of the research process, which were considered too similar to VA. L1 and L2 CoD were assigned using the same method as PCVA but involved different physicians; cases were processed in batches of 40-50. For each batch, all decedents were required to have finalised L1 CoD before the physicians were exposed to any higher level data.
L3 CoD were assigned in a different manner. Once all decedents with autopsy data had been assigned L1 and L2 CoD, a panel was convened to assign L3 CoD. The panel was made up of two infectious disease physicians, a microbiologist, and a pathologist, all of whom had extensive knowledge of local epidemiology; it was blinded to VA data, TB Fast Track study arm, and narratives from family members. The panel reviewed all available clinical data and attempted to reach consensus on CoD. In cases where full consensus could not be reached, consensus among three panel members was considered sufficient; if opinion was evenly split, the decedent was assigned an 'indeterminate' CoD.
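The panel's decision rule can be expressed compactly. The following is a minimal sketch, assuming each of the four panel members' opinions is recorded as a single cause label; the function name and labels are hypothetical. The text specifies the outcome for full consensus, for agreement among three members, and for an even split; treating every other pattern (for example, a 2-1-1 split) as indeterminate is an assumption made here.

```python
from collections import Counter


def panel_l3_cod(opinions):
    """Return an L3 CoD from four panel opinions (hypothetical helper)."""
    if len(opinions) != 4:
        raise ValueError("expected one opinion from each of four panel members")
    # Find the most frequently named cause and how many members named it.
    cause, n_agree = Counter(opinions).most_common(1)[0]
    # Full consensus (4) or agreement among three members is sufficient.
    if n_agree >= 3:
        return cause
    # Even split, or any other pattern (an assumption), is indeterminate.
    return "indeterminate"
```

With this rule, `panel_l3_cod(["TB", "TB", "TB", "HIV/AIDS"])` returns "TB", whereas a two-against-two split returns "indeterminate".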

Study-defined CoD
To differentiate HIV-associated TB from other HIV-associated causes, six broad study-defined CoD categories were constructed: TB in an HIV-positive individual; an HIV/AIDS-related cause, excluding TB; a cause unrelated to HIV in an HIV-positive individual; TB in an HIV-negative individual; a cause other than TB in an HIV-negative individual; and an indeterminate cause. As part of both PCVA and reference CoD processes, reviewers assigned each decedent ICD-10 and study-defined CoD, along with a probability of 'definite', 'probable', or 'possible'. For reference CoD only, probabilities were based on predefined criteria (S2 Table). CCVA outputs were not reclassified into study-defined categories as they do not allow HIV-associated TB to be distinguished from other HIV/AIDS-related CoD.

Data management and statistical analyses
VA quantitative data were entered directly into an online database (Mobenzi Technologies, Durban, South Africa) through a cell phone interface; narrative data were captured on paper. Data collected for reference CoD assignment were entered into EpiData v3.1 (The EpiData Association, Odense, Denmark) and data from the parent studies into a SQL database (Bytes Technology Group, Johannesburg, South Africa). InterVA-4.03 was used, with malaria prevalence set to 'Low' and HIV/AIDS to 'High'; InterVA-4 CSMFs were generated by dividing the sum of the likelihoods of each cause category by the sum of likelihoods for all causes [27]. SmartVA-Analyze v1.1.1 was used, with 'Malaria region', 'Health Care Experience', and 'Free text' options deselected; CSMFs, including deaths with 'Undetermined' cause, were calculated after outputs were grouped further (S1 Table). The Mortality Medical Data System (MMDS) 2011 software package [33] was used to generate a single 'underlying' CoD from ICD-10 codes assigned by PCVA and clinical panels; CSMFs were calculated using ACME/TRANSAX output. All analyses were conducted using Stata v14 (StataCorp, College Station, TX, USA).
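The InterVA-4 CSMF calculation described above (the sum of likelihoods for each cause category divided by the sum of likelihoods for all causes) can be illustrated as follows. This is a sketch under stated assumptions, not the InterVA-4 software or the study's Stata code: the input layout, the function name, and the example values are hypothetical.

```python
from collections import defaultdict


def interva_csmf(decedents):
    """Compute CSMFs from per-decedent (cause_category, likelihood) pairs.

    decedents -- list with one entry per decedent; each entry is a list of
                 up to three (cause_category, likelihood) tuples
                 (hypothetical layout, not the InterVA-4 output format).
    """
    totals = defaultdict(float)
    for assigned in decedents:
        for cause, likelihood in assigned:
            # Numerator: summed likelihood for each cause category.
            totals[cause] += likelihood
    # Denominator: summed likelihood across all causes.
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no likelihoods supplied")
    return {cause: value / grand_total for cause, value in totals.items()}


# Illustrative values only (likelihoods as percentages).
example = [
    [("HIV/AIDS", 80), ("TB", 20)],
    [("TB", 60)],
    [("HIV/AIDS", 40)],
]
csmf = interva_csmf(example)
```

In this made-up example the summed likelihoods are 120 for HIV/AIDS and 80 for TB, giving fractions of 0.6 and 0.4. Because the result is a ratio, it does not matter whether likelihoods are supplied as percentages or proportions.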
Two forms of agreement were measured: between CoD assigned to individual decedents; and between the proportion of deaths assigned to each cause category across the study population. Cohen's kappa (Κ) was used to measure agreement between individual decedents and overall chance-corrected concordance (CCC) used for agreement between cause categories; 1 equated to perfect agreement and 0 to agreement no greater than chance. Lin's concordance correlation coefficient (ρc) and CSMF accuracy were calculated for population-level comparisons [34]. In line with previous uses of ρc, a value of less than 0.90 was considered 'poor' agreement [35].
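For readers unfamiliar with these metrics, the sketch below implements them using the definitions commonly given in the VA validation literature; that the study used exactly these formulas is an assumption, since the analyses were run in Stata and the code is not reproduced here. All function names and example data are hypothetical.

```python
from collections import Counter


def cohens_kappa(ref, test):
    """Cohen's kappa for paired individual-level cause assignments."""
    n = len(ref)
    p_obs = sum(r == t for r, t in zip(ref, test)) / n
    ref_n, test_n = Counter(ref), Counter(test)
    # Agreement expected by chance from the marginal distributions.
    p_exp = sum(ref_n[c] * test_n[c] for c in ref_n) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)


def overall_ccc(ref, test, causes):
    """Mean of cause-specific chance-corrected concordance (needs >1 cause)."""
    k = len(causes)
    per_cause = []
    for c in causes:
        n_ref = sum(r == c for r in ref)
        if n_ref == 0:
            continue  # concordance is undefined without reference deaths
        true_pos = sum(r == c and t == c for r, t in zip(ref, test))
        per_cause.append((true_pos / n_ref - 1 / k) / (1 - 1 / k))
    return sum(per_cause) / len(per_cause)


def csmf_accuracy(ref_csmf, test_csmf):
    """1 minus total absolute CSMF error, scaled by its maximum value."""
    causes = set(ref_csmf) | set(test_csmf)
    abs_err = sum(abs(ref_csmf.get(c, 0) - test_csmf.get(c, 0)) for c in causes)
    smallest = min(ref_csmf.get(c, 0) for c in causes)
    return 1 - abs_err / (2 * (1 - smallest))


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired CSMFs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Illustrative data only: six decedents, three cause categories.
ref = ["TB", "TB", "HIV", "HIV", "Other", "Other"]
test = ["TB", "HIV", "HIV", "HIV", "Other", "TB"]
ref_csmf = {"TB": 2 / 6, "HIV": 2 / 6, "Other": 2 / 6}
test_csmf = {"TB": 2 / 6, "HIV": 3 / 6, "Other": 1 / 6}
```

In this toy example four of six individual assignments agree, giving a kappa of 0.5 and an overall CCC of 0.5, while CSMF accuracy is 0.75. This mirrors the general pattern that population-level agreement can exceed individual-level agreement, because misclassifications in opposite directions partly cancel.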

Ethical considerations
Separate approvals were obtained for the parent studies and the sub-study from the human research ethics committees of the London School of Hygiene & Tropical Medicine and the University of the Witwatersrand. Beginning in August 2013, participants in TB Fast Track were asked to give written informed consent for MIA in the event of their death while undergoing follow-up as part of the parent study. If a participant who had given written consent died during follow-up, verbal agreement from the next of kin was obtained to proceed with MIA. For participants who were enrolled to TB Fast Track prior to August 2013 and died during follow-up, formal written consent to undertake MIA was sought from the next of kin. All VA respondents gave written informed consent for interview.
Among the 259 decedents (Table 1), 147 (57%) were female; the median age was 39 (interquartile range [IQR] 33-47) years; the median CD4 count at enrolment into the parent study was 51 (IQR 22-102) cells/μL; 258 (99.6%) were black African; 248 (96%) were South African; and 203 (78%) were enrolled in a peri-urban area, as opposed to semi-rural. The median time from enrolment to death was 84 (IQR 39-184) days and from death to VA was 146 (IQR 82-290) days.

Data availability and consistency in physician assignment
Of the 212 TB Fast Track decedents, 196 (92%) had clinic files available; 122 (58%) had hospital files available; all had research data available; 207 (98%) had data from the NHLS database available; and 34 (16%) had MIA conducted, a median five (IQR 3-6) days after death. Of the 47 XPHACTOR decedents, all had clinic files available; none had hospital files available; all had research data available; and 33 (70%) had data from the NHLS database available (S1 Fig).

Cause-specific comparisons
Compared to L3 and L2, PCVA was more sensitive, but less specific, than both CCVA methods in assigning ICD-10 HIV/AIDS-related CoD (sensitivity against L3

Discussion
This study, conducted among HIV-positive adults in a setting of high TB prevalence, compared CoD assigned by VA to a robust gold standard, including MIA, and found the proportion of deaths attributable to TB was underestimated by methods that did not include data from pathological autopsy. Overall HIV-associated mortality was also underestimated by all VA methods when compared to the L3 (autopsy) reference standard, with poor agreement at an individual level; all methods performed better at a population level.

HIV-associated TB
A recent systematic review of autopsy studies in HIV-positive individuals found that 46% of TB diagnosed at autopsy had not been diagnosed ante-mortem and that almost 90% of HIV-positive individuals with evidence of TB at autopsy had disseminated disease [4]. In our study, every individual with MIA data assigned a reference-standard CoD of HIV-associated TB had evidence of extrapulmonary and/or disseminated disease at post-mortem (S3 Table); 6/16 (38%) individuals with evidence of active TB at autopsy had not been started on TB treatment between enrolment and death. The under-diagnosis of TB in the absence of autopsy data suggests that the long-standing emphasis on respiratory symptoms and sputum-based investigation makes physicians less likely to consider a TB diagnosis in patients who do not report a cough. This has important implications for the development of TB diagnostics, the design of guidelines, and the training of clinicians operating in areas of high HIV prevalence. It may also mean that current CCVA algorithms used to generate VA CoD need recalibration to account for those with advanced HIV and extrapulmonary or disseminated TB, who may have few or no respiratory symptoms.

ICD-10 coding of HIV-deaths
ICD-10 coding does not differentiate HIV-associated TB deaths from deaths due to other HIV-associated causes; deaths due to HIV-associated TB are therefore effectively 'hidden' within HIV-related deaths as a whole. TB is the leading cause of death in HIV-positive individuals and being unable to directly measure mortality attributable to it is an enormous disadvantage. As the global ART rollout continues and the HIV epidemic evolves, it is no longer sufficient to talk simply about deaths due to HIV; a more nuanced approach to disease and mortality measurement is needed [10,37] and central to this approach must be modifications to the ICD system to allow for differentiation of HIV-associated TB deaths from other HIV-related deaths. The current draft of ICD-11, due for release in 2018 [38,39], allows for the separation of HIV disease by clinical stage and for the inclusion of certain co-morbidities, such as TB and malaria. A separate three-character code denoting HIV-associated TB would be a welcome addition to these developments.

Comparison to previous studies
To our knowledge, only one other study, conducted in Kenya, has attempted to compare VA CoD to CoD derived from pathological autopsy. In that study, both PCVA and InterVA-4, when compared to MIA, overestimated mortality due to TB in an area of high HIV prevalence [11]. However, in addition to HIV-positive adults, the Kenyan study included children and HIV-negative individuals and selected only individuals who reported respiratory symptoms. This may have led to the exclusion of those with extrapulmonary TB and no respiratory symptoms and may account for the relatively low prevalence of TB seen at pathological autopsy. Our findings suggest the opposite, that VA underestimates deaths due to TB, but we would nevertheless agree with the authors of the Kenyan study that, at present, VA is not a suitable tool for assigning individual CoD in areas of high HIV prevalence and would be cautious about its use in registering individual deaths [40][41][42].
A number of studies have attempted to use VA to estimate CSMFs in areas of high HIV prevalence using a variety of gold standards. Most studies grouped HIV-associated TB deaths with other HIV/AIDS deaths, as per ICD-10 [10,22,43–51], or did not clearly differentiate between 'TB' and 'HIV/AIDS' categories [52][53][54][55][56]. At least some of this inconsistency is due to the issues with ICD coding discussed above. For example, one of the largest studies, an analysis of 54,000 deaths from the International Network for the Demographic Evaluation of Populations and Their Health (INDEPTH) HDSS sites in Africa and Asia, compared InterVA-4 to a PCVA gold standard; ρc was 0.831 overall and increased to 0.974 when HIV/AIDS and pulmonary TB CoD were combined for deaths from sub-Saharan Africa [57]. This high figure is misleading, however, as the two categories are intended to be mutually exclusive [58]; classifying HIV-associated TB deaths as 'pulmonary TB' will lead to the overall underestimation of HIV-associated deaths if current ICD rules are correctly applied. In our study, 35/41 (85.4%) individuals assigned a 'PTB' CoD by InterVA-4 were reported HIV-positive during the VA interview, but 32/35 (91%) did not have HIV/AIDS mentioned as a second or third CoD. Another important issue is that of extrapulmonary and disseminated disease: the WHO truncated CoD list classifies extrapulmonary and disseminated TB (ICD-10 codes A17-A19) under 'Other or unspecified infectious diseases' [59]; even if PTB and HIV/AIDS categories were combined for analysis purposes, the exclusion of these forms of TB would still result in the underestimation of TB-related deaths.
Only one previous study, conducted in 1998 across sites in Tanzania, Ethiopia, and Ghana, attempted to use VA to differentiate HIV-associated TB from other HIV-associated CoD, comparing CoD from PCVA and an early CCVA algorithm to CoD derived from hospital diagnoses [24]. Similar to our findings, both VA analysis methods showed low sensitivity and high specificity for 'TB + AIDS' diagnoses (respectively, 31% and 99% for PCVA and 35% and 95% for the CCVA algorithm), with PCVA detecting only 11/35 (31%) cases of 'TB + AIDS'. More recently, in the construction of the PHMRC gold standard dataset, CoD were initially classified as 'AIDS'; 'AIDS with TB'; or 'Pulmonary TB', with criteria explicitly stated for each [32]. The categories were consistent with ICD-10: inclusion in the 'Pulmonary TB' category required the individual to have tested HIV-negative. However, to be included in the 'AIDS with TB' category, an individual was required to have both a positive HIV test and a positive culture for M. tuberculosis, which likely led to the exclusion of individuals with disseminated TB and limited or no respiratory symptoms. To date, all comparisons of VA to the PHMRC dataset, including those conducted by the PHMRC team, have combined the 'AIDS' and 'AIDS with TB' categories, and have therefore not attempted to assess VA's ability to detect HIV-associated TB [19,20,30,[60][61][62][63][64]. The PHMRC gold standard dataset nevertheless remains a valuable resource; we would suggest that any future validation exercises use the differentiated 'AIDS with TB' and 'AIDS' categories, rather than the combined 'AIDS' category, for comparison to VA.

Moving forward
In the absence of robust, validated CRVS data, there are few alternatives to VA that are both feasible and cost-effective in generating estimates of cause-specific mortality in countries with high HIV and TB prevalence [65,66]. Although, in this study, VA methods performed poorly in assigning individual CoD, it should be noted that VA is primarily intended to generate population-level estimates [18], and that performance in this regard was better. However, when using study-defined codes, which were designed to allow for the differentiation of HIV-associated TB from other HIV-associated causes, the population-level accuracy of PCVA was still sub-optimal (ρc 0.70 and CSMF accuracy 0.71 compared to L2 standard [n = 259]; Table 3), confirming the difficulty of making this distinction.
The challenges of diagnosing HIV-associated TB disease are well documented [67,68] and, as found in the systematic review of autopsy studies [4], in the absence of new diagnostics it is likely that clinicians will continue to underdiagnose TB, which will have important implications for measuring progress towards the WHO targets described above [16]. Improvements are needed to TB surveillance methods, which, at present, consist mostly of enumerating individuals already diagnosed and started on treatment [69][70][71]. MIA is a useful technique for estimating the prevalence of infectious diseases [72,73], is acceptable to a high proportion of families [26,74], and could be used periodically for surveillance at sentinel sites [75], allowing for more accurate evaluation of the impact of disease-focused interventions.
Population-level estimates of cause-specific mortality are extremely valuable and improving the accuracy of VA-generated estimates would be of benefit, regardless of whether or not VA is used to assign individual CoD. The continued development and sharing of gold standard datasets that include pathological autopsy data, better reflecting the high proportions of HIV-associated mortality seen in high-burden countries and including both hospital and community deaths in different populations, would allow for greater standardisation in future validation studies. The parallel development of a structured, standardised process for CoD assignment, similar to that described in the Coding Causes of Death in HIV (CoDe) project [76], but assigning CoD matched to ICD codes [77], would increase the value of this exercise.

Limitations and strengths
This study had limitations: the median time from death to VA was slightly longer than the ideal three months that some recommend, but was well within the maximum 12 months recommended by WHO and was therefore considered unlikely to have had a substantial effect on VA-generated estimates [78,79]; physicians who reviewed clinical and VA data were aware that most decedents were likely HIV-positive and had been enrolled into TB-focused studies, which may have led to greater assignment of HIV- and TB-related CoD; missing operational and research data may have affected consistency of the reference standard; pathological autopsy data were available for a small number of decedents; and, although the reference CoD assigned represent our best estimates using the data available, the true CoD may still differ. Questions on ART and TB treatment, added to the VA instrument by the study team, may have led to changes in how events were reported in the free narrative section; the answers to the questions themselves, however, were not provided to reviewing physicians or to either software. InterVA-4 and SmartVA-Analyze are designed for use with the WHO 2014 and PHMRC VA instruments, respectively; using the WHO 2012 instrument may therefore have resulted in some missing variables. Healthcare and narrative data were not provided to SmartVA-Analyze, which may have affected its assignment of CoD. Individuals included in this analysis are likely representative of those with advanced HIV disease in resource-scarce settings, but may not necessarily represent the patterns of mortality seen in the wider community. This may have affected measures of agreement that are dependent on the composition of the gold standard CSMF and may, in turn, limit the generalisability of our findings.
This study's strengths include: having recent, reliable information regarding CD4 count, ART status, and investigation and treatment of TB; using the same physicians to assign PCVA CoD for all decedents; using robust methods to assign reference CoD; comparing VA-assigned CoD to a reference standard that included MIA findings; comparing between CoD using a range of metrics, allowing for evaluation of different potential applications of VA; and classifying HIV-associated TB separately from other HIV-associated CoD, something made difficult by ICD-10 and generally neglected by previous VA studies.

Conclusions
Current VA methods underestimate mortality due to HIV-associated TB. At present, VA does not assign individual CoD in areas of high HIV prevalence with sufficient accuracy and, in part due to the limitations of ICD-10, does not distinguish between deaths due to HIV-associated TB and advanced HIV disease. More accurate methods are needed that allow for direct estimation of deaths due to HIV-associated TB; unless TB mortality is more accurately measured, it will be extremely difficult to track progress towards the goals set by the post-2015 global strategy.

Acknowledgments
This study would not have been possible without the generosity, understanding, and compassion of the individuals and families who gave their consent for autopsy. Our thanks also to staff at the mortuaries, clinics, hospitals, and laboratories, and to the TB Fast Track and XPHACTOR study teams.
Physician reviewers: Prof Lucille Blumberg and Drs Kim Roberg, Sarah Stacey, Michelle Venter, Emily Wong, Evan Shoul, and David Spencer.
Thanks also to Prof. Sebastian Lucas and Dr Stephen Morris-Jones for their guidance in developing the MIA procedure, Dr Nicholas Maire and Vinit Mishra for their assistance in mapping the VA instrument to SmartVA-Analyze, and to Sizzy Ngobeni for her assistance in training research staff in VA methods.

Author Contributions
Conceptualization: ADG KK DC KM SC KLF ASK.