The Clinical Usefulness of Tuberculin Skin Test versus Interferon-Gamma Release Assays for Diagnosis of Latent Tuberculosis in HIV Patients: A Meta-Analysis

Background Accurate diagnosis of latent tuberculosis infection (LTBI) is of growing concern because the expanding HIV epidemic has increased the risk of reactivation to active tuberculosis (TB). LTBI is diagnosed by the tuberculin skin test (TST) and interferon-gamma release assays (IGRAs). Objectives The aim of the present study was to conduct a meta-analysis of published papers on the agreement (kappa) between TST and QuantiFERON-TB Gold In-Tube (QFT-GIT) tests for the diagnosis of LTBI in HIV patients. Methods Electronic databases including PubMed/Medline, Elsevier/Scopus and Embase/Ovid were searched up to Jan. 2016. We performed a random-effects meta-analysis to estimate the pooled kappa between the two diagnostic methods. Meta-regression was used to assess potential sources of heterogeneity, and Egger’s test was used to assess small-study effects and publication bias. Results The initial search strategy produced 6744 records. Of these, 23 cross-sectional studies met the inclusion criteria and 20 studies entered the meta-analysis. The pooled kappa and the prevalence-adjusted and bias-adjusted kappa (PABAK) were 0.37 (95% CI: 0.28, 0.46) and 0.59 (0.49, 0.69), respectively. TST-/QFT-GIT+ discordance was more frequent than TST+/QFT-GIT- discordance. The kappa estimate between the two tests was linearly associated with age and prevalence index and inversely associated with bias index. Conclusion Fair agreement between TST and QFT-GIT makes it difficult to know whether TST is as useful as QFT-GIT in HIV-infected patients. The higher frequency of TST-/QFT-GIT+ compared with TST+/QFT-GIT- discordance may indicate a higher sensitivity of QFT-GIT for diagnosing LTBI in HIV patients. Disagreement between the two tests can be influenced by measurement error and the prevalence of HIV.


Introduction
Co-infection with human immunodeficiency virus (HIV) and tuberculosis (TB) is a major public health concern. Recent estimates indicate about 9 million new cases of TB annually worldwide, of which 1.1 million occur in people living with HIV. Additionally, 1.5 million persons die from TB each year, 360,000 of whom are HIV-positive [1]. The risk of progression from latent tuberculosis infection (LTBI) to active TB has been shown to be 12 to 20 times greater for people living with HIV than for those without HIV infection [2]. Most deaths from TB in HIV/AIDS patients are also considered preventable with accurate diagnosis and treatment [3]. The tuberculin skin test (TST) and interferon-gamma release assays (IGRAs) are the main tests currently available for the diagnosis of LTBI.
The TST, also called the Mantoux tuberculin test or purified protein derivative (PPD) test, has some disadvantages as a standard method of determining Mycobacterium tuberculosis infection: false-positive reactions due to infection with non-tuberculous mycobacteria or a history of BCG vaccination [4], as well as false-negative reactions in the presence of a weakened immune system, as in HIV infection [5]. IGRAs were introduced to address these challenges [6]. The QuantiFERON-TB Gold In-Tube test (QFT-GIT) and the T-SPOT.TB test (T-Spot) are the two commercially available IGRAs.
Some systematic reviews indicate that, in comparison with TST, IGRAs can detect LTBI with higher specificity and higher negative (NPV) and positive (PPV) predictive values [7,8]. The comparative performance of IGRAs and TST in LTBI detection is not clear, especially in persons with HIV infection. Many studies have reported the agreement (kappa) between QFT-GIT and TST in HIV-positive people, and the kappa values varied, for example 0.29 in one study in Georgia [9] and 0.59 in Chile [10]. In addition, factors including gender, age and CD4+ T-cell count can influence the diagnosis of LTBI using TST and/or QFT-GIT [11], and consequently the concordance or discordance between the two tests.
The main limitation of the data on the agreement of TST and QFT-GIT in HIV-infected persons is small sample size [12,13]. Therefore, a pooled analysis such as a meta-analysis, using a single measure with high precision, is needed. To the best of our knowledge, no systematic review or meta-analysis has evaluated the agreement (kappa) between TST and QFT-GIT in LTBI detection among HIV-infected people. The aim of this study was to provide reliable evidence and clarify issues regarding this agreement using a systematic review and meta-analysis.
Reference lists of the considered papers, reviews, meta-analyses, letters and other relevant documents were searched, and the authors of retrieved papers were contacted for additional information. Primary eligibility criteria for inclusion were: 1) studies that included HIV-positive participants, 2) studies that had original data to calculate the kappa coefficient and its standard error. The manufacturer's cut-off value for QFT-GIT is ≥ 0.35 IU/ml, and blood sampling for QFT-GIT was done before the TST. Papers were excluded if they: 1) studied HIV-positive people with active TB, 2) assessed agreement between one-step TST and serial QFT-GIT, or 3) were reviews, cost analyses or letters without original data.

Data extraction and quality assessment
After eliminating duplicate records, two authors (EA and ADA) independently screened the titles for relevance and study selection. Abstracts from selected citations were independently reviewed for further relevance; in cases of disagreement, a third consultant (EM) acted as an arbitrator. The following items were extracted from the included studies and recorded in a checklist: first author, year of publication, study setting (country), gender, mean age, sample size, history of BCG vaccination at infancy (yes, no, unknown, non-discrimination), TST cut-off for positivity (diameter of induration), mean/median CD4 T-cell count, and the number of subjects with positive and negative TST/QFT-GIT. The incidence rate of TB per 100,000 in the study location was extracted from the WHO Global Tuberculosis Report 2015 [15].
The reporting bias of the studies included in the meta-analysis was assessed with a checklist modified from the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement [16]. The following criteria were assessed: (a) a clear definition of the study population; (b) description of the setting, locations, and relevant dates; (c) an exact definition of the outcome, such as LTBI diagnosis by the TST and/or the QFT-GIT; (d) eligibility criteria for the participants; (e) an explanation of how the study size was determined; (f) figures reflecting the number of outcomes associated with each test; and (g) an explanation of when each test was conducted, such as whether blood sampling for the QFT-GIT took place before the TST. Studies that fulfilled all of the above criteria were classified as having a low risk of bias. Studies that failed to meet one criterion were classified as having an intermediate risk of bias, and studies that failed to meet more than one criterion were classified as having a high risk of bias.

Statistical methods
Cohen's kappa statistic, κ, is a measure of agreement between categorical variables. It estimates true agreement after accounting for the agreement occurring by chance; that is, kappa expresses the proportion of agreement beyond that expected by chance. It is defined as the observed agreement not due to chance relative to the maximum non-chance agreement [17]. Standard error (SE) and a 95% confidence interval (CI) for kappa were calculated using the methods described by Fleiss et al. [18]. To calculate the kappa estimate, only positive/negative results of the two tests were considered, and uninformative results such as indeterminate ones were discarded. The Landis and Koch criteria were applied to interpret the degree of agreement: <0: poor, 0-0.20: slight, 0.21-0.40: fair, 0.41-0.60: moderate, 0.61-0.80: substantial and 0.81-1.00: almost perfect [19]. The inverse variance method was used to estimate the weighted pooled Cohen's kappa. The I² statistic was used to evaluate between-study heterogeneity.
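As an illustration of the definition above, the following minimal Python sketch computes Cohen's kappa from the four cells of a 2×2 table of TST and QFT-GIT results. The function name, the cell labels (a, b, c, d) and the counts in the example are hypothetical and are not taken from any included study; the standard error of Fleiss et al. is not reproduced here.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table.

    Hypothetical cell labels (assumed for this sketch):
      a = TST+/QFT-GIT+   b = TST+/QFT-GIT-
      c = TST-/QFT-GIT+   d = TST-/QFT-GIT-
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table is empty")
    # Observed agreement: proportion of subjects on the diagonal.
    p_observed = (a + d) / n
    # Chance agreement: product of the marginal proportions, summed
    # over the positive and the negative category.
    p_chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    # Agreement beyond chance relative to the maximum possible.
    return (p_observed - p_chance) / (1 - p_chance)


if __name__ == "__main__":
    # Illustrative counts only.
    print(round(cohen_kappa(20, 5, 10, 65), 3))
```

With these illustrative counts the observed agreement is 0.85 and the chance agreement is 0.60, so kappa is 0.25/0.40 = 0.625.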
In the presence of heterogeneity, a random effects model was used to pool the effect estimates. Meta-regression was applied to investigate which factors explain the heterogeneity among the individual studies included in the meta-analysis. A forest plot was used to show the kappa (95% CI) of the individual studies that went into the meta-analysis. Publication bias was evaluated using a funnel plot with the test of Begg et al. [20] and the linear regression asymmetry test of Egger et al. [21].
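The pooling step can be sketched as follows. The text does not state which random-effects estimator was used, so the DerSimonian-Laird estimator below is an assumption made for illustration; the function name and the returned keys are likewise hypothetical. The sketch takes per-study kappa estimates and their standard errors and returns the inverse-variance random-effects pooled estimate with Cochran's Q, I² and the between-study variance.

```python
import math


def pool_random_effects(estimates, std_errors):
    """Random-effects pooling (DerSimonian-Laird, assumed for this sketch)."""
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need at least two estimates with matching SEs")

    # Fixed-effect (inverse variance) weights and pooled estimate.
    w = [1.0 / se**2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and the I^2 heterogeneity statistic (in percent).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (se**2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    return {
        "pooled": pooled,
        "se": se_pooled,
        "ci95": (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }
```

When the studies are homogeneous, tau² is estimated as zero and the result coincides with the fixed-effect inverse variance estimate.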
Post hoc sensitivity analyses were conducted to estimate the adjusted kappa. Kappa has been shown to be affected by prevalence and bias: the prevalence of disease and the extent to which the tests disagree on the proportion of positive or negative cases (measurement bias) can influence the kappa coefficient. With a and d denoting the concordant cells and b and c the discordant cells of the 2×2 table of n subjects, the prevalence and bias indexes were calculated as |a-d|/n and |b-c|/n, respectively. When these two indexes are large, the kappa will be biased [22]. PABAK (prevalence-adjusted bias-adjusted kappa) has been suggested to simulate the situation in which there are no prevalence or bias effects. The adjustment for prevalence and bias is achieved by substituting the mean of cells a and d for a and d and the mean of cells b and c for b and c [23]. The data analyses were performed using Stata software (version 11/SE).
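The indexes and the adjustment described above can be sketched in a few lines of Python. The function names and the counts are hypothetical; a and d are taken to be the concordant cells and b and c the discordant cells of the 2×2 table. The sketch applies the cell substitution literally and recomputes kappa, which reduces algebraically to twice the observed agreement minus one.

```python
def kappa_2x2(a, b, c, d):
    """Cohen's kappa for a 2x2 table (a, d concordant; b, c discordant)."""
    n = a + b + c + d
    p_observed = (a + d) / n
    p_chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (p_observed - p_chance) / (1 - p_chance)


def prevalence_index(a, b, c, d):
    """|a - d| / n: imbalance between the two concordant cells."""
    return abs(a - d) / (a + b + c + d)


def bias_index(a, b, c, d):
    """|b - c| / n: imbalance between the two discordant cells."""
    return abs(b - c) / (a + b + c + d)


def pabak(a, b, c, d):
    """Prevalence-adjusted bias-adjusted kappa.

    Replaces a and d by their mean and b and c by their mean,
    then recomputes kappa on the adjusted table.
    """
    concordant_mean = (a + d) / 2
    discordant_mean = (b + c) / 2
    return kappa_2x2(concordant_mean, discordant_mean,
                     discordant_mean, concordant_mean)


if __name__ == "__main__":
    cells = (20, 5, 10, 65)  # illustrative counts only
    print(prevalence_index(*cells), bias_index(*cells), pabak(*cells))
```

For these illustrative counts the prevalence index is 0.45, the bias index is 0.05 and PABAK is 0.70, compared with an unadjusted kappa of 0.625.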

The characteristics of included studies
In total, 23 studies fulfilled the inclusion criteria. Twenty records were available for meta-analysis [9-13, 24-38]. Three studies [39-41] fulfilled the eligibility criteria to be included in the meta-analysis, but their data were not in a usable format to calculate the kappa estimate. A PRISMA flow chart illustrating the details of the selection process is presented in Fig 1. The characteristics of all included studies are summarized in Table 1. The sample sizes of the included studies ranged from 16 to 553 and amounted to 4050 subjects in total. In 13 of the 20 studies included in the meta-analysis, the number of TST-/QFT-GIT+ results was higher than the number of TST+/QFT-GIT- results. A larger difference was observed between the numbers of TST+/QFT-GIT+ and TST-/QFT-GIT- results. In one study [36], the values of the contingency table were not reported and the SE was estimated from the width of the confidence interval. In all included studies, a positive TST was defined as ≥ 5 mm for HIV-infected patients, except in the study by Kabeer et al. [35]. The incidence rate of TB in the settings of all studies included in the meta-analysis was under 22 per 100,000 population, except for the studies conducted in Tanzania [33], India [35] and Georgia [9], with incidence rates of 327, 167 and 106 per 100,000, respectively. In line with the WHO high TB burden country lists [15], we classified three of the included studies [9,33,35] as studies in high-burden countries and the others as studies in low-burden countries [10-13, 24-27, 29-31, 34, 37, 38].
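For the study whose contingency table was unreported, the standard error was recovered from the width of the confidence interval. A minimal Python sketch of that back-calculation is given here, assuming a symmetric, normal-approximation interval; the function name and the example interval are hypothetical and do not reproduce the values of study [36].

```python
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Standard error implied by a symmetric normal-approximation CI."""
    if not lower < upper:
        raise ValueError("lower bound must be below upper bound")
    if not 0 < level < 1:
        raise ValueError("level must lie strictly between 0 and 1")
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # The interval spans z standard errors on each side of the estimate.
    return (upper - lower) / (2 * z)


if __name__ == "__main__":
    # Illustrative interval only.
    print(round(se_from_ci(0.20, 0.60), 3))
```

An interval of (0.20, 0.60) has a width of 0.40, which corresponds to a standard error of about 0.102.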

The quality assessment and publication bias
Quality assessment of the studies showed four studies of low quality [26, 27, … discordance of TST-/QFT-GIT+ was more than TST+/QFT-GIT- (Table 1). No evidence of publication bias was found (Egger's test: p = 0.48) (Fig 2).

Statistical analysis results
The pooled kappa coefficient between TST and QFT-GIT was 0.37 (95% CI: 0.28, 0.46), with significant heterogeneity among studies (I² = 77.6%, p<0.001) (Fig 3). Analysis stratified by continent showed kappa estimates (95% CI) of 0.24 (0.10, 0.36) for North America, 0.44 (0.32, 0.57) for Europe, 0.52 (0.41, 0.63) for Africa and 0.30 (0.12, 0.48) for Asia. Among studies in which some subjects had a history of BCG vaccination, the kappa estimate was 0.41 (0.33, 0.49), while it was 0.37 (0.28, 0.46) for studies in which the history of BCG vaccination was unknown. The kappa estimates for high and low TB burden settings were 0.36 (0.27, 0.45) and 0.37 (0.26, 0.48), respectively. In the subgroup analysis by study quality, the kappa estimates (95% CI) for low-, medium- and high-quality studies were 0.34 (0.26, 0.41), 0.80 (0.77, 0.83) and 0.79 (0.77, 0.82), respectively. The meta-regression plots showed that age and prevalence index were linearly related to kappa and that bias index was inversely related (Fig 4A-4C). The results suggested that the kappa varied significantly by age, prevalence index and bias index.

Discussion
This meta-analysis of the 20 included studies showed that the agreement between TST and QFT-GIT was fair (kappa = 0.37). Disagreement between TST and QFT-GIT could be attributed to age, country, the prevalence of HIV infection and bias in measurements. The major limitation of most of the included studies was small sample size. Small samples yield unreliable kappa estimates, so a pooled analysis such as a meta-analysis was needed to increase precision through a larger sample size. The kappa estimate varied across the studies (0 to 0.71), which could be due to test characteristics, study settings and/or subject characteristics. It has been reported that in clinic-based studies the number of QFT-GIT- and/or TST-positive subjects was limited [12]. In addition, true prevalence can affect the kappa estimate [42].
In this review, the number of subjects with a positive QFT-GIT and a negative TST was relatively high compared with the number with a negative QFT-GIT and a positive TST; this result is in line with a meta-analysis showing that QFT-GIT is a more sensitive test for detecting LTBI [7]. One study showed that some factors explain the non-concordance between the two tests; co-infections and male gender were associated with a positive TST and a positive QFT-GIT, respectively [9]. The high concordance of a negative TST with a negative QFT-GIT in this study may be ascribed to the prevalence of HIV infection; most included studies came from areas with a low prevalence of infection.
Cattamanchi … included in our meta-analysis was trivial, so calculation of the weighted kappa was unwarranted because of the loss of precision. In this study, because reversion from positive to negative in serial QFT-GIT testing of HIV-infected patients has been reported [45], a one-step test was considered for both tests to obtain a valid estimate of agreement.
In this meta-analysis, CD4 counts in all subjects were high. It has been reported that when the CD4 count is low, anergic TST responses and low-mitogen QFT-GIT responses occur [10]; the CD4 count may therefore be an effect modifier, and the kappa estimate may not be uniform across its range. Some studies in countries with a high TB prevalence showed that negative QFT results were more frequent in individuals with a low CD4+ T-cell count [40]. However, one study in Georgia showed that among patients with CD4 counts 100 μl the IGRAs were more sensitive than TST [9].
The imbalance among the cells of the contingency tables, that is, the large difference between concordant and discordant counts, was high in the included studies. This leads to heterogeneity in the marginal distributions, so the kappa estimate may be far from the real value [46]. Sensitivity analysis showed that the kappa between the two tests could be substantial in some areas, such as North America and Europe. The use of PABAK to simulate the situation of no prevalence and no bias has been criticized, because PABAK can generate a value for kappa that is far from the situation in which the original tests are made [47].

Limitations of the study
Our study has some limitations that should be considered. Standard errors and confidence intervals for kappa are not reported in all published studies; this deficit, along with the gray literature that was not retrieved, could affect the estimate of agreement. Uniform data on BCG vaccination and CD4 T-cell count were not available in the included studies, and their effect on the variation between studies was not clear.

Conclusion
In this study, the pooled kappa estimate was 0.37 (0.28, 0.46). The fair agreement between the two tests makes it unclear which test is optimal for detecting LTBI. Age, the prevalence of HIV infection or bias in measurements may be related to the agreement between the two tests. Further studies are needed to assess the agreement of the two tests in detecting active TB. A network meta-analysis to obtain a valid estimate of agreement among TST, QFT-GIT and T-SPOT is recommended.