Interferon-γ Release Assays for the Diagnosis of Tuberculosis and Tuberculosis Infection in HIV-Infected Adults: A Systematic Review and Meta-Analysis

Background Despite the widespread use of interferon-γ release assays (IGRAs), their role in diagnosing tuberculosis and targeting preventive therapy in HIV-infected patients remains unclear. We conducted a comprehensive systematic review to contribute to the evidence-based practice in HIV-infected people. Methodology/Principal Findings We searched MEDLINE, Cochrane, and Biomedicine databases to identify articles published between January 2005 and July 2011 that assessed QuantiFERON®-TB Gold In-Tube (QFT-GIT) and T-SPOT®.TB (T-SPOT.TB) in HIV-infected adults. We assessed their accuracy for the diagnosis of tuberculosis and incident active tuberculosis, and the proportion of indeterminate results. The search identified 38 evaluable studies covering a total of 6514 HIV-infected participants. The pooled sensitivity and specificity for tuberculosis were 61% and 72% for QFT-GIT, and 65% and 70% for T-SPOT.TB. The cumulative incidence of subsequent active tuberculosis was 8.3% for QFT-GIT and 10% for T-SPOT.TB in patients tested positive (one study each), and 0% for QFT-GIT (two studies) and T-SPOT.TB (one study) respectively in those tested negative. Pooled indeterminate rates were 8.2% for QFT-GIT and 5.9% for T-SPOT.TB. Rates were higher in high burden settings (12.0% for QFT-GIT and 7.7% for T-SPOT.TB) than in low-intermediate burden settings (3.9% for QFT-GIT and 4.3% for T-SPOT.TB). They were also higher in patients with CD4+ T-cell count <200 (11.6% for QFT-GIT and 11.4% for T-SPOT.TB) than in those with CD4+ T-cell count ≥200 (3.1% for QFT-GIT and 7.9% for T-SPOT.TB). Conclusions/Significance IGRAs have suboptimal accuracy for confirming or ruling out active tuberculosis disease in HIV-infected adults. While their predictive value for incident active tuberculosis is modest, a negative QFT-GIT implies a very low short- to medium-term risk. 
Identifying the factors associated with indeterminate results will help to optimize the use of IGRAs in clinical practice, particularly in resource-limited countries with a high prevalence of HIV-coinfection.


Introduction
Tuberculosis is one of the leading causes of mortality in people living with human immunodeficiency virus (HIV) worldwide, particularly in sub-Saharan Africa, where it is responsible for up to half of HIV-related deaths [1,2].
HIV co-infection increases the risk of tuberculosis either by facilitating reactivation of a remote latent tuberculosis infection (LTBI) or by favoring the progression of a recently acquired infection towards active disease. Therefore, rapid identification and early treatment of active tuberculosis cases in order to interrupt further transmission, as well as the detection and treatment of LTBI to prevent progression to active disease, are crucial for controlling HIV-associated tuberculosis [3]. However, the lack of accuracy of clinical and radiographic manifestations of tuberculosis in HIV-infected patients and the limitations of diagnostic tests pose great obstacles to rapid diagnosis and delay the initiation of specific treatment [4]. Furthermore, the well-known shortcomings of the tuberculin skin test (TST) for diagnosing LTBI hamper the accurate targeting of HIV-infected patients for isoniazid preventive therapy (IPT) [5].
T-cell-based interferon-gamma (IFN-γ) release assays (IGRAs) constitute a promising alternative to TST for diagnosing tuberculosis infection. IGRAs use highly M. tuberculosis-specific antigens which are not present in most non-tuberculous mycobacteria or in the bacillus Calmette-Guérin vaccine [6]. Two commercial tests are available: the QuantiFERON®-TB Gold In-Tube (QFT-GIT) test (Cellestis Ltd, Carnegie, Australia), which uses ELISA to detect IFN-γ in the culture supernatant, and the T-SPOT®.TB (Oxford Immunotec, Abingdon, UK), which is based on the enzyme-linked immunospot (ELISpot) assay.
In low-burden tuberculosis settings, IGRAs have shown better specificity and equal or greater sensitivity than TST for the detection of tuberculosis infection, and a better correlation with the intensity of exposure to a source of infection [7][8][9]. These advantages have raised great hopes for a better assessment of tuberculosis infection in people at risk, particularly in immunosuppressed and BCG-vaccinated individuals. Despite the absence of supporting evidence, IGRAs have also been increasingly used as diagnostic tests for active tuberculosis. This practice has raised concern, particularly in high-burden and resource-limited countries, where the high background LTBI prevalence and the HIV-associated immunosuppression may limit their potential value as rule-in or rule-out tests. Based on recently published meta-analyses showing a suboptimal accuracy for either diagnosing or predicting subsequent active tuberculosis [10][11][12], the World Health Organization (WHO) issued a consensus statement in which an expert panel advised against the use of IGRAs for diagnosing active tuberculosis, irrespective of HIV status, or for identifying people at risk for active tuberculosis disease in low- and middle-income countries [13]. With regard to HIV-infected people, the WHO report stressed the very low quality of evidence for using IGRAs in these patients, and recommended that these tests should not replace TST for the assessment of LTBI [13].
Although IGRAs were not developed to replace conventional microbiological methods for the diagnosis of active tuberculosis disease, they may have an adjunctive role in symptomatic patients with suspicion of active disease by complementing clinical-radiographic and epidemiological data to guide the diagnostic workup. Therefore, knowing how HIV infection compromises the IGRAs' ability to detect tuberculosis infection in patients with active disease is essential in order to determine their role in different clinical and epidemiological settings.
We conducted a comprehensive systematic review (SR) to assess the sensitivity and specificity of IGRAs for the diagnosis of active tuberculosis disease, their value to predict development of subsequent active tuberculosis, and the proportion of indeterminate results in HIV-infected adults. Whenever feasible, we assessed how HIV-associated CD4+ T-cell depletion affects IGRA performance, and tried to identify differences according to tuberculosis burden settings and HIV infection status.

Methods
This SR was conducted in accordance with the PRISMA statement [14]. Ethical approval was not required for this study.

Search
We systematically searched for studies published between 1 January 2005 and 31 July 2011 that evaluated the diagnostic performance of IGRAs for tuberculosis or LTBI in HIV-positive adult populations (or populations with at least five HIV-positive individuals). We searched MEDLINE, the Cochrane Central Register of Controlled Trials, and the Biomedicine Database (IME) of the Spanish National Research Council (CSIC). Searches comprised a combination of the following terms: "HIV", "immunosuppressed patients", "tuberculosis", "latent tuberculosis infection", "QuantiFERON", "QuantiFERON-TB Gold", "T-SPOT.TB", "interferon-gamma release assays", and "T-cell assays", as listed in titles, abstracts or text words. Searches were limited to studies published in English or Spanish. We also reviewed citations of the original and review articles, and guidelines for additional references. When necessary, we contacted the authors of the studies for additional information.

Selection
For our analysis, we selected only prospective studies that used the commercial tests QuantiFERON®-TB Gold In-Tube and T-SPOT®.TB performed in blood with 16-24 h of incubation. We excluded studies of non-commercial IGRAs or studies based on the old version of the ELISA assay (QuantiFERON®-TB Gold), as well as studies presenting non-original data, conference abstracts, editorials, reviews, guidelines, and studies conducted in animals.

Quality assessment
We checked the quality of the studies used to calculate assay accuracy with the QUADAS checklist [15].
In the case of indeterminate results, we appraised the quality of the studies by assessing whether or not a definition was given in the methods section ("performed and interpreted according to the manufacturer's instructions" was acceptable), and whether or not data for the two types of indeterminate tests (low IFN-γ production in the positive control or high IFN-γ production in the negative control) were reported separately. In addition, since an insufficient number of peripheral blood mononuclear cells (PBMCs) precludes performance of the T-SPOT.TB test, we also checked whether or not these unsuccessful test attempts (failure tests) had been reported.
To evaluate the quality of the studies that assessed the risk of subsequent tuberculosis according to the result of an IGRA assay, we used the Newcastle-Ottawa Scale (NOS) for non-randomized cohort studies [16].

Data extraction
Two researchers (M.S. and L.M.) independently compiled the data using a standardized data extraction sheet. Discrepancies were resolved by discussion and consensus. The following data were extracted: year of publication, period and country, number of participants, gender, test evaluated, CD4+ cell count, development of active tuberculosis, TST results, indeterminate test results (overall and by two CD4+ cell count thresholds) and fraction of individuals with true positive, false negative, true negative and false positive results for the calculation of the test sensitivity and specificity.

Quantitative data synthesis and analysis
We assessed the following outcomes for each study and pooled them when feasible: sensitivity and specificity for active tuberculosis, predictive value for incident active tuberculosis, and rates of indeterminate results. The following definitions were used: sensitivity refers to the proportion of culture-proven tuberculosis patients who had a positive IGRA test, and specificity refers to the proportion of symptomatic non-tuberculosis patients who had a negative IGRA test. For the sensitivity calculation, we included only patients with confirmed tuberculosis (either with a positive culture for M. tuberculosis, a positive nucleic acid amplification test, or characteristic histopathological findings and response to specific treatment) that was still untreated or had been treated for less than two weeks. Indeterminate results were included as false negatives. For the specificity calculation, we selected studies that had enrolled patients with suspected active tuberculosis (either symptoms potentially caused by tuberculosis or a clinical and radiographic picture suggestive of tuberculosis). Results due to low IFN-γ production in the phytohaemagglutinin (PHA)-stimulated well or high background IFN-γ production were defined as indeterminate.
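The definitions above can be illustrated with a minimal Python sketch. The class and function names are hypothetical, the counts are invented for illustration and do not come from any included study, and the exclusion of indeterminate results from the specificity denominator is an assumption, since the text specifies their handling only for sensitivity.

```python
from dataclasses import dataclass


@dataclass
class StudyCounts:
    """Hypothetical per-study counts (illustrative container, not from the paper)."""
    tp: int                # confirmed tuberculosis, IGRA positive
    fn: int                # confirmed tuberculosis, IGRA negative
    indeterminate_tb: int  # confirmed tuberculosis, IGRA indeterminate
    tn: int                # symptomatic, tuberculosis excluded, IGRA negative
    fp: int                # symptomatic, tuberculosis excluded, IGRA positive


def sensitivity(c: StudyCounts) -> float:
    """Proportion of confirmed cases with a positive IGRA.

    Indeterminate results in confirmed cases are counted as false negatives,
    as stated in the text.
    """
    denominator = c.tp + c.fn + c.indeterminate_tb
    if denominator == 0:
        raise ValueError("no confirmed tuberculosis cases")
    return c.tp / denominator


def specificity(c: StudyCounts) -> float:
    """Proportion of symptomatic non-tuberculosis patients with a negative IGRA.

    Assumption: indeterminate results are left out of this denominator.
    """
    denominator = c.tn + c.fp
    if denominator == 0:
        raise ValueError("no non-tuberculosis patients")
    return c.tn / denominator


# Invented counts, for illustration only.
example = StudyCounts(tp=61, fn=29, indeterminate_tb=10, tn=72, fp=28)
print(f"sensitivity = {sensitivity(example):.2f}")
print(f"specificity = {specificity(example):.2f}")
```

Counting indeterminate results as false negatives lowers the sensitivity estimate relative to dropping them, which is why the way a study reports indeterminate results matters for the pooled figures.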
Since T-SPOT.TB tests not performed due to an insufficient number of PBMCs are usually excluded from the analyses, we used the term "failure tests" for tests that could not be performed. We assessed the effect of immunosuppression on sensitivity and indeterminate result rates by pooling the results and stratifying them at a threshold of 200 CD4+ T-cells/mm³.
Results are presented for each IGRA assay and for countries grouped by tuberculosis burden: high-burden (>40 cases per 100,000 population) and low- to intermediate-burden (<40 cases per 100,000 population) [17]. Head-to-head comparisons between HIV-infected and HIV-uninfected individuals, as well as between the two IGRA assays and TST, were performed whenever possible.
We calculated combined estimates of pooled sensitivity, specificity and the 95% confidence interval (CI). The pooled effect for binary outcomes was presented as the difference with the 95% CI. A random-effects meta-analysis model was used to pool the effect across the studies. Inconsistency was quantified by the I² statistic. Forest plots were constructed to show the effect size of all the studies and the variability of the pooled estimates. The analyses were performed with MetaAnalyst software [18].
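As an illustration of the pooling step, the following is a minimal sketch of a DerSimonian-Laird random-effects model applied to logit-transformed proportions, together with the I² statistic. It is not a reproduction of the MetaAnalyst implementation: the choice of the DerSimonian-Laird estimator, the logit transform, the 0.5 continuity correction, and all function names are assumptions made for illustration.

```python
import math


def logit_and_variance(events: float, total: float) -> tuple[float, float]:
    """Logit of a proportion and its approximate variance.

    Assumption: a 0.5 continuity correction is applied when the
    proportion is exactly 0 or 1.
    """
    if total <= 0 or not 0 <= events <= total:
        raise ValueError("invalid counts")
    if events == 0 or events == total:
        events, total = events + 0.5, total + 1.0
    p = events / total
    return math.log(p / (1 - p)), 1 / events + 1 / (total - events)


def pool_random_effects(studies: list[tuple[int, int]]) -> dict[str, float]:
    """Pool (events, total) pairs with a DerSimonian-Laird random-effects model."""
    if not studies:
        raise ValueError("at least one study is required")
    ys, vs = zip(*(logit_and_variance(e, n) for e, n in studies))

    # Fixed-effect (inverse-variance) estimate and Cochran's Q.
    w = [1 / v for v in vs]
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))
    df = len(ys) - 1

    # Between-study variance (tau^2) and inconsistency (I^2, in percent).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects estimate on the logit scale, back-transformed.
    w_re = [1 / (v + tau2) for v in vs]
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x: float) -> float:
        return 1 / (1 + math.exp(-x))

    return {
        "pooled": inv_logit(y_re),
        "ci_low": inv_logit(y_re - 1.96 * se),
        "ci_high": inv_logit(y_re + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Invented (positive IGRA, confirmed cases) counts, for illustration only.
result = pool_random_effects([(30, 50), (18, 40), (45, 60)])
print({key: round(value, 3) for key, value in result.items()})
```

When all studies report the same proportion, Q falls below its degrees of freedom, so both tau² and I² are truncated to zero and the random-effects estimate coincides with the fixed-effect one.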

Characteristics of the studies
Of the total of 677 citations identified, 38 were eligible for analysis (Figure 1). The studies included were conducted in 17 different countries, 19 (50%) in high-burden countries, 18 (47.4%) in low-burden tuberculosis countries, and one (2.6%) included participants from both settings. Thirty-seven were published in English and one in Spanish. Some industry involvement was reported in 11 studies (28.9%), mainly in the form of donation of IGRA kits to the researchers. Of the 20 studies used to calculate sensitivity/specificity, 13 (65%) corresponded to high-burden countries. A summary of the 38 studies included is provided in Table 1. Detailed information on the studies included in the review is available upon request.

Quality of the studies
Eight of the 20 studies (40%) used to calculate sensitivity and specificity met all the quality indicators, ten (50%) met between 75% and 100%, and two (10%) met less than 75%.
Indeterminate results due to high production of IFN-γ in the negative control were either not defined as such or excluded from the analysis in 32% of studies with QFT-GIT and in 28% with T-SPOT.TB. The results for the two types of indeterminate results were reported separately in 27% of studies with QFT-GIT and in 32% with T-SPOT.TB. Only three studies (16%) provided data on the T-SPOT.TB tests not performed because of insufficient quantities of cells.

Predictive value of IGRAs for incident active tuberculosis
Three longitudinal studies assessed incident active tuberculosis [22,43,52]; all of them were conducted in low-burden countries. In a prospective cohort study, 830 HIV-infected patients who underwent QFT-GIT testing were left untreated and were followed periodically [22]. Of 822 individuals without active tuberculosis at baseline, 36 had a positive QFT-GIT. After a median follow-up of 19 months, three (8.3%) patients with positive QFT-GIT developed tuberculosis, but none of the 705 patients with a negative QFT-GIT developed active disease. In another study with 201 HIV-seropositive individuals, two out of 20 infected patients with positive T-SPOT.TB who did not receive preventive treatment developed active tuberculosis during the first year [52]. In a third study assessing 135 HIV-infected individuals, none of the 103 patients who had a negative or indeterminate QFT-GIT result and negative TST at baseline developed tuberculosis after a median follow-up of 20 months [43] (Table 5).
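The cumulative incidences quoted above follow directly from the reported counts, as the short sketch below shows. The Wilson score interval is added only to illustrate the uncertainty around such small counts; it is an assumption of this sketch and not part of the original analysis, and the function names are hypothetical.

```python
import math


def cumulative_incidence(cases: int, at_risk: int) -> float:
    """Proportion of at-risk patients who developed active tuberculosis."""
    if at_risk <= 0 or not 0 <= cases <= at_risk:
        raise ValueError("invalid counts")
    return cases / at_risk


def wilson_interval(cases: int, at_risk: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (illustrative, not from the paper)."""
    p = cumulative_incidence(cases, at_risk)
    denominator = 1 + z * z / at_risk
    centre = (p + z * z / (2 * at_risk)) / denominator
    half_width = z * math.sqrt(
        p * (1 - p) / at_risk + z * z / (4 * at_risk * at_risk)
    ) / denominator
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Counts as reported in the text: (label, cases, patients at risk).
reported = [
    ("QFT-GIT positive [22]", 3, 36),
    ("T-SPOT.TB positive [52]", 2, 20),
    ("QFT-GIT negative [22]", 0, 705),
]

for label, cases, at_risk in reported:
    incidence = cumulative_incidence(cases, at_risk)
    low, high = wilson_interval(cases, at_risk)
    print(f"{label}: {100 * incidence:.1f}% "
          f"(95% interval {100 * low:.1f}% to {100 * high:.1f}%)")
```

An observed incidence of 0% among patients testing negative corresponds to a negative predictive value of 100% in that cohort, but the upper bound of the interval stays above zero, which is consistent with the call for confirmation in larger longitudinal studies.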

Discussion
This SR provides a comprehensive summary of the current evidence on the performance of the two commercial IFN-γ-based assays for the immunodiagnosis of tuberculosis and tuberculosis infection in HIV-infected adults. The main results can be summarized as follows. First, the sensitivity and specificity of either IGRA in HIV-infected people are suboptimal for use alone to rule in or rule out active tuberculosis disease. Second, the short- to medium-term risk of tuberculosis in HIV-infected adults with a negative QFT-GIT seems to be low. Third, indeterminate results of IGRAs were more frequent in HIV-infected patients with active tuberculosis from high-burden tuberculosis countries. Fourth, HIV-associated immunosuppression, measured by circulating CD4+ T-lymphocytes, negatively affects the performance of QFT-GIT and, to a lesser extent, T-SPOT.TB.
The sensitivity of IGRAs for culture-confirmed tuberculosis in the current SR was lower than that reported in three meta-analyses including predominantly immunocompetent people [8,9,11], and similar to that reported for HIV-infected patients in the three previous SRs [10,11,12]. Taken together, the results of the previous SRs and our own show that the sensitivity of QFT-GIT is roughly 65%, ranging between 61% reported by Cattamanchi et al. [10] in low-income countries and 68% reported by Chen et al. [12] in both high and low-income countries. For T-SPOT.TB, the sensitivity was close to 70%, ranging between 65% obtained in the current SR and 72% reported by Cattamanchi et al. [10] in low-income countries. These figures mean that, at best, IGRAs will miss one in three cases of active tuberculosis (Table 7).
HIV-associated immunosuppression, measured by circulating CD4+ T-cells, weakens the ability of IGRAs to detect tuberculosis infection. A previous SR [10] explored the impact of immunosuppression on the proportion of positive results according to a 200 CD4+ T-cell threshold, regardless of whether they had active tuberculosis or not. However, the value of the information provided by this approach is limited because the analysis included healthy people with unknown LTBI status. In the current SR, we tried to determine the impact of CD4+ T-cell counts on the sensitivity of IGRAs in HIV-infected patients with active tuberculosis disease, but the results were inconclusive. While one of the three studies with QFT-GIT [23,28,31] observed lower sensitivity with CD4+ below 200 cells/mm³ [23], another one found higher sensitivity with CD4+ below 200 cells/mm³ [31], and a third one did not find significant differences in CD4+ T-cell counts between patients with either positive or negative QFT-GIT [28]. As for T-SPOT.TB, while two of the three studies [31,34,37] found no change in sensitivity with CD4+ T-cell counts [34,37], the other one found higher sensitivity in patients with CD4+ below 200 cells/mm³ [31]. Since the decrease in sensitivity of IGRAs in HIV-infected patients is largely due to high rates of indeterminate results, the correct reporting of these results is essential for an accurate assessment of the sensitivity of the IGRA tests. Unfortunately, indeterminate results due either to a high-background production of interferon-γ (negative control) or to a failure test due to an insufficient number of PBMCs are often explicitly excluded or not reported. In fact, in the three studies that provided these data, a third of all invalid T-SPOT.TB results were due to failed T-SPOT.TB tests because of a lack of cells [36,37,47].
This may lead to an overestimation of the sensitivity of the T-SPOT.TB assay in HIV-infected patients, and challenges the commonly held assumption that performance of T-SPOT.TB is less affected (if at all) by CD4+ T-cell depletion than QFT-GIT.
It has been suggested that IGRAs are less affected than TST by HIV-associated immunosuppression. However, there is no consistent evidence that the IGRAs are more sensitive for detecting tuberculosis infection in patients with active disease. Data from the five studies reporting comparisons between QFT-GIT and TST yielded a pooled sensitivity of 67% and 60% respectively. In fact, in the largest study, which included more than 800 patients, TST was at least as sensitive as QFT-GIT [32]. As might be expected, the specificity of either IGRA for active tuberculosis disease was suboptimal for use as a rule-in test [9]. Although IGRAs use highly M. tuberculosis-specific antigens, since they do not distinguish between latent and active infection they cannot provide optimal specificity. Besides, they reflect the high prevalence of LTBI in the countries in which most of the studies were conducted [57]. Whether the specificity of IGRAs in low-burden tuberculosis settings is better is currently unclear. The present SR identified only three studies in low-burden settings, all from Italy: two with QFT-GIT [26,27] and one with T-SPOT.TB [38]. Specificity was 89% for QFT-GIT in both studies, and 64% for T-SPOT.TB (Table 7).
Table 6. Head-to-head comparison of the proportion of indeterminate results between QFT-GIT and T-SPOT.TB in HIV-infected patients.
Although culture-confirmed tuberculosis has been commonly used as a surrogate for tuberculosis infection, tuberculosis-associated immunodeficiency may impair the ability of IGRAs to detect the infection, particularly in HIV-infected patients. Therefore, their actual sensitivity for LTBI may be underestimated by extrapolating from patients with active disease [58]. Determining the capability of IGRAs to predict the risk of subsequent active tuberculosis is another way of evaluating the IGRAs' suitability for detecting LTBI. A comprehensive SR, including mainly studies with non-HIV-infected individuals, showed a marginal advantage of IGRAs over the TST for predicting incident active tuberculosis [59]. Two studies identified in the current SR, both conducted in low-burden tuberculosis countries, showed modest associations between a positive IGRA result and incident active tuberculosis in the short to medium term [22,52]. Conversely, a negative result of QFT-GIT had a high negative predictive value (100%) in two studies [22,43]. These data, if further confirmed in large, longitudinal and properly designed studies, would help to improve the targeting of at-risk patients by reducing the number of people considered for preventive treatment.
Indeterminate results, due either to low IFN-γ production in the positive control or to high IFN-γ production in the negative control, may negatively affect the overall utility of IGRAs. The proportion of indeterminate results in the current SR differed widely across studies, ranging from no indeterminate results at all to rates as high as 25% and 33% for QFT-GIT and T-SPOT.TB respectively [31,36]. These differences are related to host characteristics (CD4+ cell counts), type of evaluated people (patients with active tuberculosis vs. people evaluated for LTBI), and setting (high-burden and resource-limited vs. low-burden and high-income settings), but are also due to differences in the criteria used for reporting data. Indeterminate results due to high-background IFN-γ production, as well as failed T-SPOT.TB tests due to an insufficient number of PBMCs, are often not counted as such and are excluded from the analyses. Therefore, the calculation of indeterminate result rates and their association with potentially influencing factors will inevitably be compromised by these limitations. Interestingly, the types of indeterminate results were not equally distributed for the two assays. While low IFN-γ production upon stimulation with PHA (positive control) accounted for more than 90% of the indeterminate results with the QFT-GIT assay, half of the indeterminate T-SPOT.TB assays were due to high-background IFN-γ production (negative control).
The pooled indeterminate rates for the two assays were higher in high-burden settings than in low-burden settings. They were also higher in patients with symptoms suggestive of tuberculosis or culture-confirmed tuberculosis than in those screened for LTBI. Because studies that enrolled patients with active tuberculosis were mainly carried out in high-burden countries, whilst those that enrolled patients screened for LTBI were from low-burden countries, further analyses to determine which of the two factors has a greater influence on the occurrence of indeterminate results cannot be performed. On the one hand, HIV-infected patients usually have profound CD4+ T-lymphocyte depletion either as a cause or as a consequence of the disease, which may cause anergy and indeterminate IGRA results [58]. On the other hand, indeterminate results have been related to operational factors mainly linked to resource-limited settings, such as delayed incubation [60][61][62], and the location of the laboratory at which the samples are processed (according to data from Zambia; K. Shanaube, personal communication) (Table 8).
Our SR has limitations. First, the validity of the results is limited by the inconsistency across the studies. This heterogeneity persisted after performing subgroup analyses. Second, the main body of literature on active tuberculosis comes from high-burden tuberculosis and resource-limited settings, which limits the generalizability of our estimates. Conversely, studies for the prediction of subsequent development of active disease in HIV-infected patients were exclusively from low-burden countries. Therefore, the low risk of subsequent active tuberculosis for patients testing negative on QFT-GIT obtained in two low-burden countries in Europe cannot be extrapolated to countries with high burdens of tuberculosis such as sub-Saharan African countries. Finally, the lack of an adequate reference standard for latent tuberculosis infection is an inherent limitation of any SR that seeks to draw confident estimates of the capacity of IGRA tests to detect tuberculosis infection in people without evidence of active disease.
Nonetheless, some relevant conclusions may be drawn from this SR. First, the current evidence indicates that neither IGRA is able to replace conventional microbiological diagnosis of tuberculosis in HIV-infected patients. Second, if the low risk of subsequent active tuberculosis in HIV-infected patients testing negative is confirmed, QFT-GIT could replace TST for targeting at-risk patients for chemoprophylaxis in low-burden tuberculosis countries. Third, potential causes of invalid tests, such as delayed incubation and other operational factors, should be addressed in order to improve the performance of IGRAs, particularly in resource-limited high-burden tuberculosis countries with high HIV-coinfection prevalence.