Detection of Tuberculosis in HIV-Infected and -Uninfected African Adults Using Whole Blood RNA Expression Signatures: A Case-Control Study

Using a microarray-based approach, Michael Levin and colleagues develop a disease risk score to distinguish active from latent tuberculosis, as well as tuberculosis from other diseases, using whole blood samples. Please see later in the article for the Editors' Summary.


Introduction
There is an urgent need for improved tests to diagnose active tuberculosis (TB), particularly in countries of sub-Saharan Africa most affected by the TB/HIV pandemic. The diagnosis of TB was problematic even before the emergence of HIV, as symptoms and radiological features of TB overlap those of many other infectious and non-infectious conditions. However, in countries of sub-Saharan Africa, where HIV prevalence amongst individuals presenting with symptoms consistent with TB is over 50% [1], the diagnostic difficulty is increased, as TB must be distinguished from a wide range of opportunistic infections and HIV-associated malignancies that present clinically with similar symptoms and signs.
For over a century, diagnosis of TB has relied on clinical and radiological features, sputum microscopy (with or without culture), and tuberculin skin testing (TST). All of these have major drawbacks, particularly in HIV co-infected individuals [2,3], in whom radiological features are often atypical [4], cavitary lung disease is less common [5,6], and results of sputum microscopy are often negative [2,7]. Furthermore, culture facilities are largely unavailable in many African hospitals [8]. As TST and interferon gamma release assays (IGRAs) cannot discriminate TB from latent TB infection (LTBI) [9], they are of limited diagnostic utility amongst African adults where LTBI is highly prevalent in the general population [10], and amongst inpatients with other diagnoses. Molecular methods have improved detection of Mycobacterium tuberculosis (M.TB) DNA in sputum [11], but the sensitivity of this approach is lower in smear negative sputum samples even if culture positive [12]. Consequently, high proportions of patients with TB in sub-Saharan Africa remain undiagnosed or are treated empirically without laboratory confirmation. The need for improved diagnostic methods is highlighted by post mortem studies showing TB to be a frequent undiagnosed cause of death in Africa [13][14][15].
RNA expression analysis by microarray has emerged as a powerful tool for understanding disease biology [16]. Many diseases including cancer [17] and infectious diseases [18], as well as TB [19][20][21][22][23][24][25][26], are associated with specific transcriptional profiles in blood or tissue. Although previous studies in TB have suggested that RNA expression might be used diagnostically to distinguish TB from other conditions, these studies have excluded HIV-infected participants, and have compared TB with other diseases (OD) that are not representative of the spectrum seen in HIV-infected and -uninfected patients presenting to African hospitals with symptoms for which TB is included in the differential diagnosis [19][20][21][22][23][24][25][26]. There is thus a need to identify biomarkers that discriminate TB from OD prevalent in African populations, where the burden of the HIV/TB pandemic is greatest.
In this two-country prospective case-control study, we investigated the hypothesis that host peripheral blood RNA expression would distinguish TB from other conditions prevalent in African populations in the context of endemic HIV infection, and explored the use of a transcriptional signature as the basis for a diagnostic test.

Ethics Statement
The study was approved by the Human Research Ethics Committee of the University of Cape Town, South Africa (HREC012/2007), the National Health Sciences Research Committee, Malawi (NHSRC/447), and the Ethics Committee of the London School of Hygiene and Tropical Medicine (5212). Written information was provided by trained local health workers in local languages and all patients provided written consent.

Study Sites
In order to enable generalization of our findings to African countries with differing prevalence of malaria and other parasitic infections, as well as other environmental exposures that might affect transcriptional profiles, we chose highly contrasting study sites (one urban, one rural) in two African countries with differing co-endemic diseases.
Cape Town, South Africa. South Africa has one of the highest TB incidence rates in Africa (981 per 100,000) [27], as well as high rates of HIV infection (up to 41.8% prevalence in females aged 25-35) [28]. Patients undergoing investigation for suspected TB were recruited at GF Jooste Hospital Manenberg, Groote Schuur Hospital, and at Khayelitsha site B clinics serving the largely Xhosa population residing in the low income townships of Cape Town. Malaria is not endemic in these urban populations.
Karonga district, Northern Malawi. The incidence of new TB cases in Karonga district (180 per 100,000, Karonga Prevention Study unpublished data, 2012) and the stable HIV prevalence (10%-15% of females aged 25-29, Karonga Prevention Study unpublished data, 2012) are lower than in Cape Town. Malaria and helminth infection are hyperendemic. Patients were recruited at Karonga District hospital, which serves a rural population living by the shores of Lake Malawi.

Diagnostic Process
To ensure accurate assignment of patients to definite TB and OD groups, a rigorous diagnostic process was followed. All patients underwent chest radiographs and serological testing for HIV, along with cultures of blood, CSF, and urine, and biopsies for histological examination including TB culture where clinically indicated. Two sputum samples obtained after induction or coughing were examined by standard microscopy for acid fast bacilli (AFB) and cultured for TB using standard methods (i.e., on solid media [South Africa and Malawi] and on liquid media [South Africa only]) [29]. Patients were followed up 26 wk post diagnosis to confirm that those with OD remained TB-free. Individuals were either assigned to one of the diagnostic groups or excluded once the results of investigations and follow-up were available. Healthy LTBI controls were recruited by random community selection (Malawi) and from HIV screening clinics (South Africa) from the same catchment areas as patients with TB (Figure 1). In vitro IGRA to substantiate LTBI was undertaken using an in-house whole blood assay [30,31]. OD patients were recruited if they presented with symptoms that would mandate investigation for TB as a differential diagnosis. After intensive investigation, any case with an established alternative diagnosis to TB, no microbiological evidence of TB, and either an absence of TB symptoms at the time of follow-up or an observed improvement of clinical symptoms on follow-up without TB treatment, was recruited as an OD case. If TB could not be reliably ruled out of the differential, the patient was excluded.

Patient Cohorts
Patient recruitment strategies, which differed at each site, were embedded within health services administered by statutory providers in order to best investigate on an ''intention to test'' basis.
Cape Town, South Africa. Recruitment in Cape Town commenced on 12th October 2007 and concluded on 5th January 2010. Subject to research staff availability, 96 sequential patients presenting with at least one positive TB culture result were recruited from an outpatient TB clinic in Khayelitsha site B until 49 HIV-infected and 47 HIV-uninfected persons had been enrolled (Figure 2). In Cape Town, 36.7% (18/49) of HIV-infected patients with TB and 8.5% (4/47) of HIV-uninfected patients with TB were smear-negative. Patients in the OD category were recruited at GF Jooste and Groote Schuur hospitals in Cape Town. Patients were assessed by a hospital clinician and enrolled in the study if TB was considered in the differential diagnosis. After intensive investigation as described above, patients were assigned to the OD group if (1) an alternative diagnosis was established; (2) no microbiological evidence of TB was found after culture of sputum or other samples; and (3) an improvement of clinical symptoms was observed on follow-up without TB treatment (Figure 1). If a patient recruited to the OD group was later found to be culture positive for M.TB, they were reclassified appropriately. In total 138 HIV-infected and 80 HIV-uninfected patients were recruited in the OD group, of whom 70 HIV-infected and 31 HIV-uninfected were excluded because TB could not be ruled out (i.e., TB uncertain) (Figure 2).
Karonga district, Northern Malawi. Recruitment at Karonga District Hospital commenced on 1st June 2007 and ceased on 30th November 2009. Patients attending the hospital were assessed by a local clinician. If this clinician considered TB to be within the differential diagnosis, patients were recruited by a study staff member and investigated according to clinical and study protocols as described above. Following the completion of inpatient care, patients were followed up for at least 26 wk post discharge to assess their progress, including a verbal autopsy if the patient had died. Individuals were categorized following the completion of follow-up. Patients were assigned to the OD group if (1) a firm alternative diagnosis was established; (2) there was no microbiological evidence of TB; and (3) there was absence of symptoms of TB at the time of follow-up or assignation of an alternative cause of death on verbal autopsy (Figure 1). Individuals who did not have confirmed TB and did not fulfill the criteria for OD (e.g., those who failed to attend follow-up and had an unknown 6-mo outcome) were categorized as ''TB uncertain''. During the recruitment period 437 patients were recruited. Of these, 254 had definite TB, 77 had a confirmed OD, and 106 were categorized as TB uncertain (TB could not be excluded). The first 60 HIV-infected and 59 HIV-uninfected patients with TB, along with all the OD patients, were included in the RNA expression study (Figure 2). In Malawi, 13.3% (8/60) of HIV-infected patients with TB and 10.2% (6/59) of HIV-uninfected patients with TB were smear-negative.

Oversight and Conduct of the Study
Patients were recruited by FZ and a team of research assistants in Karonga, Malawi, and by TO and hospital staff in Cape Town, South Africa. Assignment of patients to clinical groups was made by consensus of two experienced clinicians at each site (independent of those managing the patient clinically) after review of the investigation results. Testing for HIV status was conducted after appropriate counseling. Clinical data were anonymised and patient samples identified only by study number. Statistical analysis was conducted only after the RNA expression data and the clinical databases had been locked and deposited for independent verification.

Peripheral Blood RNA Expression by Microarray
Whole blood was collected at the time of recruitment (before or within 24 h of commencing TB treatment in suspected patients) in PAXgene blood RNA tubes (PreAnalytiX), frozen within 3 h of collection, and later extracted using PAXgene blood RNA kits (PreAnalytiX). RNA was shipped frozen to the Genome Institute of Singapore for analysis on HumanHT-12 v.4 expression Beadarrays (Illumina). Additional details of the microarray method, quality control, and analysis are provided in Text S1.

Statistical Analysis
Expression data were analysed using 'R' Language and Environment for Statistical Computing (R) 2.12.1 (Text S1). To identify transcript signatures applicable across geographic locations and in patients with differing HIV status, we combined HIV-infected and -uninfected patient cohorts from South Africa and Malawi. The recruited participants were randomly assigned to a training cohort (80% of the participants) and a test cohort (20%) with no overlap, using the ''sample( )'' function without replacement in 'R', which obtains a subset of a given set [32]. For additional validation we used the whole blood expression dataset from Berry et al. [25] comparing TB with LTBI and other infectious diseases in an African case-control study (accession GSE19491) (i.e., the ''validation'' dataset) (Text S1).
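The random 80%/20% partition described above can be illustrated with a minimal sketch. It is written in Python rather than R, and the function name, the sample identifiers, and the fixed seed are hypothetical; whether the original split was stratified by site, HIV status, or diagnostic group is not stated here, so the sketch performs a simple unstratified draw without replacement.

```python
import random

def split_train_test(sample_ids, train_fraction=0.8, seed=0):
    """Randomly partition sample IDs into disjoint training and test sets."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    # random.sample draws without replacement, so no ID can appear twice
    shuffled = rng.sample(list(sample_ids), k=len(sample_ids))
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical sample identifiers, for illustration only
ids = [f"S{i:03d}" for i in range(1, 11)]
train_ids, test_ids = split_train_test(ids)
```

Because every identifier is drawn exactly once, the training and test cohorts cannot overlap, which is the property the analysis relies on.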

A Simplified Method for Identifying Individual Patient's Risk of Active TB
Current whole genome array-based technologies are not well suited for use in resource poor settings as they are costly and require sophisticated technology as well as bioinformatics expertise. We therefore developed a method for translation of multiple transcript RNA signatures into a disease risk score (DRS), which could form the basis of a simple, low cost, diagnostic test requiring basic laboratory facilities and minimal bioinformatics analysis. For each individual, we calculated (on normalized intensities) the DRS using the minimal transcript selected sets for TB versus LTBI and TB versus OD. The score is derived by adding the total intensity at up-regulated transcripts, and subtracting the total intensity at all down-regulated transcripts (Text S1). The threshold for the classification was calculated as the weighted average of risk score within each class (group of patients), with weights given as the inverse of the standard deviation of the score within each class (Text S1). The information that the DRS requires for classification (i.e., the expression values of the transcripts of the signatures) can be derived from the dataset itself, which allows its unbiased application using expression data acquired using other array platforms or non-array technologies. The sensitivity and specificity of the score in disease classification were evaluated on the test cohort and validation dataset.
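As an illustration of the score and threshold described above, the following minimal Python sketch computes a DRS for one sample and a threshold from two groups of scores. It reads "weighted average of risk score within each class" as an average of the two class means weighted by the inverse of each class's standard deviation; that reading, the transcript names, and all numerical values are assumptions for illustration, and the authoritative definition is in Text S1.

```python
from statistics import mean, stdev

def disease_risk_score(expr, up, down):
    """Sum of intensities at up-regulated transcripts minus sum at down-regulated ones."""
    return sum(expr[t] for t in up) - sum(expr[t] for t in down)

def drs_threshold(scores_a, scores_b):
    """Average of the two class means, each weighted by 1 / (class standard deviation)."""
    w_a = 1.0 / stdev(scores_a)
    w_b = 1.0 / stdev(scores_b)
    return (w_a * mean(scores_a) + w_b * mean(scores_b)) / (w_a + w_b)

def classify(score, threshold):
    """Assumes TB samples score higher, since TB up-regulated transcripts are added."""
    return "TB" if score > threshold else "non-TB"

# Toy normalized intensities for hypothetical transcripts
expr = {"UP1": 8.0, "UP2": 7.5, "DOWN1": 6.0}
up, down = ["UP1", "UP2"], ["DOWN1"]
score = disease_risk_score(expr, up, down)

# Toy training scores for the two classes
tb_scores = [10.0, 12.0, 14.0]
ltbi_scores = [3.0, 4.0, 5.0]
threshold = drs_threshold(tb_scores, ltbi_scores)
label = classify(score, threshold)
```

With inverse standard deviation weights, the threshold sits closer to the mean of the class whose scores are less dispersed, which is why the more variable class is given more room on its side of the cut-off.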

Accession Numbers
The data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus and are accessible through GEO Series accession number GSE37250 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE37250).

Results
We recruited 311 adults to the South African cohort and 273 to the Malawi cohort meeting the definitions for TB or OD, after screening a total of 314 patients in South Africa and 437 patients in Malawi (Figures 1 and 2; Table 1). After including samples from LTBI controls that were recruited separately (98 and 77 patients in South Africa and Malawi, respectively) and removing technical failures (48 samples), 536 consecutive patient samples remained for microarray analysis (Figure 2). The spectrum of infectious and malignant diseases in the OD cohorts reflected the range of conditions with similar clinical manifestations to TB at each site (Table 2).

TB Specific RNA Signature That Is Independent of Geographic Location and HIV Status
We performed quality control on the microarray data in order to examine the effect of disease state on transcript expression and to check for assignment errors. Inspection revealed that the primary clustering was based on disease state (TB, LTBI, OD) rather than geographical location or HIV status (Figure S1). There was substantial correlation of TB versus LTBI differential expression across different geographic locations and HIV status, which was also seen for TB versus OD (Figures S2 and S3). This indicates the presence of a robust underlying signature of TB, independent of HIV status or geographical location.

Identification and Validation of Minimal Transcript Sets
To find minimal transcript sets required to discriminate TB from other groups, we applied the variable selection algorithm elastic net [34] to the training cohort (Methods; Text S1). A 27 transcript model was identified for discriminating TB from LTBI in the South Africa/Malawi training and test set (Figure 3A and 3B; Table S1), whilst a 44 transcript model was identified for discriminating TB from OD (Figure 3C and 3D; Table S2). These models were also applied to data from the South Africa validation dataset [25], which, unlike our cohort, included only HIV-uninfected participants (Figure S4).

Evaluation of a Simplified Disease Risk Score for TB
To evaluate the feasibility of using a simplified diagnostic test based on our transcript sets for TB diagnosis in low resource settings, we applied the DRS to our test cohort, which includes patients that were not used to discover the signatures, and to the South Africa validation dataset [25]. In our combined HIV-infected and -uninfected test set, the 27 transcript DRS discriminated TB from LTBI with sensitivity and specificity of 95%, 95% CI (87-100), and 90%, 95% CI (80-97), respectively, whilst achieving perfect classification in the HIV-uninfected cohorts and a slightly reduced accuracy in the HIV-infected cohorts (Figures 4A, 5A, and 5B; Table 3). In the validation dataset, the DRS achieved a sensitivity of 95%, 95% CI (85-100), and a specificity of 94%, 95% CI (84-100) (Figure 4B; Table 3). For the discrimination between TB and OD, the 44 transcript DRS's sensitivity and specificity were 93%, 95% CI (83-100), and 88%, 95% CI (74-97), respectively, with consistent accuracy in the HIV-infected and -uninfected test cohorts (Figures 4C, 5C, and 5D; Table 3). In the validation dataset, the patients were classified with 100% sensitivity, 95% CI (100-100), and 96% specificity, 95% CI (93-100) (Figure 4D; Table 3). Similar values for sensitivity and specificity were obtained when the DRS was evaluated in the training dataset, indicating that the approach did not overfit the training data (Table S5). In order to evaluate the classificatory power of the DRS, we compared its performance with the regression model derived from the elastic net based on the same signatures (Table S5). We found that our DRS had similar accuracy to the weighted regression model in distinguishing TB from LTBI and OD.
In order to assess the predictive value of our DRS in a cohort of patients undergoing investigation for persistent symptoms such as cough, fever, and weight loss, i.e., where TB was included in the differential diagnosis, we used the prevalence of TB in our prospective Malawi cohort (58%; 254 confirmed TB cases of 437 patients with suspected TB) to calculate the positive and negative predictive value (PPV/NPV). The DRS for TB versus OD had a PPV of 92%, 95% CI (84-99), and a NPV of 90%, 95% CI (80-100) (Table S7). Using a 20% prevalence, which may be more reflective of a general primary care setting in a high-burden African country, NPV for TB versus OD is higher (98%, 95% CI [96-100]), but PPV decreases (66%, 95% CI [46-87]), emphasizing the value of DRS as a rule-out test, with those patients with positive DRS selected for further investigation (Table S7).
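The dependence of PPV and NPV on prevalence follows from Bayes' rule and can be sketched in a few lines of Python. The sketch uses the rounded point estimates quoted in the text for the TB versus OD score (sensitivity 93%, specificity 88%); because those inputs are rounded, the outputs may differ by about a percentage point from the values in Table S7, and the confidence intervals are not reproduced. The function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Rounded point estimates for the TB versus OD score, taken from the text
ppv_58, npv_58 = predictive_values(0.93, 0.88, 0.58)  # Malawi cohort prevalence
ppv_20, npv_20 = predictive_values(0.93, 0.88, 0.20)  # assumed primary care prevalence
```

At 20% prevalence this gives a PPV of roughly 66% and an NPV of roughly 98%, which shows why the same score behaves more like a rule-out test as prevalence falls.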
We also explored the effect of adjusting the threshold for the DRS in assigning individual patients to TB or LTBI/OD. By accepting a percentage of patients as ''non-classifiable,'' the majority of patients under investigation are accurately assigned. These ''non-classifiable'' patients could then be selected for more detailed investigation ( Figure S5).
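One way to picture a ''non-classifiable'' category is as a band around the DRS threshold within which no call is made. The Python sketch below assumes a symmetric margin on either side of the threshold; this rule, the function name, and the numbers are illustrative assumptions only, and the procedure actually used is the one shown in Figure S5.

```python
def classify_with_band(score, threshold, margin):
    """Three-way call: scores within `margin` of the threshold are left unclassified."""
    if score > threshold + margin:
        return "TB"
    if score < threshold - margin:
        return "non-TB"
    return "non-classifiable"

# Toy scores against a hypothetical threshold of 6.0 with a margin of 1.0
calls = [classify_with_band(s, 6.0, 1.0) for s in (10.0, 6.5, 4.0)]
```

Widening the margin trades coverage for accuracy: more patients fall into the band and are referred for further investigation, while the calls made outside it are based on scores further from the cut-off.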
As it would be advantageous to have a single signature that distinguished TB from non-TB, we assessed the performance of a signature in distinguishing TB from both LTBI and OD. A 53 transcript signature was identified (Table S3) that distinguished TB from both LTBI and OD with sensitivity/specificity of 91%/82%, a lower performance than that of the separate TB/LTBI and TB/OD signatures. We also explored whether a smaller number of transcripts could be used to distinguish TB from LTBI and from OD, which would aid in manufacturing of a test (Text S1), resulting in 21 and 29 transcript signatures for distinguishing TB from LTBI and OD, respectively. The sensitivity of the smaller models was 6%-10% lower than the original models, while retaining the same specificity for TB versus OD (Table S8).
In contrast to our approach, previous studies of RNA expression as a diagnostic tool for TB have excluded HIV-infected patients, and have used other disease controls that were not recruited concurrently with TB cases or from the same population of patients undergoing investigation for TB [19,21,22,24,25]. To establish how these differences in biomarker study design might affect performance of biomarker signatures, we compared the performance of our 27 transcript TB/LTBI signature and our 44 transcript TB/OD signature with the performance of the signatures of Berry et al. [25] for discrimination of TB versus LTBI (393 transcripts) and TB versus OD (86 transcripts). While the 393 transcript TB/LTBI signature achieved a sensitivity of 88%, 95% CI (80-94), and a specificity of 84%, 95% CI (76-92), on our TB HIV-uninfected cohorts, the performance on the HIV-infected group was 74%, 95% CI (65-82), and 80%, 95% CI (71-87), respectively (Figure 6; Table 4). Furthermore, the Berry et al. TB/OD 86 transcript signature had a lower performance on our cohorts (sensitivity 71%, 95% CI (62-80), specificity 76%, 95% CI (67-84), in HIV-uninfected; sensitivity 67%, 95% CI (58-75), specificity 69%, 95% CI (59-78), in HIV-infected) (Figure 6; Table 4). Thus our minimal transcript signatures and the DRS method show better performance in distinguishing TB from LTBI and OD (especially in the HIV-infected cohorts) than the much larger number of transcripts identified by Berry et al. [25]. (Table 5)
Finally, we evaluated the performance of our signatures in the smear-negative sub-group of patients with TB, the majority of whom were HIV-infected (31 smear-negative TB patients with definite negative smear status; seven TB HIV-uninfected and 24 TB HIV-infected). In the smear-negative patients the DRS showed a sensitivity for detecting TB of 68%, 95% CI (52-84), when using [...] (Table 4) (Table S9).

Discussion
We have identified a host blood transcriptomic signature that distinguishes TB from a wide range of OD prevalent in HIV-infected and -uninfected African patients. We found that patients with TB can be distinguished from LTBI with only 27 transcripts and from OD with 44 transcripts. Our findings appear robust as the results are reproducible in both HIV-infected and -uninfected cohorts, in different geographic locations, and in an independent TB patient dataset. The high sensitivity and specificity of the signatures in distinguishing TB from OD, even in the HIV-infected patients that have differing levels of T cell depletion and a wide spectrum of opportunistic infections as well as HIV-related complications, suggest that the signatures are promising biomarkers of TB. The relatively small number of transcripts in our signatures may increase the potential for using transcriptional profiling as a clinical diagnostic tool from a single peripheral blood sample (e.g., using a multiplex assay [35,36]).
The major challenge for diagnosis of TB in Africa is how to distinguish this disease from the range of other conditions that show similar symptoms in countries where TB and HIV are coendemic. Previous TB biomarker studies have focused on distinguishing patients with TB from healthy controls, or from LTBI [21,22,24], or have used other disease controls that may not represent the ''real world'' disease spectra from which TB should be clinically differentiated [19,25]. Furthermore, these TB biomarker studies have also excluded HIV co-infected patients, who are the group that most need new diagnostics. Our study design should ensure that our signatures are applicable in TB/HIV endemic countries, as we recruited patients with TB concurrently with patients with a range of conditions that present with similar clinical features to TB, as well as recruiting both HIV-infected and -uninfected individuals.
We have identified separate signatures for distinguishing TB/OD and TB/LTBI, which only overlap in three transcripts. In practice the clinical applications of these signatures might be distinct, as the TB/LTBI signature would be of value in contact screening, where the concern is distinguishing active disease from previous exposure in minimally symptomatic individuals. The TB/OD signature would be of most value in evaluating symptomatic patients presenting to medical services with symptoms of TB. We have also explored whether a single signature might be used to distinguish TB from both LTBI and OD. The combined signature showed lower performance than the separate TB/LTBI and TB/OD signatures. Further exploration of the operational performance of a combined signature or separate signatures is needed to establish the best strategy.
Although our signatures and DRS distinguished the majority of patients with TB from those with LTBI or OD, a proportion of patients were not correctly classified. There is increasing recognition that TB and LTBI may represent a dynamically evolving continuum, particularly in HIV-infected patients, and thus failure to culture M.TB is not absolute proof that TB is not present. Some false assignment by our current ''gold standard'' is to be expected, as noted by post mortem studies in which undiagnosed TB is confirmed [14,15]. All patients in the OD group presented with symptoms for which TB was included in the differential diagnosis, and it is possible that TB may have been missed in a small proportion of OD patients despite the extensive clinical investigation used to assign each patient to each diagnostic group. Some improvement in sensitivity and specificity of our DRS may also be achieved by weighting the signal from the most discriminatory transcripts, and this could be explored in subsequent refinements of the method.
A major concern in using transcriptional signatures as a clinical diagnostic tool in resource poor settings is the complexity, as well as cost, of the current methodologies. Our results have shown that transcriptional signatures can be used to distinguish TB from OD in an African setting. We explored the feasibility of a simplified method for disease categorization that may facilitate development of a diagnostic test based on our signatures. Our DRS provides a new approach that enables the use of multi-transcript signatures for individual disease risk assignment without the requirement for complex analysis. Our method could be used to develop a simple test in which the transcripts comprising the diagnostic signature (separated into those that are either up- or down-regulated in TB relative to controls) are each measured using a suitable detection system [35], and the combined signature used to identify each patient's risk of TB. For example, in a simple test using the TB/OD signature, probes for transcripts that show increased expression in TB relative to OD could be located in a single well or tube, and probes for transcripts that show reduced expression in TB located in a second well or tube. Binding of RNA from a patient's blood to these probes could be detected as a combined signal from each tube using a suitable detection system. To allow normalization, expression of up- or down-regulated transcripts in an individual patient could be compared with that of housekeeping genes, which do not show variation between healthy and disease states. There are methods for rapid detection of multi-transcript signatures including lateral flow reverse transcription (RT)-PCR based systems, nano-pore technology [37], nano-particle enzyme linked detection [38,39], and detection using nano-wires and electrical impedance [40]. Some of these may be suitable for direct analysis of multiple transcript signatures in blood and at a relatively low cost.
While this study provides a proof of principle that relatively small numbers of RNA transcripts can be used to discriminate active TB from latent TB infection and OD in Africa, limitations remain that need to be addressed in order to translate these results into a clinical test. One such limitation is that our study has not assessed performance of our DRS in patients treated for TB solely on the basis of clinical suspicion, without any microbiological confirmation. Amongst these ''probable/possible'' patients with TB, there is no gold standard to evaluate any new biomarker. Exclusion of probable/possible patients with TB may have produced better estimates of sensitivity and specificity than would be achieved in a prospective ''all comers'' study including the entire cohort of patients in whom TB is included in the differential diagnosis. Thus, further evaluation using a prospective population based study in which the decision whether and when to initiate TB treatment is evaluated against the new biomarker is required. Future studies will also be required to refine the use of these biomarkers in a clinical decision process either as an initial screening tool, or in conjunction with more detailed culture based diagnostics.
From a clinical perspective a simple transcriptome-based test that reliably diagnoses or excludes TB in the majority of patients undergoing investigation for suspected TB, using a single blood sample, would be of great value, allowing scarce hospital resources to be focused on the small proportion of patients where the result was indeterminate. The challenge for the academic research community and for industry is to develop innovative methods to translate multi-transcript signatures into simple, cheap tests for TB suitable for use in African health facilities.

Editors' Summary
Background. Tuberculosis (TB), caused by Mycobacterium tuberculosis, is curable and preventable, but according to the World Health Organization (WHO), in 2011, 8.7 million people had symptoms of TB (usually a productive cough and fever) and 1.4 million people (95% from low- and middle-income countries) died from this infection. Worldwide, TB is also the leading cause of death in people with HIV. For over a century, diagnosis of TB has relied on clinical and radiological features, sputum microscopy, and tuberculin skin testing, but all of these tests have major disadvantages, especially in people who are also infected with HIV (HIV/TB co-infection), in whom results are often atypical or falsely negative. Furthermore, current tests cannot distinguish between inactive (latent) and active TB infection. Therefore, there is a need to identify biomarkers that can differentiate TB from other diseases common to African populations, where the burden of the HIV/TB pandemic is greatest.

Why Was This Study Done? Previous studies have suggested that TB may be associated with specific transcriptional profiles (identified by microarray analysis) in the blood of the infected patient (host), which might make it possible to differentiate TB from other conditions. However, these studies have not included people co-infected with HIV and have included in the differential diagnosis diseases that are unrepresentative of the range of conditions common to African patients. In this study of patients from Malawi and South Africa, the researchers investigated whether blood RNA expression could distinguish TB from other conditions prevalent in African populations and form the basis of a diagnostic test for TB (through a process using transcription signatures).
What Did the Researchers Do and Find? The researchers recruited patients with suspected TB attending one clinic in Cape Town, South Africa between 2007 and 2010 and in one hospital in Karonga district, Malawi between 2007 and 2009 (the training and test cohorts). Each patient underwent a series of tests for TB (and had a blood test for HIV) and was diagnosed as having TB if there was microbiological evidence confirming the presence of Mycobacterium tuberculosis. At recruitment, each patient also had blood taken for microarray analysis and following this assessment, the researchers selected minimal transcript sets that distinguished TB from latent TB infection and TB from other diseases, even in HIV-infected individuals. In order to help form the basis of a simple, low cost, diagnostic test, the researchers then developed a statistical method for the translation of multiple transcript RNA signatures into a disease risk score, which the researchers then checked using a separate cohort of South African patients (the independent validation cohort). Using these methods, after screening 437 patients in Malawi and 314 in South Africa, the researchers recruited 273 patients to the Malawi cohort and 311 adults to the South African cohort (the training and test cohorts). Following technical failures, 536 microarray samples were available for analysis. The researchers identified a set of 27 transcripts that could distinguish between TB and latent TB and a set of 44 transcripts that could distinguish TB from other diseases. These multi-transcript signatures were then used to calculate a single value disease risk score for every patient. 
In the test cohorts, the disease risk score had a high sensitivity (95%) and specificity (90%) for distinguishing TB from latent TB infection (sensitivity is a measure of true positives, correctly identified as such and specificity is a measure of true negatives, correctly identified as such) and for distinguishing TB from other diseases (sensitivity 93% and specificity 88%). In the independent validation cohort, the researchers found that patients with TB could be distinguished from patients with latent TB infection (sensitivity 95% and specificity 94%) and also from patients with other diseases (sensitivity 100% and specificity 96%).
What Do These Findings Mean? These findings suggest that a distinctive set of RNA transcriptional signatures forming a disease risk score might provide the basis of a diagnostic test that can distinguish active TB from latent TB infection (27 transcripts) and also from other diseases (44 transcripts), such as pneumonia, that are prevalent in African populations. There is a concern that using transcriptional signatures as a clinical diagnostic tool in resource poor settings might not be feasible because they are complex and costly. The relatively small number of transcripts in the signatures described here may increase the potential for using this approach (transcriptional profiling) as a clinical diagnostic tool using a single blood test. In order to make most use of these findings, there is an urgent need for the academic research community and for industry to develop innovative methods to translate multi-transcript signatures into simple, cheap tests for TB suitable for use in African health facilities.