Gene expression profiles classifying clinical stages of tuberculosis and monitoring treatment responses in Ethiopian HIV-negative and HIV-positive cohorts

Background Validation of previously identified candidate biomarkers and identification of additional candidate gene expression profiles to facilitate diagnosis of tuberculosis (TB) disease and monitoring treatment responses in the Ethiopian context is vital for improving TB control in the future.
Methods Expression levels of 105 immune-related genes were determined in the blood of 80 HIV-negative study participants composed of 40 active TB cases, 20 latent TB infected individuals with positive tuberculin skin test (TST+), and 20 healthy controls with no Mycobacterium tuberculosis (Mtb) infection (TST-), using focused gene expression profiling by dual-color Reverse-Transcription Multiplex Ligation-dependent Probe Amplification assay. Gene expression levels were also measured six months after anti-TB treatment (ATT) and follow-up in 38 TB patients.
Results The expression of 15 host genes in TB patients could accurately discriminate between TB cases versus both TST+ and TST- controls at baseline and thus holds promise as biomarker signature to classify active TB disease versus latent TB infection in an Ethiopian setting. Interestingly, the expression levels of most genes that markedly discriminated between TB cases versus TST+ or TST- controls did not normalize following completion of ATT therapy at 6 months (except for PTPRCv1, FCGR1A, GZMB, CASP8 and GNLY) but had only fully normalized at the 18 months follow-up time point. Of note, network analysis comparing TB-associated host genes identified in the current HIV-negative TB cohort to TB-associated genes identified in our previously published Ethiopian HIV-positive TB cohort, revealed an over-representation of pattern recognition receptors including TLR2 and TLR4 in the HIV-positive cohort which was not seen in the HIV-negative cohort. Moreover, using ROC cutoff ≥ 0.80, FCGR1A was the only marker with classifying potential between TB infection and TB disease regardless of HIV status.
Conclusions Our data indicate that complex gene expression signatures are required to measure blood transcriptomic responses during and after successful ATT to fully diagnose TB disease and characterise drug-induced relapse-free cure, combining genes which resolve completely during the 6-month treatment phase of therapy with genes that only fully return to normal levels during the post-treatment resolution phase.


PLOS ONE | https://doi.org/10.1371/journal.pone.0226137 | December 10, 2019

Background
Tuberculosis (TB) is a leading cause of death [1] and 25% of the 10.0 million incident TB disease cases globally were reported in Africa during 2017 [2]. WHO recommends developing effective diagnostic tests and treatments for latent TB infection (LTBI) to achieve a 90% and 80% reduction of the incidence and death rate from Mycobacterium tuberculosis (Mtb) respectively by 2030 [3]. The currently available diagnostic tools (smear microscopy, solid and liquid sputum culture, GeneXpert) have several limitations in detecting latent and active TB [4,5,6,7] and in monitoring TB treatment response [8], and those limitations greatly contribute to the spread of TB disease. Because existing immunological methods to diagnose TB infection, such as the tuberculin skin test (TST) and Interferon-γ release assays (IGRAs), are not able to distinguish between LTBI and active TB disease [9], it has been suggested that the identification of biomarkers that can classify clinical stages of TB and monitor TB treatment responses is essential and cost-effective for improving clinical practice [10]. Changes in gene expression in peripheral blood due to the interaction between the host immune response and Mtb could potentially be used as biomarkers to classify the different clinical outcomes of TB exposure and to monitor TB treatment response. Previous studies have shown that various stages of Mtb infection can be distinguished using gene expression profiling in peripheral blood for the diagnosis of TB disease and monitoring TB treatment [11,12,13,14,15,16,17,18] in cohorts from Europe, North and South America, Asia and Africa (South Africa, Malawi and Gambia). For instance, Wu and colleagues [15] identified 10 genes whose expression differentiated patients with active TB disease from LTBI individuals in a North American cohort.
Kaforou and colleagues [16] identified and validated a 44-gene signature that distinguished active tuberculosis from other diseases in different African cohorts, while Warsinske and colleagues [17] identified a 3-gene messenger RNA expression score that distinguished individuals who progressed to TB from non-progressors, TB cases from non-TB cases, and individuals with slower treatment response during TB therapy in Brazil and South Africa. However, those host markers may not be applicable in another population, because various studies have [...] forearm. TST positivity was classified as skin induration diameter ≥10 mm in HIV-uninfected individuals [22].

RNA extraction
RNA was extracted from 2.5 ml blood collected in PAXgene tubes (PreAnalytiX, Qiagen, Germany) using the PAXgene RNA extraction kit (PreAnalytiX, Qiagen) according to the manufacturer's instructions. Briefly, PAXgene tubes were centrifuged at 4000 rpm for 10 minutes and the pellet was lysed and resuspended in Resuspension Buffer (Buffer BR1), followed by treatment with proteinase K to remove contaminating proteins. Ethanol-precipitated nucleic acids were loaded onto a spin column followed by on-column DNA digestion using RNase-free DNase (Qiagen). Finally, purified RNA was eluted with RNase-free buffer (BR5 buffer) and quantified using a NanoDrop 2000 Spectrophotometer (Thermo Fisher Scientific, Wilmington, USA). RNA samples with 260/280 nm absorbance ratios below 1.70 or above 2.3 were excluded from further analyses.
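The purity criterion above can be expressed as a simple filter. The following is a minimal sketch, not the procedure actually used in the study: the `RnaSample` record, the sample identifiers and the absorbance values are hypothetical, and the bounds are assumed to be inclusive because the text excludes only ratios below 1.70 or above 2.3.

```python
from dataclasses import dataclass

# Acceptance window for the A260/A280 ratio, taken from the text.
MIN_RATIO = 1.70
MAX_RATIO = 2.3


@dataclass
class RnaSample:
    sample_id: str    # hypothetical identifier
    a260_a280: float  # NanoDrop 260/280 nm absorbance ratio


def passes_purity_qc(sample: RnaSample) -> bool:
    """Return True when the ratio lies inside the accepted window (inclusive)."""
    return MIN_RATIO <= sample.a260_a280 <= MAX_RATIO


def filter_samples(samples):
    """Keep only the samples that pass the purity check."""
    return [s for s in samples if passes_purity_qc(s)]


# Illustrative values only, not data from the study.
demo = [
    RnaSample("S01", 2.05),
    RnaSample("S02", 1.62),
    RnaSample("S03", 2.41),
    RnaSample("S04", 1.70),
]
kept_ids = [s.sample_id for s in filter_samples(demo)]
print(kept_ids)
```

In this illustration the two samples outside the window are dropped, while the sample lying exactly on the lower bound is retained.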

Dual-color Reverse-Transcription Multiplex Ligation-dependent Probe Amplification (dcRT-MLPA)
DcRT-MLPA was performed as described in detail elsewhere [18]. Briefly, for each target-specific sequence, a specific reverse transcription (RT) primer was designed to lie immediately downstream of the left- and right-hand half-probe target sequence. Complementary DNA (cDNA) was generated from RNA using an RT primer mix. Subsequently, MMLV reverse transcriptase was inactivated by heating at 98˚C for 2 minutes and cDNA was incubated overnight at 60˚C with a mixture of customized left- and right-hand half-probes to hybridize with the target cDNA. Annealed half-probes were ligated using ligase-65 enzyme and subsequently amplified by PCR (33 cycles of 30 sec at 95˚C, 30 sec at 58˚C, and 60 sec at 72˚C, followed by 1 cycle of 20 min at 72˚C). Primers and probes were from Sigma-Aldrich Chemie (Zwijndrecht, The Netherlands) and MLPA reagents from MRC-Holland (Amsterdam, The Netherlands). PCR amplification products were diluted 1:10 in HiDi formamide containing 400HD ROX size standard, denatured at 95˚C for 5 min, cooled on ice and analyzed on an Applied Biosystems 3730 capillary sequencer in GeneScan mode (BaseClear, Leiden, The Netherlands).
Trace data were analyzed using the GeneMapper 5 software package (Applied Biosystems). The areas of each assigned peak (in arbitrary units) were exported for further analysis in Microsoft Excel spreadsheet software. Data were normalized to GAPDH, and signals below the noise cutoff threshold in GeneMapper (log2-transformed peak area 7.64) were assigned that threshold value. Finally, the normalized data were log2 transformed for statistical analysis.
RT primers and half-probes were designed by Leiden University Medical Centre (LUMC, Leiden, The Netherlands) [18,23] and comprised sequences for 4 housekeeping genes and 105 selected genes to profile the innate and adaptive immune response (S1 Table). Genes associated with active TB disease or protection against disease, as described in the literature, were included in the study.
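The normalization steps described in this section (normalize to GAPDH, assign the noise cutoff to low signals, log2-transform) can be sketched as follows. The exact normalization formula is not given in the text, so the rescaling to a common `reference_gapdh` level used here is an assumption made for illustration, as are the function name and the demo peak areas.

```python
import math

NOISE_FLOOR_LOG2 = 7.64            # noise cutoff on the log2 scale, from the text
NOISE_FLOOR = 2 ** NOISE_FLOOR_LOG2


def normalize_sample(peak_areas, reference_gapdh):
    """Normalize one sample to GAPDH, floor at the noise cutoff, then log2-transform.

    peak_areas: dict mapping gene symbol -> raw peak area; must contain "GAPDH".
    reference_gapdh: hypothetical common GAPDH level to which every sample is
        rescaled (an assumption; the source does not state the formula).
    """
    gapdh = peak_areas["GAPDH"]
    if gapdh <= 0:
        raise ValueError("GAPDH peak area must be positive")
    scale = reference_gapdh / gapdh
    profile = {}
    for gene, area in peak_areas.items():
        if gene == "GAPDH":
            continue  # the housekeeping gene itself is not reported
        # Signals below the noise cutoff are assigned the cutoff value.
        normalized = max(area * scale, NOISE_FLOOR)
        profile[gene] = math.log2(normalized)
    return profile


# Illustrative peak areas only, not data from the study.
demo_sample = {"GAPDH": 20000.0, "FCGR1A": 4096.0, "CD19": 50.0}
profile = normalize_sample(demo_sample, reference_gapdh=10000.0)
print(profile)
```

Under this assumed scheme, any gene whose normalized signal falls below the cutoff is reported as 7.64, which would be consistent with a floor value appearing repeatedly for weakly expressed genes.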

Statistical analysis
The Kolmogorov-Smirnov test showed that the data were not normally distributed. A non-parametric Kruskal-Wallis H test was used to compare medians among more than two clinical groups. A non-parametric two-tailed Wilcoxon rank-sum (Mann-Whitney) test was used to compare two unpaired data sets, while a Wilcoxon signed-rank test was used for two paired data sets. Ingenuity Pathway Analysis (IPA) was used to examine the networks of genes that discriminate TB cases from controls in HIV-positive and HIV-negative patients. The statistical significance level used was P<0.05 and all P values are two-tailed. All data analysis was performed using Intercooled Stata version 11.0 (College Station, Texas, USA).
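For readers unfamiliar with the unpaired comparison named above, the following is a minimal pure-Python sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney) test. It is not the Stata routine used in the study: it relies on the normal approximation with a tie correction and no continuity correction, which is only approximate for small groups, and the function names and demo values are hypothetical.

```python
import math


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u1, p


# Illustrative log2 expression values only, not data from the study.
u, p = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(u, p)
```

An exact test or a statistics package would be preferable for real analyses, particularly with small groups or many ties.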

Characteristics of the study population
A total of 80 HIV-negative study participants composed of 40 TB cases, 20 TST+ and 20 TST- individuals were included in this study. Malnutrition (BMI<18.5 kg/m2) was detected in 52% of TB patients compared to 15% of TST+ and 0% of TST- individuals (Table 1).
Non [...] (Fig 2A). Genes that could best classify TB patients and TST- individuals were IL7R, PRF1, NLRP1, CD3E, CCR7, FCGR1A, IL5, TLR9, BLR1 [...]
Median (interquartile range) gene expression values (peak areas normalized for GAPDH and log2-transformed) are shown at baseline, and significant differences between study groups were determined using the Kruskal-Wallis H and Wilcoxon Mann-Whitney tests. Genes that were more highly expressed in the test group than in the reference/control group are indicated in red, and genes that had lower expression in the test group are indicated in blue. Only genes whose expression level significantly differed between any of the study groups are listed. P-values ≤ 0.05 are indicated in bold.

Impact of anti-TB treatment (ATT) on the kinetic responses of the biomarkers associated with active TB
Next, we assessed the effect of ATT on expression of the genes that markedly discriminated between TB cases versus TST+ and TST- controls at baseline. The expression of these markers in TB patients was measured at six months (6M) of ATT and compared to the baseline value (0M) of the same patients and with that of both control groups (TST+ and TST-). The expression levels of genes that markedly discriminated between TB cases versus TST+ and TST- at baseline partially normalized between baseline and 6M in TB patients following ATT. Interestingly, the expression levels of many genes had not fully normalized to TST+ or TST- levels at the end of 6M of ATT (Fig 3 & Table 3). Only the expression of 8 genes, including 4 transcripts which were among those with the most powerful potential to discriminate between TB disease and TST+ or TST- (PTPRCv1, FCGR1A, CASP8 and GNLY) (Fig 2), became indistinguishable from those of TST+ and TST- at the end of 6M of ATT (Table 3). However, most of the genes whose expression levels were not yet completely normalized at 6M did display expression levels that were indistinguishable from TST+ or TST- at 18 months follow-up (Table 4 & Fig 4).

Different gene networks discriminate TB cases from controls in HIV-positive and HIV-negative individuals
Out of the 48 genes which were significantly differentially expressed between TB cases and TST+ subjects in this HIV-negative cohort, only 7 genes (CD4, PTPRCv1, TLR3, TNFRSF1A, NLRP12, BLR1 and FCGR1A) were significantly different between HIV-positive TB cases and TST+ individuals in our previous study in the same location [24]. Moreover, the expression of TNFRSF1A, TLR3 and NLRP12 was significantly higher in TB cases than TST+ controls during HIV coinfection, in contrast to the results obtained here in HIV-negative individuals. Similarly, only 12 out of the 39 host genes which were significantly differentially expressed between TB cases and TST- in HIV-negative individuals, including FCGR1A, RAB24, CD3E, CD4, IL7R, PTPRCv1, GNLY, GZMB, TNFRSF1A, CCL5, NLRP12 and BLR1, were also significantly different between TB cases and TST- in HIV-coinfected individuals in our previous study [24], and again the expression of TNFRSF1A and NLRP12 was significantly higher in TB cases than TST- controls during HIV coinfection, in contrast to the results obtained here in HIV-negative individuals. None of the 17 host genes which were significantly differentially expressed between HIV-negative TST+ and TST- individuals was significantly different in HIV-positive TST+ and TST- individuals in our previous study [24]. Ingenuity Pathway Analysis of the data from the HIV-positive cohort in the previous study [24] revealed an over-representation of pattern recognition receptors including TLR2 and TLR4 (Fig 5A) in TB-associated genes which was not seen in the HIV-negative cohort (Fig 1A). The comparison of HIV-positive TST+ and TST- individuals revealed a central role for cytotoxicity and T cell genes (Fig 5B) in contrast to the dominance of pro-inflammatory cytokines seen in HIV-negative individuals (Fig 1B).

Fig 2. Identification of single genes with discriminatory power to classify HIV-negative study groups at baseline (M0). Receiver operator characteristics (ROC) curves showing the accuracies of individual genes in discriminating (A) TB cases versus TST+ subjects, (B) TB cases versus TST- subjects and (C) TST+ versus TST- subjects. AUC = Area under the curve.
https://doi.org/10.1371/journal.pone.0226137.g002
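The single-gene ROC analysis referred to in this section can be illustrated with a short sketch that computes the area under the curve (AUC) directly from its rank interpretation: the probability that a randomly chosen case has a higher value than a randomly chosen control. This is an illustration under stated assumptions, not the authors' software; the function names and expression values are hypothetical, and reporting a direction-free AUC (so that genes with lower expression in cases can also score ≥ 0.80) is an assumption about how such a cutoff might be applied.

```python
def roc_auc(cases, controls):
    """AUC as P(case value > control value), counting ties as one half."""
    if not cases or not controls:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def discriminatory_auc(cases, controls):
    """Direction-free AUC: a gene that is lower in cases gives AUC < 0.5,
    so report max(auc, 1 - auc)."""
    auc = roc_auc(cases, controls)
    return max(auc, 1.0 - auc)


AUC_CUTOFF = 0.80  # cutoff quoted in the text

# Illustrative log2 expression values only, not data from the study.
tb_cases = [10.2, 9.8, 10.5, 9.1]
tst_pos = [8.9, 9.0, 9.3, 8.7]

auc = roc_auc(tb_cases, tst_pos)
print(auc, auc >= AUC_CUTOFF)
```

The pairwise form is quadratic in the group sizes, which is unproblematic for cohorts of the size described here; it equals the Mann-Whitney U statistic divided by the product of the two group sizes.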

Discussion
Assessing the consistency of previously identified candidate biomarkers and finding additional candidate genes for diagnosing TB disease and for monitoring treatment responses will be important for the future direction of TB disease control. Here, we identified gene expression patterns which could discriminate clinical stages of TB, using a focused gene expression profiling platform, dcRT-MLPA [18], targeting innate and adaptive immune response genes, to analyze RNA expression levels of 105 pre-selected genes in peripheral blood. The expression of 15 genes with AUCs ≥0.80 (IL7R, CD3E, IL5, NLRP1, PRF1, TLR9, CCR7, NLRP12, TLR5, PTPRCv1, FCGR1A, BLR1, GNLY, RAB33A and NCAM1) was strongly associated with TB disease and these genes indeed play critical roles in the immune response against TB. There was a clear association between TB disease and low BMI in this cohort: observed gene expression differences might be related to nutritional status but this is intrinsically linked to disease profile in TB.
Expression of TLR9, NLRP1, NLRP12, RAB33A and BLR1 was significantly lower in TB patients compared to TST+ and TST- subjects, in agreement with published data [18,25,26,27]. Toll-like receptors (TLR) play a critical role in the innate immune response to exogenous pathogens. Low expression of TLR9 has a critical role in TB incidence and progression, and this might be associated with combined defects in pro-inflammatory cytokine production such as IFN-γ recall responses [26]. Low expression of NLRP1 and NLRP12 might be related to a risk of susceptibility to bacterial diseases, via reduced cleavage of pro-IL-1β and pro-IL-18 to produce mature isoforms [28], and via avoidance of infected macrophage lysis [29] which contributes to pathology in TB. Rab33A is a novel CD8+ T cell factor and its expression may be involved in susceptibility to TB disease [27].
The observed lower expression of T cell-associated genes (e.g. IL7R, CD3E, CCR7 and PTPRCv1) in TB patients has been shown previously [21,30] and might be associated with reactivation of infection and migration of cells to the site of infection [31]. Similarly, lower expression of other immune subset genes (such as NK marker NCAM1) in blood in TB patients may also relate to migration of lymphocytes or natural killer cells from the peripheral blood to the site of infection [32]. Furthermore, GNLY and PRF1 expression levels were also significantly lower in TB patients compared to TST+ and TST- individuals, which is consistent with published data [33,34] and might be explained by rapid consumption of both perforin and granulysin during active disease due to an ongoing effector immune response, or due to migration of the T cell subset responsible for their production [35].
FCGR1A and TLR5 were also found to be differentially expressed between TB cases and TST+ or TST- individuals, in agreement with published data [36,37,38,39]. However, these genes were more highly expressed in TB patients than in controls and were found to have the best discriminatory power between TB cases and both TST+ and TST- controls. FCGR1A is an essential component of interferon signalling and plays a central role in endocytosis, phagocytosis, antibody-dependent cellular toxicity, cytokine release, and superoxide generation [40] but may also participate in TB pathogenesis. In contrast, TLR5 is expressed in myeloid cells during TB infection and its role may be associated with an imbalance in Th1 and Th2 cells through increased expression of IL-4 [41].
Median gene expression levels (peak areas normalized to GAPDH and log2-transformed) of the indicated genes are shown as box-and-whisker plots (5-95 percentiles). Significant differences among the groups and between study groups were determined using Kruskal-Wallis H test and Wilcoxon Mann-Whitney test respectively. Shown are individual genes that were found to have the best discriminatory power (AUCs ≥ 0.80) to distinguish between active TB cases (TB) versus latently infected (TST+) and uninfected (TST-) controls in HIV-negative subjects. (* = P-value ≤0.05, ** = P-value ≤0.01, *** = P-value ≤0.001, **** = P-value ≤0.0001).
We also assessed the expression levels of host genes in response to ATT. We showed that expression levels of a subset of genes that markedly discriminated between TB cases versus TST+ and/or TST- controls at baseline were normalized in ATT-treated TB patients at 6 months. However, in contrast to most previous studies in which normalization was completed between 2 and 6 months of treatment [42,43], the majority of the genes in our study were only fully normalized at the 18 months follow-up time point. Treatment-response transcriptomic signatures can already change significantly within 1 week of treatment [44], and continue to change until the end of ATT at 6 months [18,45] and even after treatment is completed [11,46]. The expression of only a small number of genes, including PTPRCv1, FCGR1A, GZMB, CASP8 and GNLY, fully returned to the expression levels observed in TST+ and TST- subjects after the full 6 months of treatment in this study. Differential expression of gene profiles in TB patients during 6 months of anti-TB chemotherapy compared to baseline has previously been reported [42,43,47] and correlated with clearance of the actively dividing bacillary load [44]. However, TB cases with clinically curative treatment at the end of 6 months of therapy may not yet have completely cleared the infection, and may not have reached the end of the disease pathology resolution process due to the presence of a few remaining viable Mtb, with the potential to elicit a host response [48], as well as ongoing immunopathology in sterilized lesions.
Table fragment. Columns: Gene Symbol; TB cases (0M); TB cases (6M); TST+ (0M); TST- (0M); TB cases (6M) vs TB cases (M0); TB cases (6M) vs TST+ (M0); TB cases (6M) vs TST- (M0). First row (truncated): Immune cell subset markers: CD19; 7.6 (7.6-7.6); 9.7 (9.4-9.9); 7.7 (7.6-8. [...]
Median (interquartile range) gene expression values (peak areas normalized for GAPDH and log2-transformed) are shown. Significant differences between active TB patients at baseline (0M) and 6 months following ATT initiation (6M) were determined using the Wilcoxon signed-rank test. Significant differences between active TB at 6M and TST+ or TST- at the 0M time point were determined using the Wilcoxon Mann-Whitney test. Genes that were more highly expressed in the test group than in the reference/control group are indicated in red, whereas genes that had lower expression in the test group are indicated in blue. Genes listed in this table were differentially expressed between any of the study groups at baseline (0M) ([...]).
There were some notable differences in discriminating TB cases from controls using the expression of immune-related genes between HIV-positive [24] and HIV-negative individuals (this study). The genes with discriminatory potential identified in HIV-negative individuals using ROC included immune cell markers (NCAM1), T cell-associated genes (IL7R, CD3E, CCR7, PTPRCv1), T helper type 2 associated genes (IL5), cytotoxicity genes (GNLY and PRF1), pattern recognition receptors (TLR5 and TLR9), inflammasome components (NLRP1 and NLRP12), IFN signalling genes (FCGR1A), GTPase activating genes (RAB33A) and G-protein coupled receptors (BLR1) (Fig 2A and 2B). With the exception of FCGR1A, none of these genes had discriminatory potential amongst HIV-positive individuals using ROC cutoff ≥ 0.80 [24]. Pattern recognition receptors, including TLR2 and TLR4, were over-represented in network analysis of TB-associated genes in HIV-positive individuals (Fig 5A), which was not the case in HIV-negative individuals (Fig 1A), revealing fundamental differences in biological response and biomarker expression in these cohorts. In previous studies, TB patients without HIV infection showed no difference in TLR2 and TLR4 expression in monocytes compared to healthy donors [49], but TLR2 and TLR4 are most strongly up-regulated in mDCs of TB patients coinfected with HIV [50], consistent with the findings in this report. Using ROC cutoff ≥ 0.80, FCGR1A was the only marker consistently identified in both HIV-positive and HIV-negative individuals, which is consistent with a previous report by Sutherland et al. [30]. The dominance of pro-inflammatory cytokines seen in HIV-negative LTBI may be related to activation of T cells [51], which may contribute to containment of Mtb infection. In contrast, low expression of cytotoxicity genes and T cell-associated genes observed in HIV-positive LTBI may reflect enhanced recruitment of T cells to the site of Mtb infection [52], or deletion of the activated T cells [53], which may contribute to HIV disease progression and exacerbate the HIV epidemic.
Median (interquartile range) gene expression values (peak areas normalized for GAPDH and log2-transformed) are shown. Significant differences between active TB patients at 18 months following ATT initiation (18M) and baseline (0M) were determined using the Wilcoxon signed-rank test. Significant differences between active TB at 18M and TST+ or TST- at the 0M time point were determined using the Wilcoxon Mann-Whitney test. Genes that were more highly expressed in the test group than in the reference/control group are indicated in red, whereas genes that had lower expression in the test group are indicated in blue. Genes listed in this table were differentially expressed between any of the study groups at baseline (0M) ([...]).
There were also notable differences between this report and a previous report in the context of Ethiopia. While only 9 of 45 host genes measured by Mihret et al. had significantly different expression between active TB cases and household contacts [21], 21 out of these 45 host genes were significantly differentially expressed in TB cases compared to both TST+ and TST- subjects in our study. FCGR1A and IL7R were the only TB-associated markers that were consistently differentially expressed between TB patients and control groups in both our study and the previous study in the context of Ethiopia. This may be attributable to the selection criteria for the control groups [30], which consisted of household contacts in Mihret et al. and daily laborers in our study, or may reflect the substantial genetic heterogeneity amongst the Ethiopian population. Moreover, 5 out of 45 host genes measured by Mihret et al. [21] showed differential expression between latent TB infected and uninfected individuals, whereas 7 of the 45 host genes were differentially expressed between latent TB infected and uninfected individuals in our study. However, there was no overlap between the two studies in the genes discriminating between TST+ and TST- individuals.
In conclusion, the expression levels of 15 host genes (IL7R, CD3E, IL5, NLRP1, PRF1, TLR9, CCR7, NLRP12, TLR5, PTPRCv1, FCGR1A, BLR1, GNLY, RAB33A and NCAM1) in peripheral blood can discriminate active TB disease from latent TB infection and uninfected controls in an HIV-negative cohort. However, almost all these markers, except for FCGR1A, cannot discriminate between active and latent TB in TB-HIV co-infected subjects. Our data also show that complex gene expression signatures are required to fully measure changes in blood transcriptomes during and after successful ATT, such that a combination including those genes which resolve completely during the 6-month treatment phase of therapy (PTPRCv1, FCGR1A, GZMB, CASP8 and GNLY) and those which only fully return to normal levels during the post-treatment resolution phase might be required to fully characterise drug-induced relapse-free cure. Further research is needed to completely characterise the optimal complex signature in different populations and larger study populations.

Supporting information
S1 Table. List of target genes for dcRT-MLPA. 105 selected genes and 4 housekeeping genes to profile innate and adaptive immune responses. (DOC)