Memory T Cells in Latent Mycobacterium tuberculosis Infection Are Directed against Three Antigenic Islands and Largely Contained in a CXCR3+CCR6+ Th1 Subset

An understanding of the immunological footprint of Mycobacterium tuberculosis (MTB) CD4 T cell recognition is still incomplete. Here we report that human Th1 cells specific for MTB are largely contained in a CXCR3+CCR6+ memory subset and highly focused on three broadly immunodominant antigenic islands, all related to bacterial secretion systems. Our results refute the notion that secreted antigens act as a decoy, since both secreted proteins and proteins comprising the secretion system itself are targeted by a fully functional T cell response. In addition, several novel T cell antigens were identified that may be of use in diagnostics or as vaccine antigens. These results underline the power of a truly unbiased, genome-wide analysis of CD4 MTB recognition based on the combined use of epitope predictions, high throughput ELISPOT, and T cell libraries using PBMCs from individuals latently infected with MTB.


Introduction
Tuberculosis is one of the major causes of death from infectious disease. Current diagnostics do not distinguish active and latent infection, and the only available vaccine has limited efficacy. Hence, there is an urgent need for both novel vaccines and diagnostic strategies.
Human T cell responses to MTB involve CD4, CD8, CD1 and γδ T cells. Seminal studies showed that human memory T helper 1 (Th1) cells directed against the purified protein derivative (PPD) of MTB secreted IFN-γ [1]. IFN-γ has an essential role in the protective immunity to mycobacteria, as individuals with genetic defects in the IFN-γ receptor have an increased susceptibility to mycobacteria [2]. Th1 cells mainly express the chemokine receptors CCR5 and CXCR3 [3], while Th17 cells co-express CCR6 and CCR4 and Th22 cells co-express CCR6 and CCR10 [4,5].
While several studies have reported the identification of MTB antigens, from abundant or easily purified proteins [6,7], a truly genome-wide study to identify antigens is lacking. In only a few cases have techniques allowing ex vivo detection and/or characterization of MTB-specific T cells, prior to any in vitro expansion and manipulations, been utilized [8,9,10].
A key issue relating to MTB immunity is whether different classes of antigens elicit responses that have the same or diverse functional characteristics. MTB antigens described so far are predominantly secreted MTB proteins [11], some of which are not essential for bacterial survival [12]. As a result, it was hypothesized that secreted proteins might act as decoy antigens, diverting the immune response from recognizing more relevant MTB proteins [13].
In this regard, two intriguing MTB protein categories are the PE/PPE proteins and the Esx protein family, which have been shown to elicit B and T cell responses [14,15]. The function(s) of PE/PPE proteins are not fully understood, but data indicate that they influence antigen presentation and host cell apoptosis [15]. The PE/PPE genes encode almost 200 proteins (4% of the total open reading frames (ORFs)) [16], unique to Mycobacteria and most prevalent in pathogenic strains. While PE/PPE proteins are mainly located within the bacterial cell wall and cell surface, some are also secreted [17,18].
T cell epitopes have been described from all main MTB protein categories, indicating that protein function or cellular location per se does not determine which proteins can be recognized. Previous studies in complex pathogen systems demonstrated that immune responses are directed against a relatively large fraction of the genome [23,24]. However, epitope reactivity is currently described only from about 4% of the approximately 4,000 ORFs of the MTB genome ([11]; IEDB, www.iedb.org). Hence, we hypothesized that a genome-wide probe of the immunogenicity of MTB ORFs would reveal a large number of novel antigens. Defining the breadth of responses is key for the design of vaccination strategies that mirror natural immunity [25], the evaluation of the performance of vaccine candidates, and the development of diagnostics.
By combining HLA class II peptide binding predictions with modern high throughput techniques such as ex vivo ELISPOT analysis, HLA class II multimers, and the screening of T cell libraries [26], we were able for the first time to identify and characterize the genome-wide antigen response in latently infected individuals.

Breadth and Dominance of a Genome-wide Library of MTB-derived Predicted HLA Class II Epitopes in LTBI Donors
Protein sequences from five complete MTB genomes (CDC1551, F11, H37Ra, H37Rv and KZN 1435) and sixteen draft assemblies from the NCBI Protein database (Table S1) were aligned. The binding capacity of all possible 15-mer peptides (n = 1,568,148) was predicted for 22 HLA DR, DP and DQ class II alleles (Figure S1 and Table S2) most commonly expressed in the general population [27], to select peptides predicted to bind multiple HLA class II alleles (promiscuous epitopes). This approach identifies the most dominant and prevalent responses, corresponding to approximately 50% of the total overall response [27].
A total of 20,610 peptides (with a range of 2 to 10 per ORF, and an average of 5), including 1,660 variants not totally conserved amongst the genomes considered in the analysis, were synthesized and arranged into 1,036 peptide pools of 20 peptides (Figure S1). The ex vivo production of IFN-γ by PBMCs from 28 LTBI donors induced by each of the 1,036 pools was measured utilizing ELISPOT. Pools recognized by ≥10% of donors were deconvoluted, and 369 individual MTB epitopes were identified (Table S3). Individual donors recognized, on average, 24 epitopes, underlining the large breadth of response to MTB.
Epitope responses were ranked on the basis of magnitude to assess their relative dominance. The top 80 epitopes accounted for 75% of the total response and the top 175 epitopes accounted for 90% of the total response (Figure 1A). Only occasional weak responses were detected in 28 TB uninfected/non-Bacille Calmette-Guérin (BCG) vaccinated control donors, thus demonstrating that these responses were LTBI-specific (Figure 1A). The epitopes were mapped to individual MTB antigens using H37Rv as a reference genome. A total of 82 antigens were recognized by more than 10% of LTBI donors (Figure 1B). These 82 antigens accounted for approximately 80% of the total response in LTBI donors (Figure 1C). Responses to the epitopes from the most frequently recognized antigens were further characterized utilizing PBMCs depleted of either CD4 or CD8 T cells. The majority (97%) of these epitopes were recognized exclusively by CD4 T cells (Table S3), as expected because of their identification on the basis of predicted HLA class II binding capacity.
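The dominance ranking described above can be illustrated with a minimal sketch. It assumes each epitope has a single summed magnitude (total SFC); the function name `epitopes_needed` and the values in `toy_sfc` are hypothetical and are not study data.

```python
def epitopes_needed(magnitudes, fraction):
    """Smallest number of top-ranked epitopes whose summed magnitude
    reaches the given fraction of the total response."""
    ranked = sorted(magnitudes, reverse=True)  # most dominant epitope first
    target = fraction * sum(ranked)
    running = 0.0
    for count, value in enumerate(ranked, start=1):
        running += value
        if running >= target:
            return count
    return len(ranked)


# Hypothetical total SFC per epitope; illustrative values only.
toy_sfc = [400, 250, 150, 110, 50, 25, 15]

top_75 = epitopes_needed(toy_sfc, 0.75)  # epitopes covering 75% of the response
top_90 = epitopes_needed(toy_sfc, 0.90)  # epitopes covering 90% of the response
```

Applied to the real per-epitope magnitudes, the same cumulative calculation would give the epitope counts quoted for the 75% and 90% thresholds.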

Novel MTB Antigens and Sources of CD4 T Cell Epitopes Recognized by LTBI Donors
Comparing these 82 most prevalently recognized antigens with antigens for which similar ex vivo epitope reactivity has been described (IEDB), we found that the majority (61/82 antigens, 74%) were novel. While a given antigen might not have been analyzed in sufficient detail to lead to the description of defined epitopes, it might nevertheless have been described as a target of T cell responses. Therefore, we performed a literature search for each individual antigen to further categorize them as novel, or as targets of CD4 T cells, CD8 T cells or an undefined T cell type. This revealed that 41% of the antigens we identified had not previously been described as T cell targets (Figure S2A and Table 1). The responses to novel antigens, in terms of both response frequency and magnitude, are comparable to those directed against previously known T cell targets (Table S4).
Further analysis of the IEDB data revealed a limited overlap (18%; 28/158) between antigens identified in this study and antigens known as sources of HLA class I epitopes (Figure S2B). Finally, no significant correlation was found with the antigens recognized by serological responses from the MTB proteome [28] (Figure S2C).

HLA Class II Reactivity Is Highly Focalized on MTB Antigenic Islands
Next, using the TubercuList database [16], we determined the protein category to which the identified antigens belong (Figure 2). As expected, the identified antigens were associated with almost every category, with the exception of regulatory proteins and proteins of unknown function. The significant overrepresentation of PE/PPE proteins was notable, as well as the underrepresentation of proteins in the conserved hypotheticals, cellular metabolism and respiration categories.

Author Summary
Mycobacterium tuberculosis is one of the most life-threatening pathogens of all time, having infected one-third of the present human population. There is an urgent need for both novel vaccines and diagnostic strategies. Here, we were able to identify the targets most dominantly recognized by latently infected individuals who successfully contain infection. These targets are contained in three broadly immunodominant genomic antigenic islands, all related to bacterial secretion systems and composed of several distinct ORFs. Thus, our results suggest that vaccination with one or a few defined antigens will fail to replicate the response associated with natural immunity. Our analysis also shows that the Th1 cells dominating the response are associated with novel and well-defined phenotypic markers, suggesting that the response is molded by unique MTB-associated factors. This study further demonstrates that the approach combining peptide binding predictions with modern high throughput techniques is generally applicable to the study of immunity to other complex pathogens. Together, our data provide a new angle in the worldwide fight against M. tuberculosis and could be used for diagnostic or vaccine development.
The localization of the antigens recognized was next visualized by plotting the recognition data on a linear map of the MTB genome. Analysis of either percent of donors responding or percent of total response revealed striking clusters of reactivity within certain regions of the genome (Figure 3A). When the MTB genome was parsed into 5-gene windows, significant antigenic clusters (defined as at least 4 proteins within the 5-gene window being recognized by 7.1% of LTBI donors) could be identified using binomial distribution probability and Bonferroni correction. Three significant antigenic islands (Figure 3B), encoding 0.55% of the total ORFs, accounted for 42% of the total response (Table 2). One of the islands (Island 3) contains the well-known Rv3875 and Rv3874 antigens, which form an Esx protein pair secreted via a T7SS. Strikingly, the other two islands also contain Esx protein pairs. Moreover, two of the antigenic islands are part of the known T7SS systems Esx-1 (Island 3) and Esx-3 (Island 1). It is noteworthy that the proteins recognized included not only the proteins believed to be secreted, but also the proteins forming the actual secretion apparatus (Island 1). Indeed, the antigens identified within these islands correspond to proteins from several different protein categories, mostly assigned to the cell wall and cell processes category and the PE/PPE category, which is not surprising since several of these proteins are part of the T7SS. Additionally, Rv3615c [29], which is functionally linked to Esx-1 [30], was also prevalently recognized. However, it stands as a single antigen and not as part of an antigenic island.
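The window-based cluster test can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: windows are taken as overlapping (sliding by one gene), the background probability is the overall fraction of recognized genes, and the Bonferroni factor is the number of windows tested. The names `binomial_tail`, `significant_windows` and `toy_genome` are hypothetical.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def significant_windows(recognized, window=5, min_hits=4, alpha=0.05):
    """Start indices of windows holding at least min_hits recognized genes
    whose Bonferroni-corrected binomial p-value is below alpha."""
    n_genes = len(recognized)
    p_background = sum(recognized) / n_genes  # assumed null: genome-wide rate
    n_windows = n_genes - window + 1          # number of tests performed
    hits = []
    for start in range(n_windows):
        k = sum(recognized[start:start + window])
        if k < min_hits:
            continue
        p_corrected = min(1.0, binomial_tail(window, k, p_background) * n_windows)
        if p_corrected < alpha:
            hits.append(start)
    return hits


# Toy genome of 1,000 genes: one cluster of four adjacent recognized genes
# plus six isolated ones (illustrative only, not study data).
toy_genome = [False] * 1000
for index in (10, 100, 101, 102, 103, 200, 300, 400, 500, 600):
    toy_genome[index] = True

island_windows = significant_windows(toy_genome)
```

In this toy example only the two windows that span the four-gene cluster survive correction; isolated recognized genes never reach the minimum of 4 hits per window.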

Antigenic Islands Rather than PE/PPE and Esx Proteins Are the Major Determinant of Immunodominance
To dissect whether the main determinant of immunodominance was related to a given antigen being contained within an antigenic island or belonging to the PE/PPE and Esx protein families, we calculated the percentage of the total response for different groups of proteins as well as the percentage of the MTB genome associated with these protein groups (Table 2). To compare different protein groups we calculated the ratio between % of response and % of genome, as a percent enrichment.
The PE/PPE proteins were responsible for 19% of the total response, and when divided into PE/PPE proteins within an island compared to non-island, the island PE/PPE were more predictive of immunogenicity than the non-island ones (Table 2). Also, in the case of Esx proteins and T7SS, proteins within the antigenic islands were more likely to be immunogenic than those outside the islands. Proteins not in the antigenic islands, and not belonging to PE/PPE and T7SS categories, were responsible for 14% of the total response (Table 2). Thus, these data show that the antigenic islands identified are highly predictive of immunogenicity, and that to be contained within the antigenic islands is the most reliable predictor of the immunodominance of PE/PPE and Esx proteins.
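As a worked illustration of the enrichment ratio, the sketch below uses only the two figures stated in the text for the antigenic islands (42% of the total response, 0.55% of the total ORFs). It assumes that the share of ORFs stands in for "% genome"; the function name `enrichment` is hypothetical, and the values reported in Table 2 may be computed differently.

```python
def enrichment(percent_response, percent_genome):
    """Ratio of the share of the total response to the share of the genome."""
    return percent_response / percent_genome


# Antigenic islands: 42% of the total response from 0.55% of the total ORFs.
island_enrichment = enrichment(42.0, 0.55)
print(f"Island enrichment: {island_enrichment:.1f}-fold")
```

A ratio of 1 would indicate that a protein group contributes to the response in proportion to its share of the genome, so the islands are enriched by roughly two orders of magnitude under these assumptions.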

Similar Multifunctionality of T Cell Responses to Different Categories of MTB Antigens
It has been proposed that some of the responses against secreted MTB proteins act as decoys [13], thereby supporting bacterial persistence. It has also been proposed that T cells differing in their degree of multifunctionality might differ in terms of protective potential, or have a role in pathology [31,32,33,34]. Definition of dominant antigens allows testing the validity of these hypotheses. To address these issues we detailed responses against PE/PPE, Esx and other proteins expressed in the three major antigenic islands, or elsewhere, by a variety of approaches, including multiparameter intracellular cytokine staining (ICS) assays, tetramer staining and T cell libraries.
Proteins from the PE/PPE and the cell wall and cell processes categories, whether located within an island or outside one, elicited IFN-γ-, TNFα- and IL-2-expressing CD4 T cells with similar cytokine expression patterns (Figure 4A and C; gating strategy in Figure S3). The vast majority of CD4+ T cells were IFN-γ+TNFα+IL-2+ or IFN-γ+TNFα+, followed by TNFα+ single-producing CD4+ T cells. To a lesser extent, TNFα+IL-2+, single IFN-γ+, and single IL-2+ cells were also detected (Figure 4A and C). Triple cytokine producers accounted for 27-40% of cytokine-expressing CD4+ T cells, 30-43% expressed any 2 cytokines, and 23-44% produced a single cytokine (Figure 4B and D). We did not observe any donor-, antigen- or epitope-specific pattern of cytokine production (Figure 4E).

Memory Phenotypes and T Cell Subsets Associated with Different Categories of MTB Antigens
CD4+ T cells were stained with selected HLA-epitope tetramer reagents and tetramer+ cells were enriched [8,35]. Epitope-specific T cell responses were detected in 16 donors at frequencies of 0.25 to 24.3% (median 3.8, interquartile range 1.5-15.3) for seven different HLA/T cell epitope tetramer combinations (Figure 5A). Only a small number of tetramer-positive cells were detected with the epitope-specific tetramers in donors with an HLA mismatch (Figure 5A), which confirmed that tetramer specificity was derived from the epitope and HLA molecule combination. Epitope tetramer combinations were selected based on the number of donors responding, HLA restriction, and the availability of corresponding reagents for tetramer production. Memory subset phenotypes were determined using Abs to CD45RA and CCR7. Similar to the multifunctionality phenotype, we did not observe any differences in memory phenotype when comparing proteins from within an island vs. non-island (Figure 5B and C). Rv0129c/Rv1886/Rv3804, Rv3418c and Rv1195 epitope-specific tetramer+ T cells predominantly consisted of CD45RA−CCR7+ central memory T cells in all donors analyzed, followed by effector memory (CD45RA−CCR7−) T cells. Percentages ranged between 70.1 and 91.3% (median 85.0, interquartile range 77.7-86.8) for central memory T cells and between 8.6 and 26.8% (13.3 (10.2-19.0)) for effector memory T cells. Only a minor fraction appeared to be naïve (CCR7+CD45RA+) or effector T cells (CCR7−CD45RA+). For Rv0288/Rv3019c the percentages ranged between 49.5 and 84.5% (56.8 (52.0-74.7)) for central memory T cells, 9.8-37.1% (25.9 (13.3-33.8)) for naïve and 4.8-17.2% (10.0 (7.4-16.8)) for effector memory T cells. Again, a minor fraction of the tetramer+ cells appeared to be effector T cells (Figure 5B and C).
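The four memory subsets named above follow directly from the two markers, which the short sketch below makes explicit. It assumes each tetramer+ cell has already been gated as positive or negative for CD45RA and CCR7; the function names and the toy cell list are hypothetical.

```python
from collections import Counter


def memory_subset(cd45ra_positive, ccr7_positive):
    """Map CD45RA/CCR7 status to the subset names used in the text."""
    if ccr7_positive:
        return "naive" if cd45ra_positive else "central memory"
    return "effector" if cd45ra_positive else "effector memory"


def subset_percentages(cells):
    """Percentage of cells per subset; cells are (CD45RA+, CCR7+) pairs."""
    counts = Counter(memory_subset(cd45ra, ccr7) for cd45ra, ccr7 in cells)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Toy population of 20 tetramer+ cells (illustrative only, not study data).
toy_cells = [(False, True)] * 17 + [(False, False)] * 2 + [(True, True)]
toy_percentages = subset_percentages(toy_cells)
```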
Next, we set up T cell libraries from 4 representative donors, and the CXCR3+CCR6+ subset was directly stimulated, after expansion, with 59 representative peptide pools. The results of this analysis are shown in Figure 8. Using this approach we were able to demonstrate that the results obtained with the MTB lysate also extended to responses specific for the various epitopes, and to confirm with a complementary approach the results of the ex vivo IFN-γ ELISPOT analysis utilizing the library of predicted HLA class II binding epitopes.

Discussion
Individual MTB proteins have been studied to identify novel vaccine candidates, with several studies focused on culture filtrate proteins [6,7]. Other studies utilized bioinformatic approaches to select a subset of genes as antigen candidates [37,38]. However, the lack of a true genome-wide characterization has hindered a complete understanding of the mechanisms and specificity of the immune response to MTB.
This study provides the first in-depth truly genome-wide description of human T cell responses to MTB. We characterized and isolated T cells directly ex vivo, thus avoiding biases introduced as a result of in vitro restimulation and expansion of T cells before analysis. This approach should be generally applicable to the study of immunity to other complex pathogens. The HLA alleles were chosen to allow coverage of the most frequent DP, DQ and DR specificities in the general population [39]. However, we readily acknowledge that this selection has potential limitations and may bias the results toward the epitopes recognized by these alleles.
In terms of T cells recognizing MTB, we found that the T cell response to MTB antigens in LTBI donors is strongly biased towards a subset of CXCR3+ Th1 cells that co-express CCR6 [4]. Interestingly, this narrow distribution was only seen for MTB and not for other pathogens such as S. pyogenes and C. albicans within the same donor. The origin of CCR6+ Th1 cells and their differentiation requirements remain to be defined; they may represent a separate Th1 lineage, or they may differentiate from plastic CCR6− Th1 cells or CCR6+ Th17 cells [40]. Future studies will examine whether this highly focused response is key to MTB containment by examining patients who remain healthy vs. patients who progress to active disease. Striking levels of heterogeneity of responses were detected. This expands previous observations using smaller subsets of antigens [6,41], and a genome-wide screen of antibody responses [28]. The observed heterogeneity might reflect differences in MTB strains, bacillary load, and metabolic state, resulting in qualitative or quantitative differences in antigen expression [42,43]. In any case, since natural immunity to MTB is multiepitopic and multiantigenic, and more than 80 antigens are necessary to capture 80% of the T cell response, vaccination strategies including one or a few antigens are unlikely to replicate natural immunity. Likewise, monitoring the immune response to one or a few antigens in the setting of clinical trials might yield a severely incomplete and biased picture of immune reactivity.
Several antigens from the DosR regulon, as well as resuscitation- and reactivation-associated antigens, have been described as preferentially recognized by individuals with latent infection using long-term T cell cultures [44,45]. We observed reactivity to two of these proteins, Rv2031c (2 donors) and Rv2627c (1 donor), and no significant association with proteins from the DosR regulon or latency-associated antigens, similar to previous observations [46].
We identified three antigenic islands within the MTB genome map as main determinants of immunodominance. Remarkably, the majority of the novel antigens identified are associated with (contained within or in close proximity to) these islands, which all contain Esx protein pairs and PE/PPE proteins, and are part of a putative secretory system. Our analysis demonstrated that these factors synergistically contribute to determining immunodominance and confirms the importance of PE/PPE and Esx proteins, but suggests that their immunodominance is perhaps mostly determined by their location within these antigenic islands.
Two main hypotheses can be put forth to explain the mechanism by which these features determine immunodominance. First, secreted proteins may act as decoys to divert the immune response from recognizing nonsecreted MTB proteins [13], thus favoring bacterial persistence. The second hypothesis envisions that antigenic islands are dominant because they are intrinsically immunogenic, and because they perform key biological functions necessary to maintain MTB persistence.
The decoy hypothesis can take two forms: either secreted proteins divert the immune system from the bacterium itself, or the decoy effect is achieved by inducing an immune response to decoy antigens that is functionally distinct from the response to non-decoy antigens. In the first case, we note that both secreted proteins and proteins involved in the secretion apparatus are equally recognized. Indeed, immune reactivity towards proteins involved in the secretion system apparatus has previously been described for T3SS and inflammasome activation by flagellin and the T3SS rod proteins [54,55]. Furthermore, we were unable to detect a functionally distinct immune response in terms of multifunctionality, memory phenotype and T cell subsets, independent of island vs. non-island localization and secretion status of the antigen recognized. Taken together, these observations argue against the decoy hypothesis. T cells that secrete multiple cytokines are a potential correlate of protection, but have also been implicated in pathology [31,32,33,34]. Whatever their role might be, the majority of epitope-specific CD4+ T cell responses were multifunctional, with no differences between antigens from islands vs. non-islands, or between the PE/PPE vs. cell wall and cell processes categories. In terms of T cell phenotypes and T cell subsets a similar picture emerged, with epitope-specific CD4+ T cells being predominantly CD45RA−CCR7+ central memory cells, in agreement with previous studies [26]. For some epitope-specific CD4+ T cells a large fraction were CD45RA+CCR7+, a phenotype traditionally regarded as naïve. Such T cells have previously been reported [56,57], and might reflect early differentiation into antigen-specific cells. Additional studies would be required to investigate this further.
The available data favor the second hypothesis, that the three antigenic islands are dominant because they perform key biological functions and are necessary to maintain MTB persistence. The most prevalently recognized island is Esx-3, which is controlled by the iron-dependent regulator IdeR and the zinc uptake regulator Zur [58,59], suggesting its involvement in fundamental biological processes such as metal ion homeostasis. In addition, Esx-3 is essential for in vitro growth, and is conserved in a wide range of mycobacterial species [20,60]. Furthermore, the Esx-3 system contributes to immune protection against MTB challenge in mice of the IKEPLUS strain [61] in an HLA class II-dependent fashion.
Genes from island 2 are, like island 1 (Esx-3), regulated by Zur [58], providing a possible functional link between them. While island 2 is not part of an Esx secretion system per se, it is believed to originate from a duplication of the Esx-3 system [19]. Esx-1 and Esx-3 also appear to be linked, since Rv3873 interacts with Rv0288 [62].
Secretion systems similar to the T7SS associated with two of the three antigenic islands are also found in other bacteria, such as Listeria monocytogenes, S. aureus, and Bacillus anthracis [63]. Secretion of the substrates from T7SS is not dependent on interaction with host cells, unlike other bacterial secretion systems such as T3SS and T4SS, which are switched on upon host-cell contact. This suggests that T7SS, while essential for pathogenicity, may fulfill more general physiological roles than strictly host-cell oriented functions.
This study was completed in a non-TB-endemic population. Ongoing studies include a larger study population from different ethnicities and geographic locations, as well as patients with different disease states and BCG vaccination status. This will provide answers for different HLA phenotypes, as well as whether patients from an endemic area or with different disease states show a different recognition pattern.
In conclusion, this study describes the immunological footprint of MTB CD4 T cell recognition to an unprecedented level of detail. The high throughput cellular screens utilized here to analyze the human immune response to MTB provide information on the specificity, frequency and class of memory T cells, as well as on the individual variability in magnitude and quality of the response. As a result, 34 novel antigens and three broadly immunodominant antigenic islands were defined. The study of the class of proteins recognized, together with the phenotype of responding T cells, disproves the notion that responses against secreted antigens are a decoy utilized to favor bacterial persistence, and rather suggests that these proteins, together with those that are part of their general secretion apparatus, are targeted by fully functional T cell responses. More broadly, this study provides proof of principle of how such high throughput techniques can be applied to other complex pathogen systems. In terms of practical applications, the novel T cell antigens identified could be of use for diagnostic or vaccine purposes. Indeed, the heterogeneity of responses demonstrated herein suggests that too narrow a focus in vaccine evaluations will not replicate natural immunity. Finally, the antigens and epitopes identified can also be used as tools for identifying biomarkers to provide correlates of risk for, or protection against, tuberculosis disease.

Ethics Statement
Research conducted for this study was performed in accordance with approvals from the Institutional Review Board at the La Jolla Institute for Allergy and Immunology. All participants provided written informed consent prior to participation in the study.

Study Subjects
Leukapheresis samples from 28 adults with LTBI and 28 control donors were obtained from the University of California, San Diego Antiviral Research Center clinic (age range 20-65 years). Subjects had a history of a positive tuberculin skin test (TST). LTBI was confirmed by a positive QuantiFERON-TB Gold In-Tube (Cellestis), as well as a physical exam and/or chest X-ray that was not consistent with active tuberculosis. None of the study subjects endorsed vaccination with BCG, or had laboratory evidence of HIV or Hepatitis B. The control donors had a negative TST, as well as a negative QuantiFERON-TB. Approval for all procedures was obtained from the Institutional Review Board (FWA#00000032) and informed consent was obtained from all donors.

Bioinformatic Analyses
Proteins from the 21 MTB genome projects available from the NCBI Protein database were downloaded into an in-house MySQL database. Of these, 5 were complete (CDC1551, F11, H37Ra, H37Rv, KZN 1435) and 16 were draft assemblies (Table S1). The protein sequences were parsed into all possible 15-mer peptides (n = 1,568,148), for each of which binding to 22 different HLA DR, DP and DQ class II alleles most commonly expressed in the general population (Table S2) was predicted using the IEDB HLA class II 'consensus' prediction method [64]. The sequences of the H37Rv strain were used as a reference sequence. For each H37Rv protein, alignments were made of all orthologs identified in other genomes, as determined by a BLAST search. Because of the overall high sequence conservation among the proteins from all the 21 genomes, 1,220,829 (91.4%) of 15-mers were completely conserved among all of the strains. For each protein, the best-predicted binders, as ranked by consensus percentile, were selected for synthesis. In order to ensure coverage of each of the proteins, the number of peptides selected per protein was no less than 2 and no more than 10, depending upon protein length (18,950 peptides). Any variants among the orthologs at the selected positions were also selected (1,660), for a total of 20,610 peptides.
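The peptide selection steps can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline used in the study: `percentile_of` stands in for a binding predictor that returns a consensus percentile (lower is better), the rule of one peptide per 50 residues is a hypothetical way to scale the quota of 2 to 10 peptides with protein length, and no handling of overlapping peptides or ortholog variants is attempted.

```python
def all_kmers(sequence, k=15):
    """All overlapping k-mer peptides of a protein sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def peptides_per_protein(length, residues_per_peptide=50, minimum=2, maximum=10):
    """Hypothetical length-based quota, clamped to the stated 2-10 range."""
    return max(minimum, min(maximum, length // residues_per_peptide))


def select_binders(sequence, percentile_of, k=15):
    """Best-ranked unique k-mers (lowest percentile first, ties alphabetical)."""
    candidates = sorted(set(all_kmers(sequence, k)),
                        key=lambda peptide: (percentile_of(peptide), peptide))
    return candidates[:peptides_per_protein(len(sequence))]


def toy_percentile(peptide):
    """Toy scorer, NOT a real predictor: favors hydrophobic residues."""
    return 100.0 - 10.0 * sum(peptide.count(residue) for residue in "FILVWY")


# Synthetic 120-residue sequence (illustrative only).
toy_protein = "ACDEFGHIKLMNPQRSTVWY" * 6
selected = select_binders(toy_protein, toy_percentile)
```

In practice the scorer would be replaced by real predictions for the chosen set of alleles, and the per-protein ranking would combine results across alleles to favor promiscuous binders.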

Peptides
Sets of 15-mer peptides synthesized by Mimotopes (Victoria, Australia) and/or A and A (San Diego) as crude material on a small (1 mg) scale were combined into pools of 20 peptides. Peptides utilized for tetramers were synthesized as purified material (>95% by reversed phase HPLC). The IEDB submission number for the peptides is 1000505.

Figure 4 (partial caption). Each dot represents one donor/epitope combination; median ± interquartile range is indicated. (B, D) The fraction of the total cytokine response against (B) cell wall and cell processes, (D) PE/PPE proteins, expressing all 3, 2 or 1 cytokine. (E) Heat-map of each of the seven possible combinations of IFN-γ, TNFα and IL-2 for each individual donor and epitope tested, grouped by protein category and island localization. Each column represents one donor. Colors indicate frequency of epitope-specific CD4 T cells; grey is considered negative for the indicated cytokine production. Multiple proteins indicate that the peptide sequence is homologous in these proteins. doi:10.1371/journal.ppat.1003130.g004

PBMC Isolation
PBMCs were obtained by density gradient centrifugation (Ficoll-Hypaque, Amersham Biosciences) from 100 ml of leukapheresis sample, according to manufacturer's instructions. Cells were suspended in fetal bovine serum (Gemini Bio-products) containing 10% dimethyl sulfoxide, and cryo-preserved in liquid nitrogen.

Ex Vivo IFN-γ ELISPOT Assay
PBMCs incubated at a density of 2×10⁵ cells/well were stimulated with peptide pools (5 μg/ml) or individual peptides (10 μg/ml), PHA (10 μg/ml) or medium containing 0.25% DMSO (corresponding to percent DMSO in the pools/peptides, as a control) in 96-well plates (Immobilon-P; Millipore) coated with 10 μg/ml anti-IFN-γ (AN18; Mabtech). Each peptide or pool was tested in triplicate. After 20 h incubation at 37°C, wells were washed with PBS/0.05% Tween 20 and incubated with 2 μg/ml biotinylated anti-IFN-γ (R4-6A2; Mabtech) for 2 h. The spots were developed using Vectastain ABC peroxidase (Vector Laboratories) and 3-amino-9-ethylcarbazole (Sigma-Aldrich) and counted by computer-assisted image analysis (KS-ELISPOT reader, Zeiss). Responses were considered positive if the net spot-forming cells (SFC) per 10⁶ were ≥20, the stimulation index ≥2, and p<0.05 (Student's t-test, mean of triplicate values of the response against relevant pools or peptides vs. the DMSO control). For experiments utilizing depletion of CD4+ or CD8+ T cells, these cells were isolated by positive selection (Miltenyi Biotec) and effluent cells (depleted cells) were used for experiments.
The response frequency was calculated by dividing the no. of donors responding by the no. of donors tested. The magnitude of response (total SFC) was calculated by summation of SFC from responding donors.
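The three positivity criteria and the two summary measures can be sketched as below. Several points are assumptions not stated in the text: the stimulation index is taken as the ratio of mean stimulated to mean control spot counts, the DMSO control is assumed to be in triplicate, and the t-test is taken as two-sided with pooled variance, so that the 5% critical value for 4 degrees of freedom (2.776) can be hard-coded instead of computing a p-value. All function names are hypothetical.

```python
from statistics import mean, variance

# Two-sided 5% critical value of Student's t for 4 degrees of freedom
# (two triplicates); sidedness is an assumption.
T_CRIT_DF4 = 2.776


def is_positive(stim_wells, ctrl_wells, cells_per_well=2e5):
    """Apply the three criteria to triplicate spot counts per well."""
    if len(stim_wells) != 3 or len(ctrl_wells) != 3:
        raise ValueError("this sketch hard-codes the critical value for triplicates")
    stim_mean, ctrl_mean = mean(stim_wells), mean(ctrl_wells)
    diff = stim_mean - ctrl_mean
    net_sfc = diff * 1e6 / cells_per_well  # net SFC per 10^6 cells
    index = float("inf") if ctrl_mean == 0 else stim_mean / ctrl_mean
    pooled_var = (variance(stim_wells) + variance(ctrl_wells)) / 2
    std_err = (pooled_var * 2 / 3) ** 0.5  # standard error of the difference, n = 3 per group
    significant = diff > 0 if std_err == 0 else abs(diff) / std_err > T_CRIT_DF4
    return net_sfc >= 20 and index >= 2 and significant


def summarize(net_sfc_by_donor, positive_by_donor):
    """Response frequency and magnitude (total SFC from responding donors)."""
    responders = [donor for donor, positive in positive_by_donor.items() if positive]
    frequency = len(responders) / len(positive_by_donor)
    magnitude = sum(net_sfc_by_donor[donor] for donor in responders)
    return frequency, magnitude
```

For example, wells with 30, 34 and 32 spots against a control of 2, 3 and 4 spots pass all three criteria, while wells with 10, 2 and 18 spots fail the significance criterion despite a sufficient net count and stimulation index.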

Intracellular Cytokine Staining
PBMCs were cultured in the presence of 5 μg/ml MTB peptide and 4 μl/ml Golgiplug (BD Biosciences) in complete RPMI medium at 37°C in 5% CO₂. Unstimulated PBMCs were used to assess nonspecific/background cytokine production. After 6 h, cells were harvested and stained for cell surface antigens CD4 (anti-CD4-PerCPCy5.5, OKT-4) and CD3 (anti-CD3-eFluor450, UCHT1). After washing, cells were fixed and permeabilized using a Cytofix/Cytoperm kit (BD Biosciences) and then stained for IFN-γ (anti-IFN-γ-APC, 4S.B3), TNFα (anti-TNFα-FITC, MAb11) and IL-2 (anti-IL-2-PE, MQ1-17H12). All antibodies were from eBioscience. Samples were acquired on a BD LSR II flow cytometer. The frequency of CD4+ T cells responding to each MTB peptide was quantified using FlowJo software (Tree Star) by determining the total number of gated CD4+ cytokine+ cells and subtracting background values (as determined from the medium-alone control). A cut-off of 2 times the background was used. Combinations of cytokine-producing cells were determined using Boolean gating in FlowJo software.
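The background handling and Boolean combinations can be illustrated with a small sketch. It assumes the cut-off means that a stimulated frequency below twice the background is treated as negative, and that frequencies are already expressed per gated CD4+ T cell; the function names are hypothetical and the sketch does not reproduce FlowJo's gating.

```python
from itertools import product

CYTOKINES = ("IFN-gamma", "TNF-alpha", "IL-2")


def positive_combinations():
    """The 7 Boolean gates in which at least one cytokine is positive."""
    return [combo for combo in product((True, False), repeat=len(CYTOKINES))
            if any(combo)]


def net_frequency(stimulated, background, fold_cutoff=2.0):
    """Background-subtracted frequency; 0 if below the fold cut-off."""
    if stimulated < fold_cutoff * background:
        return 0.0
    return stimulated - background


def fraction_by_cytokine_count(net_by_combo):
    """Share of the cytokine response from cells making 3, 2 or 1 cytokines."""
    fractions = {3: 0.0, 2: 0.0, 1: 0.0}
    total = sum(net_by_combo.values())
    if total == 0:
        return fractions
    for combo, frequency in net_by_combo.items():
        fractions[sum(combo)] += frequency / total
    return fractions
```

Each combination is a tuple of three flags in the order given by `CYTOKINES`, so `(True, True, False)` denotes IFN-γ+TNFα+IL-2− cells.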

Tetramer Staining
HLA class II tetramers conjugated using PE-labeled streptavidin were provided by the Tetramer Core Laboratory at Benaroya Research Institute. CD4 T cells were purified using the Miltenyi T cell isolation kit II according to the manufacturer's instructions. Purified cells (~10×10⁶) were incubated in 0.5 ml PBS containing 0.5% BSA and 2 mM EDTA pH 8.0 (MACS buffer) with a 1:50 dilution of class II tetramer for 2 h at room temperature. Cells were then stained for cell surface antigens using anti-CD4-FITC (OKT-4), anti-CD3-Alexa Fluor 700 (OKT3), anti-CCR7-PerCP-eFluor710 (3D12), anti-CD45RA-eFluor450 (HI100) (all from eBioscience) and Live/Dead Yellow (Life Technologies) to exclude dead cells. Tetramer-specific T cell populations were enriched by incubating cells with 50 µl of anti-PE microbeads (Miltenyi Biotec) for 20 min at 4°C. After washing, cells were resuspended in 5 ml MACS buffer and passed through a magnetized LS column (Miltenyi Biotec). The column was washed three times with 3 ml of MACS buffer and, after removal from the magnetic field, cells were collected with 5 ml of MACS buffer. Samples were acquired on a BD LSR II flow cytometer and analyzed using FlowJo software.
Figure 7 (panels C and D). (C) T cell libraries were set up from the sorted subsets by polyclonal stimulation and expansion for 3-4 weeks. Libraries were analyzed by stimulation with autologous monocytes with or without MTB whole cell lysate, and the proliferative response was measured by ³H-thymidine incorporation. Shown is the estimated frequency of MTB-specific T cells per 10⁶ CD4 memory T cells for LTBI donors. (D) Distribution of MTB-specific T cells in the three memory T cell subsets. Data represent median ± interquartile range from four donors. Mann-Whitney test; *, p<0.05. doi:10.1371/journal.ppat.1003130.g007

Antigen and IEDB Analysis
The identified epitopes were compared for sequence homology, and the weakest epitopes sharing >90% homology were eliminated. To identify antigens, the epitopes were mapped to the H37Rv genome, allowing 1 substitution per peptide. IEDB queries used criteria matching the experimental study (organism: MTB; host organism: human; latent disease; ex vivo; HLA class II). Epitopes were then mapped as above. To capture the most frequently recognized antigens, the response frequency score, (no. of donors responding − √(no. of donors responding))/no. of donors tested, was used [65].
Figure S3. Gating strategy for multifunctionality analysis. Cells were first gated on forward vs. side scatter, then on CD3 vs. CD4, and finally on each cytokine (IFN-γ, TNFα, IL-2). Gates for each cytokine were based on the negative control and were used for subsequent Boolean gating. (EPS)
Figure 8. The T cell library approach complements the ex vivo IFN-γ ELISPOT assay. CCR6⁺CXCR3⁺ T cell libraries were set up for 4 representative donors. The sorted T cells were polyclonally expanded and analyzed for the presence of antigen-specific T cells by stimulation with peptide pools and measurement of ³H-thymidine incorporation. Shown is proliferation (cpm) of individual cultures from 4 different donors. Dotted lines represent the cut-off value. Response to antigens within genomic islands is shown in yellow, orange, and red; response to antigens outside antigenic islands is shown in white. Antigenic islands are indicated by capped lines. doi:10.1371/journal.ppat.1003130.g008
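Two steps of the antigen analysis lend themselves to a short sketch: the response frequency score and the mapping of epitopes to a protein sequence with one substitution allowed. The code reads the score as (r − √r)/t, with r donors responding and t donors tested, which is an interpretation of the formula given in the text. The mapping is a plain ungapped sliding-window comparison and is not necessarily the procedure the authors used. Function names and the toy sequences in the usage note are hypothetical.

```python
from math import sqrt

def rf_score(responded, tested):
    """Response frequency score, read here as (r - sqrt(r)) / t."""
    if tested <= 0:
        raise ValueError("tested must be a positive number of donors")
    return (responded - sqrt(responded)) / tested

def map_peptide(peptide, protein, max_substitutions=1):
    """Return 0-based start positions where the peptide matches the protein.

    A match is an ungapped alignment with at most max_substitutions
    mismatched residues (1 substitution per peptide, as in the text).
    """
    k = len(peptide)
    hits = []
    for start in range(len(protein) - k + 1):
        window = protein[start:start + k]
        mismatches = sum(a != b for a, b in zip(peptide, window))
        if mismatches <= max_substitutions:
            hits.append(start)
    return hits
```

Under this reading, 9 responders out of 18 donors score (9 − 3)/18 ≈ 0.33, while a single responder scores 0, so epitopes seen in only one donor are down-weighted. For the toy strings `map_peptide("ACDF", "GGACDEGG")`, the peptide maps at position 2 with one substitution.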