Human and Murine Clonal CD8+ T Cell Expansions Arise during Tuberculosis Because of TCR Selection

The immune system can recognize virtually any antigen, yet T cell responses against several pathogens, including Mycobacterium tuberculosis, are restricted to a limited number of immunodominant epitopes. The host factors that affect immunodominance are incompletely understood. Whether immunodominant epitopes elicit protective CD8+ T cell responses or instead act as decoys to subvert immunity and allow pathogens to establish chronic infection is unknown. Here we show that anatomically distinct human granulomas contain clonally expanded CD8+ T cells with overlapping T cell receptor (TCR) repertoires. Similarly, the murine CD8+ T cell response against M. tuberculosis is dominated by TB10.44-11-specific T cells with extreme TCRβ bias. Using a retrogenic model of TB10.44-11-specific CD8+ T cells, we show that TCR dominance can arise because of competition between clonotypes driven by differences in affinity. Finally, we demonstrate that TB10.4-specific CD8+ T cells mediate protection against tuberculosis, which requires interferon-γ production and TAP1-dependent antigen presentation in vivo. Our study of how immunodominance, biased TCR repertoires, and protection are inter-related, provides a new way to measure the quality of T cell immunity, which if applied to vaccine evaluation, could enhance our understanding of how to elicit protective T cell immunity.


Introduction
The adaptive immune system can generate 10 14 unique TCRs, which provides the capacity to recognize an enormous universe of distinct antigens [1][2][3][4]. Despite our understanding of the genetic and structural basis for TCR diversity and antigen recognition, it remains challenging to predict the magnitude and diversity of T cell responses. The size of the T cell response to model antigens generally correlates with the abundance of antigen-specific T cells in the naïve repertoire (e.g., precursor frequency) [5][6][7]. Paradoxically, pathogen-specific T cell responses are often focused on a small number of the available antigenic epitopes and use a narrow TCR repertoire, a phenomenon termed "immunodominance". Pathogens have numerous strategies to evade host immunity, hindering our ability to determine a priori how T cell diversity relates to antimicrobial immunity. Thus, the relationship between immunodominance and host defense during infection is incompletely understood. For pathogens that rapidly mutate, such as human immunodeficiency virus 1 (HIV-1), a diverse T cell response could benefit the host by efficiently detecting escape mutants, while a biased response could be detrimental. For slowly replicating pathogens that encode numerous antigens, the relation between diversity and protection is less clear.
Here, we investigated the origin and protective capacity of immunodominant T cell responses following M. tuberculosis infection in both humans and mice. Extreme TCR bias, the presence of public TCRs, and strong selection of a complementarity determining region 3 (CDR3) β motif were shown by TCR sequencing of sorted tetramer + cells from the lungs of infected mice. We discovered that TCR bias emerges soon after T cell priming in the lymph node (LN) and becomes more extreme during chronic infection. Cloning TB10.4 4-11 -specific TCRs allowed us to develop retrogenic (Rg) mice to study immunodominant TCRs in vivo. Competition studies using TB10.4 [4][5][6][7][8][9][10][11] -specific Rg CD8 + T cells showed that small differences in T cell affinity lead to clonal dominance in vivo. Finally, TB10.4 4-11 -specific CD8 + T cells transferred IFNγ-dependent protection against M. tuberculosis infection that required TAP1-dependent antigen presentation.

Human CD8 + T cell expansions in lung granulomas
Deep sequencing of the TCRβ repertoire was performed on CD8 + T cells purified from 11 lung granulomas and one LN obtained from 5 patients undergoing lung resection for medically non-responsive tuberculosis (see Table 1, S1 Data). The CD8 + T cells in the granulomas were more clonal than the peripheral blood TCRβ repertoire of healthy individuals (Fig 1A and S2A Data). TCRβ expansions were detected in all lung samples ( Fig 1B and S3 Data). Abundant clonotypes were detected in anatomically distinct granulomas from the same patient ( Fig 1B). There was extensive overlap of the TCR sequences between granulomas from the same patient, although most sequences detected in each granuloma were unique to that lesion ( Fig 1C). However, it was the shared sequences that were most abundant. For example, of 217 TCRβs common to all three granulomas from patient #23, 143 were the most abundant in each lesion (the top 10 are shown in Fig 1C and S2B Data). Thus, the same T cells clonotypes were abundant in different lesions from the same subject despite their varied pathology (see Table 1). Some TCRβs appeared to have undergone antigen-selection, as sequences with identical CDR3β amino acid sequences were encoded by distinct recombination events. For example, five distinct DNA recombination events led to the CDR3β sequence 'CASSVDGGTEAFF' and two occurred at a high frequency (1.16% and 0.9%, Fig 1D). Thus, CD8 + T cells in human granulomas undergo clonal expansion, and while there is considerable heterogeneity between distinct lesions, the most abundant clonotypes were shared between granulomas. As the antigenspecificity and the HLA restriction of these T cells were unknown, the inferences that we can make are limited. Furthermore, co-infection with HIV, present in four of the five subjects, could potentially confound the analysis, since HIV itself can alter the TCR repertoire of CD8 + T cells. Indeed, we cannot be certain whether these TCRs are specific for Mtb. Therefore, we next turned to an animal model of tuberculosis to address the origin and consequences of clonal CD8 + T cell expansions.
To determine whether TCR bias was established during priming in the LN or after T cell trafficking to the lungs, we used monoclonal antibodies (mAbs) specific for a subset of the known Vβ families (see Table 2). All 5 Vβ families were expressed by CD8 + T cells obtained from LNs of mice 21 days after infection, the majority of which are not specific for Mtb ( Fig  2D). The distribution of these 5 Vβ families among tetramer+ cells in the LN was similar to the bulk CD8 + T cell population (Fig 2D). Starting on day 21 in the lung, and more dramatically by day 28 in the LN and lung, significant Vβ family bias was detected among TB10.4 4-11 -specific CD8 + T cells. For example, Vβ4 (mouse 'A' and 'B'), Vβ7 (mouse 'B'), or Vβ10 (mouse 'C') dominated the TB10.4 4-11 -specific CD8 + T cell response by day 21 post infection ( Fig 2E). The dominant Vβ family was frequently expanded in both the LN and lung, suggesting that TCR bias developed early after T cell priming. Overtime, biases became more dramatic, suggesting ongoing preferential expansion of certain clones. The TB10.4 4-11 -specific CD8 + T cells in one infected mouse (mouse 'H') were >80% Vβ7 + after 25 weeks of infection ( Fig 2E and 2'H'). Thus, after an initial priming of a broad repertoire in response to antigen, TCR bias among TB10.4 4-11 -specific CD8 + T cells develops in the draining LN early after T cell priming and becomes established during the chronic phase of infection. The highly oligoclonal response and the dominance of CDR3β amino acid sequences suggest that the immunodominant T cell clonotypes undergo selection during infection.

The paradox of high precursor frequency and extreme TCR bias
Two general mechanisms can explain how TCR bias develops during the TB10.4 4-11 response: 1) TB10.4 4-11 -specific CD8 + T cells are drawn from a limited pool of naïve precursors; or 2) the naïve TB10.4 4-11 -specific CD8 + T cells pool is diverse but competition leads to selection and bias. To discriminate between these possibilities, we measured the precursor frequency of TB10.4 4-11 -specific CD8 + T cells in uninfected mice.
Using sequential tetramer staining and enrichment of antigen-specific T cells from the naïve repertoire [5,22], we determined that approximately 1:13,000 CD8 + T cells were specific for the TB10.4 4-11 epitope in the C57BL/6 naïve repertoire, or about 857 cells per mouse ( Fig  3A and 3B). This precursor frequency is among the highest recorded for antigen-specific CD8 + T cells in the mouse [7].
The high frequency of TB10.4 4-11 -specific CD8 + T cell precursors in C57BL/6 mice contrasts with the limited number of unique T cell clonotypes that comprise the post-infection Clonality of CD8+ TCR sequences obtained from infected lung granulomas (n = 6) compared to splenocytes from uninfected mice (n = 9); p = 0.0004 by Mann-Whitney. (b,c) Frequency of TCR Vβ families (b) and CDR3βs (c) of TB10.4 4-11 -specific CD8 + T cells from the lungs of C57BL/6J mice infected with M. tuberculosis for 9 weeks. Data were obtained by Next-generation sequencing of tetramer-purified T cells from 6 individual mice. (d) Frequency of TCR Vβ4 (TRBV2), Vβ5 (TRBV12), Vβ7 (TRBV29), Vβ10 (TRBV4) and Vβ11 (TRBV16) families of TB10.4 4-11 -tetramer + CD8 + T (red) or TB10.4 4-11 -tetramer -CD8 + T (blue) cells from the pulmonary LN of C57BL/6J mice infected with M. tuberculosis at 21 days post infection. (e) Frequency of TCR Vβ4 (TRBV2), Vβ7 (TRBV29) and Vβ10 (TRBV4) families of CD8 + T cells from the pulmonary LN or TB10.4 4-11 -specific CD8 + T cell repertoire. To estimate the frequency of TB10.4 4-11 -specific TCRβ clones in the naïve repertoire, TB10.4 4-11 -specific sequences from infected mice were used to interrogate data sets from three uninfected C57BL/6 mice (referred to below as spleen A, B, or C) each containing more than a million TCRβ sequences (Provided by David Hamm, Adaptive Biotechnologies reference data "Mus musculus TCR Beta from spleen", 2014). By performing a pairwise comparison, we identified TB10.4 4-11 -specific TCRβ DNA sequences in the naïve T cell repertoire of uninfected mice (Fig 3C). For example, an abundant TCRβ from the lung of infected mouse #2 (17.8%) was present in spleen B (0.0026%); however, none of the abundant (>0.1%) TCRβs from the lung of infected mouse #2 were present in spleen A or C ( Fig 3C). When the translated CDR3β amino acid sequences were used to query the naïve repertoire, the number of matches increased from 40 to 345; and several of the most highly represented CDR3βs were identified ( Fig 3D). This raises the possibility that there are multiple DNA recombination events that can generate CDR3β regions capable of recognizing TB10.4 [4][5][6][7][8][9][10][11] in the naïve repertoire. Importantly, the frequencies of TB10.4 4-11 -associated TCRβs in the naïve repertoire were similar to the median frequency determined for the entire naïve T cell population ( Fig 3E). For example, the median frequency of the clonotypes detected in naïve spleen #B was 0.0011%. The frequencies of "CASSQDRENSDYTF" and "CASSRDRENS-DYTF" in the naïve spleen #B repertoire were 0.00425% and 0.00106%, respectively. Thus, the low frequency of these two CDR3β regions in the naïve TCR repertoire does not predict their massive clonal expansion after infection.
To make this analysis more quantitative, we determined the frequencies of all TB10.4 4-11 -associated CDR3β sequences that could be identified in any of the three uninfected spleens (A, B, or C). Those sequences ("TB10-associated", Fig 3E) had a slightly higher median frequency than the median of the entire splenic TCR repertoire ("uninfected spleen) (0.001845% vs. 0.001142%, P < 0.0001). We next focused on the subset of "TB10-associated" CDR3βs that were present in at least 2 of the 6 infected lungs, and defined these as "shared sequences" ( Fig  3E). The median frequency of the "shared sequences" was 0.003614%, which was increased compared to "uninfected spleen" (P < 0.0001); this is in agreement with previous observations describing a higher frequency of public TCRs in the naïve repertoire [2]. Finally, the CDR3βs that we defined as "expanded sequences" (those with a frequency >1% among CD8+ T cells in the Mtb-infected lung and that account for 75% of the total TB10.4 4-11 -associated sequences)  consists of all sequences from uninfected C57BL/6 mice. 'TB10-specific' are the TB10-specific sequences found in the uninfected splenic repertoire; 'shared sequences' are TB10-specific sequences detected in more than 2 infected mice; 'expanded sequences' are all "shared sequences" with frequency >1% in at least one infected mouse. Box represents minimum and maximum, and bar and dot is the median frequency. NS, not significant; ****, p < 0.0001, by one-way ANOVA with Kruskal-Wallis post test. had a frequency similar to "uninfected spleen" (0.001559%, not significant). Thus, TB10.4 4-11associated sequences do not appear to be over-represented in the naïve repertoire compared to other sequences. Interestingly, the individual frequency of highly represented sequences within an individual naive mouse varied by more than 10-fold, raising the possibility that the precursor frequency of individual CDR3β clonotypes could affect their representation in the post-immune repertoire (e.g., after infection). Interestingly, "CASSRDRENSDYTF," which was one of the more successful CDR3βs as it was commonly detected during Mtb infection, was not present at a greater frequency than other TCRs in the naïve repertoire ( Fig 3E).
Although T cells specific for the immunodominant epitope TB10.4 4-11 have a high precursor frequency in the naïve repertoire of C57BL/6 mice (Fig 3B), extreme TCR bias emerges during the CD8 + T cell response to TB10.4 [4][5][6][7][8][9][10][11] (Fig 2b). While heterogeneity in the frequency of TB10.4 4-11 -specific clonotypes found in the naïve repertoire could account for some of the bias (Fig 3E), the precursor frequency of individual clonotypes in the naïve repertoire does not appear to be the dominant factor influencing representation in the post-infection repertoire. We next evaluated the possibility that clonotypic dominance occurs because of clonal selection during infection.

Evidence for antigen-driven peripheral T cell selection
The CDR3 length of the unique TCRβ sequences from the infected and uninfected mice was similar with a median CDR3β length of 36-39 bases (Fig 4A). In contrast, 56% of the highly represented TCRs (e.g., frequency >1%) had a CDR3β length of 42. Analysis of the amino acid sequences with this CDR3β length (accounting for 36% of all TB10.4 4-11 -specific sequences) revealed a strong consensus motif-"CASSxDReNsdytF" (Fig 4B). This motif was present in clonal expansions if different mice, and several distinct DNA recombination events generate the conserved aspartic acid (Asp, "D") at CDR3β residue 6 (e.g., V-D recombination, germline encoded, N region addition; Fig 4C). Thus, we conclude that this residue is under strong selection.
In addition to the selection of the conserved Asp, the over-representation of certain CDR3β regions in their entirety appeared to be the consequence of selection. In Mouse #1, 63% of the CDR3β sequences encoded the amino acid sequence "CASSLDRENSDYTF." Remarkably, three distinct VDJ recombination events generated this CDR3β in Mouse #1 ( Fig 4C). These data strongly suggest that these TCRs were selected. Not only were the TCRβ expansions extremely biased within each individual mouse, but some abundant sequences shared identical CDR3βs between mice, constituting so called "public" TCRs [2,23] (Fig 4C).
The "DREN" motif was also detected among TCRs from human lung granulomas in three of the five patients, and was expanded in patient #24 ( Fig 4D). Two different patients share the motif "CASSxDRENTEAFF," which is similar to the murine motif ( Fig 4D). While the "DREN" motif was detected in the peripheral blood TCRβ repertoire of normal donors, those clonotypes had a 10-fold lower average frequency ( Fig 4D and S4 Data). Some clonotypes found in the Mtb granulomas were 1000-fold enriched compared to the average "DREN" frequency in peripheral blood of normal donors ( Fig 4D). These data are consistent with selection since distinct recombination events generate different TCRs with the "DREN" motif ( Fig 4D). Finally, there is evidence for public TCRs as patients #23 and #26 have clones with the identical CDR3β sequence "CASSSDRENTEAFF".
The extreme TCR bias ( Fig 2C) and the identification of multiple VDJ recombination events encoding identical CDR3βs indicate that dominant T cell clonotypes undergo selective expansion during infection. To quantify selection during the response to M. tuberculosis, we calculated the ratio of unique amino acid sequences to unique nucleic acid sequences. For example, the same immunodominant CDR3β was encoded by three unique DNA sequences in mouse #1. Shared sequences (>2 mice) and abundant sequences (>1%) had lower median values (0.158 and 0.154, respectively) than the bulk population of TB10.4 4-11 -specific CD8 + T cells (0.868) or T cells from uninfected mice (0.845), indicating strong selection at the level of the CDR3β amino acid sequence (p< 0.0001; Fig 4E). Thus, our TCR analysis indicates that clonotypic dominance occurs because of strong selection of certain CDR3β amino acid sequences.

Retrogenic mouse model for TB10.4-specific CD8+ T cells
To study the biology of TB10.4 4-11 -specific CD8 + T cells and to delineate the mechanism(s) responsible for the development of TCR bias during M. tuberculosis infection, we developed retrogenic mice. To ensure that we could track the recombinant TCR-expressing CD8 + T cells, we used Vα2var mice, which have a single Vα gene (Vα2) but can still generate diversity through limited VαJα recombination [24]. Vα2var mice resisted tuberculosis and generated a dominant TB10.4 4-11 -specific CD8 + T cell response, although it was more variable than in C57BL/6 mice (see S5 Data).
Single cell sorting of H2-K b /TB10.4 4-11 tetramer + cells from M. tuberculosis-infected Vα2var mice was followed by single cell PCR to determine the TCRα and TCRβ sequences. Three mice were analyzed and the TB10.4 4-11 -specific CD8 + T cell response in each mouse was dominated by one or two TCR clonotypes ( Fig 5A). Each identified dominant TCRβ paired with a single TCRα chain, supporting that these were true clonal expansions. Despite the constrained Vα2, the VαJα recombination site was remarkably diverse. In contrast, there were structural parallels between the CDR3β sequences from Vα2var ( Fig 5B) and C57BL/6 mice (Figs 2 and 3). While the dominant TCRβs from the Vα2var mice used distinct Vβs and Jβs, there was enrichment of arginine ("R") and aspartic acid ("D") in the CDR3β (Fig 5B and 5C). To determine whether the enrichment of "R" or "D" was significant, we assessed their occurrence in the normal TCRβ repertoire. We queried the splenic TCRβ repertoire from three C57BL/6 mice representing over 1.1 million reads and~53,000 unique sequences each (Fig  2A). The average frequency of "R", "D", or "RD" at CDR3β position 6, 7, or 6-7, was 10.1%, 7.5%, and 2.5% respectively, indicating "R", "D", and "RD" were significantly enriched among the clonally expanded TB10.4 4-11 -specific CD8 + T cells (P < 0.0001; see S6 Data).
doi:10.1371/journal.ppat.1004849.g004 Vβ11 staining in CD8 + T cells (right panel). Red, CD4 + T cells; blue, are colored in CD8 + GFP -T cells; green, CD8 + GFP + T cells. The numbers represent: the frequency of CD4 + vs. CD8 + T cells among gated CD3 + T cells (left); the frequency of CD3 + CD8 + cells that express GFP (middle); and the frequency of CD3 + CD8 + GFP + cells that are Vα2 + Vβ11 + (right). (e) Tetramer staining of CD8 + GFP -T cells (dashed line) and CD8 + GFP + T cells (filled green histogram). The numbers represent the frequency of CD3 + CD8 + GFP + cells that are stained by the tetramer. (f) IFNγ production by lung T cells from TCR3 retrogenic mice infected Priming and acquisition of effector function by naïve TB10.44-11-specific CD8+ T cells Naïve (CD44 lo CD62L hi ) Rg3 CD8 + T cells were purified and transferred into congenically marked recipient mice infected with M. tuberculosis. Immediately after transfer (d6 or d7 after infection) very few Rg CD8 + T cells were detected (~100-200 cells per lung) and they were mostly naïve (Fig 6A). Rg CD8 + T cells began to acquire an activated phenotype (CD44 hi CD62L lo ) starting on d11 in the LN. Associated with their activation, the numbers of Rg3 CD8 + T cells in the LN dramatically increased during days 11-13 post-infection ( Fig 6A). Subsequently, these Rg CD8 + T cells began to accumulate in the lung by day 13-15, and the Rg3 CD8 + T cell numbers continued to increase through day 18 ( Fig 6A). The increase in cell numbers in the LN and lung correlated with proliferation, occurring first in the LN, then in the lung, and finally in the spleen ( Fig 6B). Thus, priming of Rg3 CD8 + T cells tracks the kinetics that has been established for the IFNγ-response in intact mice and activation of transferred transgenic CD4 + T cells [25][26][27][28]. Following priming, the Rg3 CD8 + T cells rapidly acquired the ability to produce IFNγ in both the LN and lung (Fig 6C). Thus, Rg3 CD8 + T cells are primed in the LN and are subsequently recruited to the lung, where they express a variety of effector functions including the production of IFNγ, TNF, and granzyme B (see S7 Data), similar to endogenous TB10.4 4-11 -specific CD8 + T cells.

TAP1 and IFNγ are required for protection mediated by CD8+ T cells
It is unknown whether the immunodominant CD8 + T cell response to TB10.4 is protective. Therefore, we activated Rg3 CD8 + T cells in vitro with TB10.4 4-11 peptide, IL-2 and IL-12 and after 60-72 hours, transferred them into sublethally irradiated mice and infected them with M. tuberculosis. We compared the protective capacity of Rg3 CD8 + T cells with ovalbumin-specific CD8 + T cells (e.g., OT-I cells).
When a large number of activated cells (e.g., 10 6 ) were transferred, both Rg3 and OT-I cells transferred considerable protection (Fig 7A). The ability of large numbers of OT-I cells to transfer protection has not been directly investigated, but may be due to their highly activated state and could involve the production of IFNγ, has shown previously for ovalbumin-specific CD4+ T cells [29]. As the number of transferred cells was titrated down, protection mediated by OT-I cells diminished (Fig 7A). In contrast, Rg3 cells continued to mediate significant protection even when as few as 10,000 cells were transferred (Fig 7A). Transfer of Rg4 T cells also conferred protection against M. tuberculosis challenge (see S8A Data).
To demonstrate that protection required TCR dependent recognition of antigen, we transferred activated Rg3 cells (10 6 cells/mouse) into sublethally irradiated WT or TAP1-/-recipient mice. Rg3 effector CD8 + T cells were able to transfer protection to WT but not TAP1-/-recipients ( Fig 7B). As in vitro activation bypasses the need for in vivo priming, these results show that protection mediated by Rg3 CD8 + T cells requires recognition of antigens processed by the TAP-dependent class I MHC antigen-processing pathway.

Differences in TCR avidity can determine clonotypic dominance during infection
Two of the cloned TCRs (Rg3 and Rg4) differed in their binding to K b /TB10.4 4-11 tetramers, revealing a difference in avidity (Fig 8A). No difference in the level of TCR level was detected (see S8B Data). Importantly, when transferred separately into intact mice, both Rg3 and Rg4  were primed, underwent expansion and trafficked to the lung with similar kinetics, and mediated protection (Figs 6 and 7 and see S8 Data). To investigate whether TCR avidity could affect immunodominance, we co-transferred naïve Rg cells expressing either TCR3 or TCR4 at a 1:1 ratio into congenic recipients and analyzed their relative abundance following infection. Following co-transfer into M. tuberculosis infected mice, Rg3 and Rg4 T cells initially maintained a 1:1 ratio and importantly, both Rg3 and Rg4 T cells were primed in the LN and increased in number (Fig 8B and 8C). However, by day 11 post infection, Rg4 T cells began to outnumber Rg3 T cells, and by day 14, Rg4 T cells outnumbered Rg3 T cells by a ratio of 10:1 in the LN (Fig 8B and 8C). In the lung, there was no change in the ratio or cell count until day 14, at which point Rg4 CD8 + T cells accounted for >90% of the Rg CD8 + T cells (Fig 8B). While the absolute number of both Rg3 and Rg4 CD8 + T cells increased in the lung by day 20, Rg4 CD8 + T cells outnumber Rg3 CD8 + T cells by 100:1 (Fig 8C). Although both Rg3 and Rg4 were able to expand and confer protection when transferred separately, Rg4 T cells dominated when in competition with Rg3 T cells. When co-transferred, both Rg3 and Rg4 CD8 + T cells were primed in the LN and proliferated; however, small differences in their affinity led to the establishment of clonotypic dominance by Rg4 CD8 + T cells during the T cell response to M. tuberculosis infection.

Discussion
We report that extreme TCR bias develops during the polyclonal CD8 + T cell response to a single immunodominant epitope during tuberculosis in humans and in mice. In mice, preferential Vβ use by TB10.4 4-11 -specific CD8 + T cells is detected in the LN within 3 weeks of infection, indicating that bias develops soon after T cell priming. With time, TCR clonality becomes more extreme. Why do a few clonotypes dominate the TB10.4 4-11 -specific CD8 + T cells during M. tuberculosis infection? TCR diversity arises by three principal mechanisms: 1) V, D, and J segments generate combinatorial diversity; 2) imprecise recombination and insertion of non-templated 'N' sequences at the VβD, DJβ and VαJα junctions; and 3) random assortment between TCRα and TCRβ chains [30,31]. By these mechanisms, people have the potential to generate >10 14 unique TCRαβ receptors [1,32]. Thymic selection constrains the repertoire by ensuring that T cells that are unable to recognize MHC and the T cells that recognize self-antigens with high affinity are deleted. As a result, people have~10 10 T cells and each clonotype is represented by 10-500 T cells; therefore, the number of unique TCRs in any individual (~2.5 x 10 7 ) is far fewer than the number of potential TCRs [2,3]. Although sharing of TCRs between people is improbable, such 'public' TCRs are detected and may have special significance [23].
If the number of antigen-specific T cells in the naïve repertoire is limiting, clonotypic dominance could arise by a "founder" effect in which few T cells are primed and expand, leading to T cell populations of restricted diversity. A founder effect is unlikely to explain the TCR bias among TB10.4 4-11 -specific CD8 + T cells because we detect~900 naïve T cells in each C57BL/6 mouse, a precursor frequency that is among the highest recorded for antigen-specific CD8 + T cells in the mouse [7]. TB10.4 4-11-tetramer + CD8 + T cells from the LNs of infected mice use many Vβ families, which also argues against a founder effect. Another possibility is that the frequency of each clonotype in the naïve repertoire is skewed and results in TCR bias after infection. We observe considerable heterogeneity in the relative abundance of naïve precursors with the potential to recognize the TB10.4 [4][5][6][7][8][9][10][11] . However, skewing in the naïve repertoire cannot solely explain the selection that we observe for TB10.4 4-11 -specific CD8 + T cells following infection.
After infection, TB10.4 4-11 -specific CD8 + T cells express a limited number of CDR3β sequences. Given the high precursor frequency, we predicted a priori that the response would be dominated by private clonotypes, i.e., TCRs unique to each individual. Instead, common CDR3β motifs are generated by independent VDJ recombination events, within the same mouse and among different individuals. The most frequently used and shared TCRβs are under selective pressure, determined using a modification of Warren's method [33].
Developing retrogenic mice that over-produce CD8 + T cells specific for TB10.4 [4][5][6][7][8][9][10][11] permitted us to perform competition experiments with naïve TB10.4 4-11 -specific CD8 + T cells. A small difference in affinity significantly affects clonal representation and establishment of hierarchical dominance during infection. These data provide crucial experimental support for the idea of antigen-driven selection based on our TCR analysis. Although TCR affinity seems to be a major factor driving immunodominance, other factors can contribute. For example, the inflammatory environment and tissue-specific cues influence the fate of individual CD8+ T cells during infection [4], and these factors may contribute to the establishment of immunodominant T cell responses during tuberculosis.
Why is TB10.4 an immunodominant antigen? While there is a high precursor frequency, few of these T cell clones are significantly represented in the final immune response. The lack of a correlation between the precursor frequency and immunodominance in C57BL/6 and BALB/c mice [ref :10] indicates that a high precursor frequency is not a prerequisite for immunodominance during chronic infection. TB10.4 belongs to a larger family of ESAT6-related proteins that are secreted by specialized type VII secretion systems, and many of the secreted proteins are immunodominant in different animal species and humans [34][35][36]. While the abundance of TB10.4 during infection is unknown, paucity of this antigen during T cell priming or limited antigen presentation in the infected lung could drive clonotypic dominance by selecting for higher affinity T cells.
Whether certain mycobacterial antigens that elicit immunodominant T cell responses act as "decoys" to distract the immune response from responding to subdominant epitopes that might be more important targets of protective immunity was raised by Baena and Porcelli [37]. The finding that portions of the mycobacterial genome that encode T cell epitopes was more evolutionarily conserved has fueled this idea and raised the possibility that T cell immunity could benefit M. tuberculosis, possibly by creating an inflammatory environment that facilitates transmission [38]. Work done by the Andersen group finds that cryptic epitopes of ESAT6 are minor components of natural immune response to M. tuberculosis but specific vaccination strategies that elicit CD4+ T cells specific for the subdominant epitopes generate more durable protection against tuberculosis [39][40][41].
Could TCR diversity (or bias) be a surrogate for the quality or effectiveness of T cell immunity? Spectratyping of peripheral blood T cells from tuberculosis patients reveals TCR skewing compared to healthy controls [20,42,43]. Extreme TCR bias (e.g., clonality) was noted primarily in the setting of severe clinical disease, raising the possibility that TCR bias is associated with disease progression [19,20]. Arguing against this interpretation is the presence of highly skewed TCR repertoires in lung granulomas from patients with latent tuberculosis [17]. In our cohort of patients, we also see the emergence of clonal T cell expansions in the lungs of patients with active disease, compared to the frequency of T cells in the peripheral blood of normal donors. Importantly, clonal expansions can be detected among peripheral blood CD4 + and CD8 + T cell in uninfected healthy individuals, with memory T cells being 50-fold less diverse than naive T cells [33]. While these data indicate that T cell expansions can occur independently of active infection, clonotypes expressing the CDR3β motif "DREN" were 1000-fold enriched in Mtb granulomas compared to their average frequency in the peripheral blood of normal donors. Thus, these expansions are several orders of magnitude greater than the expansions reported in uninfected individuals [33]. This may indicate that the "DREN" expansions are driven by Mtb infection. Furthermore, based on our murine data, we infer that T cell selection, possibly driven by affinity, is occurring. We are confident that we captured the majority of lung TB10.4-specific CD8+ T cells present in each individual mouse. In contrast, we observed considerable heterogeneity in the TCR repertoire obtained from distinct granulomas in each human subject. While this heterogeneity is not surprising based on the work on Flynn and Barry [44,45], it is important to recognize that TCR bias at the level of the granuloma may be driven by heterogeneity in bacteria and bacterial antigens, as well as the persistent immune response. Therefore, the links between TCR bias, functional capacity of T cells and protection during tuberculosis are still unclear, and longitudinal studies coupled with the functional study of antigen-specific T cells are needed to define what constitutes a protective T cell response against tuberculosis in both people and in experimental animal models, both in terms of TCR diversity and T cell function [46].
While antigen choice is a key part of vaccine development, predicting which immunogenic epitopes elicit memory responses that generate protective immunity during infection has not been straightforward. Some pathogens mutate to escape T cell surveillance; other pathogens avoid immune detection by sequestering their antigens. CD4 + and CD8 + T cells specific for TB10.4 are elicited following clinical tuberculosis infection. Based on the ability of TB10.4-specific CD4 + T cells to mediate protection in animal models, the TB10.4 antigen has been incorporated into subunit vaccines. Our data is the first to show that TB10.4-specific CD8 + T cells transfer protection and that protection requires antigen presentation by TAP1-dependent pathways. This implies that the TCR-mediated recognition of infected cells is a prerequisite for the antimicrobial activity of CD8+ T cells. Furthermore, IFNγ is a key mediator of bacterial control.
Why then did a vaccine designed to elicit TB10.4-specific CD8 + T cells fail to protect mice against M. tuberculosis [47]? We chose to measure protection in immunocompromised (e.g., sublethally irradiated) mice, as it is difficult to demonstrate CD8+ T cell mediated protection in mice with an intact CD4+ T cell compartment. Another factor that may impair the ability of TB10.4-specific CD8+ T cells to protect mice with an intact immune system is inefficient presentation of IMYNYPAM. Lindenstrom et al show that vaccination with native TB10.4 protein does not elicit TB10.4-specific CD8+ T cells because the amino acid sequence surrounding the epitope (e.g., IMYNYPAML) is inefficiently processed and presented. In contrast, a homologous protein (TB10.3, EsxR) contains a homologous sequence (e.g., IMYNYPAMM), which is more efficiently processed and presented. Less TB10.3 is produced by the bacterium than TB10.4, and because of this, Lindenstrom speculates that IMYNYPAM is not presented efficiently by infected macrophages. Our own data showing protection (Fig 7) used cells activated in vitro prior to adoptive transfer, bypassing the need for priming. TAP1-dependent protection mediated by these TB10.4-specific CD8+ T cells implies that infected cells in the lung present IMYNYPAM. However, the source of the antigen (e.g., TB10.3 vs. TB10.4) is uncertain. If IMYNYPAM-specific CD8+ T cells recognize the small amount of TB10.3 expressed by infected macrophages, then selecting CD8+ T cells with a high affinity for IMYNYPAM will be even more important for host resistance.
Our study took advantage of the well-characterized CD8 + T cell response to Mtb in mice, where responses to an epitope of TB10.4 elicit an immunodominant T cell response. This allowed us to track and purify antigen-specific CD8 + T cells during infection. We find that the TB10.4-specific CD8 + T cell response is characterized by extreme clonality despite originating from a high-frequency naïve precursor pool. We were able to show that structural features of the CDR3β region were important for epitope recognition, most likely because of clonal competition and affinity selection. Similarly, we found that human CD8 + T cells also undergo selection and clonal expansion. Although the clonal expansions we detected in humans were not as dramatic as in the mouse model, we believe that this is partially because we were unable to purify human antigen-specific CD8 + T cells, due to the lack of appropriate reagents. The ability to use tetramer-sorted cells allows one to analyze T cells that are all specific for a single epitope, which adds considerable power to the TCR analysis. Therefore, it is uncertain whether the extreme immunodominance we observe in the murine system will be found in humans, and this is the subject of active investigation. For example, greater T cell diversity is theoretically expected in humans, due to the contributions of HLA variability, but also empirically, as demonstrated by the large-scale studies to date that have generally found a more diverse T cell response in infected individuals [48,49]. However, this may also be due to technical barriers; while antigen-specific CD8+ T cell expansions are found during tuberculosis, they often appear to be unique to individuals and their private repertoires, which further complicates TCR analysis of antigen-specific T cells [48][49][50]. If this holds true, it could be difficult to fully exploit TCR analysis as a biomarker for following disease progression or treatment efficacy, although if expanded T cell clones from the lung are correlated with those that are present (and can be detected) in peripheral blood, algorithms might be developed that are independent of antigenspecificity.
Finally, based on the propensity of Mtb to drive clonal selection of CD8+ T cells, we infer that there is a paucity of antigen presentation in the infected lung. Such a state may arise because of inefficient cross-presentation of antigens by the class I MHC-processing pathway and could explain why CD8+ T cells have not proven to be as protective as CD4+ T cells. Similarly, if low levels of antigen presentation rapidly select for high affinity CD8 + T cells during infection, an effective vaccine will need to elicit similarly high affinity T cells, rather than large numbers of diverse T cells, if they are to be effective in controlling bacterial replication.

Ethics statement
The University of KwaZulu Natal (UKZN) Biomedical Research Ethics Committee (BREC) approved the study protocol, its associated informed consent documents and data collection tools. Written informed consent was obtained for all research subjects.
All animal experiments were performed in accordance with National and European Commission guidelines for the care and handling of laboratory animals.   Table 1). The operating surgeon classified the tissue based on their experience and the pre-operative radiological data. Each sample was washed in multiple changes of HBSS and then diced into approximately 1 mm 3 pieces, which were strained, re-suspended in 7mls of pre-warmed digestion media (R10 (RPMI supplemented with 10% FCS, 2 mM L-glutamate, 100 U/ml Penstrep), containing 0.5 mg/ml collagenase D (Roche) and 40 U/ml DNaseI (Roche), and transferred to GentleMACS C-tubes (Miltenyi) for mechanical digestion per manufacturers instructions. The resultant suspension was incubated for 60 min at 37 o C, subjected an additional mechanical digestion step. The resulting suspension was strained through a 70 μm cell strainer, washed twice in HBSS, stained and CD8 + T-cells sorted using the FACS ARIA system, gating on the live (nearIR, Invitrogen) singlet, CD45 + , CD3 + and CD4population. Cells were sorted directly into RLT buffer, and genomic DNA extracted using the DNeasy Minikit (Qiagen) as per manufactures instructions.

Experimental infection and bacterial quantification
M. tuberculosis (Erdman strain) infection was performed via the aerosol route, and mice received 50-200 CFU/mouse. A bacterial aliquot was thawed, sonicated twice for 10 s in a cup horn sonicator, and then diluted in 0.9% NaCl-0.02% Tween 80. A 15 ml suspension of M. tuberculosis was loaded into a nebulizer (MiniHEART nebulizer; Vortran Medical Technologies) and mice were infected using a nose-only exposure unit (Intox Products). Alternatively, the bacterial aliquot was diluted in a final volume of 5ml, and mice were infected using a Glas-Col aerosol-generation device. At different times post-infection, mice were euthanized by carbon dioxide inhalation, organs were aseptically removed, individually homogenized and viable bacteria were enumerated by plating 10-fold serial dilutions of organ homogenates onto 7H11 agar plates. Plates were incubated at 37°C and M. tuberculosis colonies were counted after 21 d.
For cell sorting, stained and non-PFA fixed cells were suspended in MACS buffer (Miltenyi Biotec, Germany) and deposited in collection tubes using a BD Canto flow cytometer (Becton Dickinson, NJ, USA). For both FACS analysis and cell sorting, single-lymphocyte events were gated by forward scatter versus height and side scatter for size and granularity.

Next generation sequencing
For TCRβ high-throughput sequencing, genomic DNA was purified from sorted cell populations and sequenced by Adaptive Biotechnologies Corp. (Seattle, WA) using the ImmunoSEQ assay (http://www.immunoseq.com) as previously described [51]. Data were analyzed using the ImmunoSEQ analyser toolset. Clonality is the entropy of the TCRB frequency distribution and is calculated as 1-(entropy/log 2 [# unique TCRs]). Here entropy, a measure of diversity within a complex data set, is also known as the Shannon-Wiener index, Shannon's diversity index or Shannon's entropy [52,53]. Thus "0" represents a diverse repertoire and "1" is a completely clonal repertoire.

Enumeration of naïve Ag-specific T cells
Analysis of the precursor frequency of naïve T cells was performed as previously described [54]. Briefly, the spleen and axillary, mesenteric, cervical, inguinal, popliteal, and salivary LN were harvested from individual mice, dispersed, and filtered through a 70-mm mesh and enumerated for total and CD8 + T cell composition. The cell suspension was then costained with identical PE-and APC-conjugated tetramers and then purified via anti-PE magnetic bead selection (Miltenyi Biotec, Germany). Positive and negative fractions were then surface stained with anti-MHC II, CD11b, CD19, and CD4 as a "dump" channel, and anti-CD8α, CD3 and CD44. Flow cytometry counting beads were added immediately before samples were collected by the cytometer to determine the fraction of tetramer+ events collected and used to determine the total number and frequency of tetramer+ cells in each animal. For analysis purposes, naïve cells were cells that were present in the bound fraction, costained with PE-and APC-conjugated tetramers, and did not express CD44.

Generation of retrogenic mice
TCR retroviral constructs were generated as 2A-linked single open reading frames using PCR and cloned into a murine stem cell virus-based retroviral vector with a GFP marker as previously described [55,56]. Details of cloning strategies and primer sequences are available upon request (samuel.behar@umassmed.edu). Retroviral-mediated stem cell gene transfer was performed as previously described [55,56]. Intracellular cytokine staining 5×10 5 cells were plated in each well of a round bottom 96-well plate and incubated in the presence of TB10.4 4-11 peptide (10 μM; New England Peptide). Incubation in the presence of αCD3/αCD28 (1 μg/mL; BioLegend) or in the absence of stimuli were used as positive and negative controls, respectively. Cells were incubated for 1 h at 37°C, at which point Golgi Stop solution (BD Pharmingen, CA, USA) was added to each well for the remaining 4 h. Cells were collected after the 5 h stimulation and then surface stained with the antibodies described above, followed by intracellular staining for IFNγ (clone) using BD Permwash Kit (BD Pharmingen, CA, USA) as per manufacturer's instructions.

In vitro CTL assay
In vitro cytotoxicity was determined using peptide-coated EL4 target cells differentially labeled with the cell proliferation dye efluor 450 (eBiosciences) as previously described [21]. Briefly, target cells were pulsed with 10 μM of TB10.4 peptide at 37°C for 1 h in complete medium or left unpulsed (as controls), followed by extensive washing. Target cell populations were then labeled with either 10 μM or 100 nM e450 dye in PBS for 20 min at room temperature, followed by extensive washing. Labeled populations were mixed at an equal cell ratio with effector retrogenic cells at 1:1:1 ratio in round bottom 96-well plates (100,000 cells/population, in triplicate). Plates were incubated for 4-12 h at 37°C, in the dark. After incubation, the cells were analyzed by flow cytometry, and the ratios of recovered GFP (retrogenic, effector) and e450-labeled target EL4 populations were determined.

Adoptive T cell transfer for priming and competition
Single cell suspensions of pools of spleens and LNs from naive retrogenic mice (6 to 10 wks post reconstitution) were prepared. CD8 + T cells were purified from each suspension using the CD8 + T cell isolation kit and magnetic separation (Miltenyi Biotec, Germany). After purification, cells were counted and transferred via the tail vein into congenically marked recipients (CD90.1), which had been infected 6-7 d earlier with virulent M. tuberculosis (Erdman) via the aerosol route. For priming experiments, 10 4 to 10 5 cells were transferred into each recipient. For competition experiments, cells were mixed at a 1:1 ratio (confirmed by FACS analysis prior to injection) and then transferred into each recipient.

Measurement of cell proliferation
For analysis of cell proliferation of retrogenic cells after adoptive transfer, bead-purified naïve Rg cells (see above) were labeled with 10 μM of the cell proliferation dye efluor 450 (eBiosciences) in PBS for 20 min at room temperature, followed by extensive washing.

In vitro activation of T cells
Single cell suspensions of pools of spleens and LNs from naive retrogenic or OT-I mice were prepared. CD8 + T cells were purified from each suspension using the CD8 + T cell isolation kit and magnetic separation (Miltenyi Biotec, Germany). After purification, cells were counted and mixed at a 1:1 ratio with peptide coated APCs in media containing 100U/mL of IL-2 and 10U/mL of IL-12. APCs used were red blood cell lysed splenocytes from naïve C57Bl6/j mice, pulsed with 10 μM of TB10.4 or SIINFEKL peptide at 37°C for 1 h in complete medium, followed by irradiation with 3200 Rads from a cesium-137 source and extensive washing. 2x10 6 cells were plated, 1 mL into each well of a 24 well plate. After 24 h, cells were fed with 1 mL of fresh media, containing 100U/mL of IL-2 and 10U/mL of IL-12. 48 h after the initial stimulation, cells were fed by removing 1 mL of culture media and addition of 1 mL of fresh media, containing 100U/mL of IL-2. 60-72 h after the initial stimulation, cells were extensively washed with complete media, and used for adoptive transfer experiments.

Adoptive T cell transfer for protection and survival
An adoptive transfer model was used to analyze the ability of T cells to mediate protection against pulmonary M. tuberculosis infection as previously described [21]. Briefly, C57BL/6 or TAP-ko mice were sublethally irradiated with 600 rad using a cesium-137 source. The next day, 10 4 to 10 6 bead-purified activated Rg T cells (or OT-1 cells, as controls) were transferred via the tail vein. Mice were infected with virulent M. tuberculosis (Erdman) via the aerosol route within 24 h of the adoptive T cell transfer. Three weeks after infection, the bacterial burden in the lung and spleen was determined. For survival experiments, naïve T cells (10 5 per mouse) were transferred into TCRα KO mice via the tail vein; following transfer, mice were infected with virulent M. tuberculosis (Erdman) via the aerosol route.

Statistical analysis
Population medians were used to compare TCR frequencies. Other data are represented as mean + SEM. For data with a verified for Gaussian distribution, a t-test was performed to compare two groups; otherwise, a Mann-Whitney U test was used. To compare more than 2 groups, one-way ANOVA, followed by Bonferroni post-hoc test was performed. Differences with a p<0.05 were considered significant and represented by Ã .
Supporting Information S1 Data. Deep Sequencing of human TCRs from TB granulomas. Summary of deep sequencing data of human TCRβs from lung granulomas and lymph node resected from subjects with tuberculosis. (PDF) S2 Data. Clonality and clonality distribution. A. The Vβ gene distribution is shown for two samples: (top) CD8+ T cells from a human lung granulomas; and (B) PBMC from a normal donor. One can see how differences in the Vβ gene use affects the calculation of clonality can be visualized. Numbers do not add to 100% because only productive recombination events are plotted. Width axis: V gene; Depth Axis: CDR3 length; Height axis: frequency. B. The Vβ gene distribution is shown for four samples from a single subject (#23): lung sample A, B, and C; and LN. Although there are considerable differences between the samples, particularly with respect to TCR diversity (see Fig 1), the most abundant TCRs are shared between the different lung lesions and to a lesser extent, the LN sample. Again, one can visualize how the Vβ gene use affects clonality. Width axis: V gene; Depth Axis: CDR3 length; Height axis: frequency. (PDF) S3 Data. TCRβ frequencies in human lung granulomas. The TCRβ frequencies, derived for the frequency of unique DNA sequences, in each sample analyzed by deep sequencing. The bar is the median and the error bars denote the interquartile range. For comparison, the TCRβ frequencies in PBMC from three normal donors are shown. Only productive recombination events are plotted. (PDF) S4 Data. DREN motif in human TCRs. The frequency of the "DREN" motif in the CDR3 region of TCRβ chains sequenced from human granulomas or PBMC samples from normal individuals. Multiple unique sequences were detected in these samples. The frequency of "DREN" motifs was elevated in lung granuloma compared to peripheral blood. Statistical testing after log 10 transformation showed that these differences were significant (P<0.001) for all pairwise combinations by one-way ANOVA with Tukey's multiple comparison test. These data show that TCRβ chains with the "DREN" CDR3 motif are increased in frequency among CD8 + T cells in tuberculous lung granulomas compared to peripheral blood T cells from healthy controls. (PDF) S5 Data. Infection of Vα2var mice with Mtb. A. Schematic of the Vα2var locus of Vα2var mice. B. The TB10.4 4-11 -specific CD8 + T cell response following aerosol infection of WT vs. Vα2var mice. C. Survival of C57BL/6, Vα2var, or TCRα knockout mice after low dose aerosolized Mtb. (PDF) S6 Data. Frequency of arginine and aspartic acid in naïve cells. To determine whether the enrichment of "R" or "D" was significant, we compared their occurrence in the normal TCRβ repertoire. We queried the splenic TCRβ repertoire from three C57BL/6 mice representing over 1.1 million reads and~53,000 unique sequences each. The average frequency of "R", "D", or "RD" at CDR3β position 6, 7, or 6-7, was 10.1%, 7.5%, and 2.5% respectively, indicating "R," "D," and "RD," were significantly enriched in the clonally expanded TB10.4 4-11 -specific CD8 + T cells. Testing using the Chi-square with Yates' correction shows that the frequency of R, D, or RD in the clonally expanded sequences is significantly increased in the four cloned sequences compared to the normal splenic repertoire of B6 mice (P<0.0001). (PDF) S7 Data. Effector function of Rg3 CD8 + T cells. Rg3 CD8+ T cells were transferred to Mtb infected mice as described in the methods. Three to six weeks after aerosolized Mtb infection, lung cells were stimulated with the TB10.4 epitope. Intracellular staining for IFNg, IL-2, TNF, and granzyme were performed. The results shown are gate on Rg3 CD8+ T cells identified by their co-expression of GFP and Va2 (PDF) S8 Data. Rg4 CD8+ T cells mediate protection and are primed in the lungs of Mtb infected mice. A. Protection mediated by 10 6 , 10 5 or 10 4 in vitro activated Rg4 (e.g., TCR4) CD8 + T cells. T cells were transferred by the IV route into sublethally irradiated mice and challenged with Mtb. CFU were determined~4 wks after infection. Protection (Δlog10) is the lung CFU in mice that did not receive Rg T cells minus the lung CFU in mice that did receive Rg T cells. Two independent experiments are shown each with 5 mice. One way ANOVA with Tukey's post test to compare differences in CFU. Ã , p<0.05; ÃÃÃ , p<0.001. B. The level of TCRα staining on Rg3 vs. Rg4 CD8+ T cells from the lungs of infected mice. T test showed the difference was not significant. C. Priming of naïve Rg4 and Rg3 CD8+ T cells in the pulmonary LN of Mtb infected mice occurs with similar kinetics. Rg3 and Rg4 CD8+ T cells were co-transferred or Rg3 CD8+ T cells were transferred alone. Priming is indicated by acquisition of an activated (CD44 + CD62L lo phenotype. (PDF) S9 Data. Primers used for single cell PCR analysis of TCRs used by murine CD8+ T cells. List of the TCRα and TCRβ primers used for nested PCR for the single cell analysis of TCRs used by TB10-specific CD8+ T cells from the lungs of Mtb-infected mice. (PDF) S10 Data. Retrogenic TCR sequences. TCR sequences and 2A peptide cloned into retroviral vectors. Sequences of four TCR sequences as assembled and cloned into the retroviral vectors. The first sequence is color-coded. VαJα (blue); Cα (purple); P2A (grey); VβDJβ (yellow); Cβ (green). (PDF)