Unbiased Analysis of TCRα/β Chains at the Single-Cell Level in Human CD8+ T-Cell Subsets

T-cell receptor (TCR) α/β chains are expressed on the surface of CD8+ T-cells and have been implicated in antigen recognition, activation, and proliferation. However, methods for characterizing human TCRα/β chains have not been well established, largely because of the complexity of their structures owing to the extensive genetic rearrangements that they undergo. Here we report the development of an integrated 5′-RACE and multiplex PCR method to amplify the full-length transcripts of TCRα/β at the single-cell level in human CD8+ subsets, including naive, central memory, early effector memory, late effector memory, and effector phenotypic cells. Using this method, with PCR success rates of approximately 47% for TCRα and 62% for TCRβ chains, we were able to analyze more than 1,000 reads of transcripts of each TCR chain. Our comprehensive analysis revealed the following: (1) chimeric rearrangements of TCRδ-α, (2) control of TCRα/β transcription with multiple transcriptional initiation sites, (3) altered utilization of TCRα/β chains in CD8+ subsets, and (4) strong association between the clonal size of TCRα/β chains and the effector phenotype of CD8+ T-cells. Based on these findings, we conclude that our method is a useful tool to identify the dynamics of the TCRα/β repertoire, and provides new insights into the study of human TCRα/β chains.


Introduction
CD8+ T cells play an important role in adaptive immunity against virus-infected cells and tumor cells [1][2][3]. In the primary antigen response, naive CD8+ T cells are activated in secondary lymph nodes and consequently undergo clonal expansion and differentiation into effector and memory CD8+ cells that sequentially circulate in the periphery in vivo [4,5]. Effector CD8+ T cells have direct effector functions such as cytotoxic activity and cytokine production in response to the target cells, whereas memory CD8+ T cells do not show these functions but have the ability to proliferate and secrete large amounts of cytokines when the cells are stimulated by antigens [6].
T-cell receptor (TCR) α/β chains are heterodimeric membrane proteins expressed on the surface of CD8+ T-cells, and they contribute to direct recognition of antigen peptide presented on the major histocompatibility complex (MHC) in the target cells [7,8]. The specificity of antigen recognition for diverse peptide-MHC (pMHC) complexes depends on the 3 complementarity determining regions (CDRs) of both TCRα and TCRβ chains. CDR1 and CDR2 are encoded by the germline sequences and are mainly used for binding to the MHC, whereas CDR3 is highly polymorphic and is the principal antigen recognition site, created by extensive genomic rearrangement among variable (V), diversity (D), and joining (J) segments. The diversity of CDR3 is further increased by the deletion and insertion of nucleotides within the V-J and V-D-J junctions in TCRα and TCRβ chains, respectively [9][10][11].
Methods to characterize the diversity and clonality of the TCRα/β repertoire have been previously described and remarkably improved by the development of recent technologies such as TCR spectratyping [12][13][14] and deep sequencing [15][16][17][18]. However, most approaches have focused on the characterization of a single TCRβ chain without consideration of the TCRα/β pairs that determine the actual TCR diversity and clonotype. Some methods have been described for the analysis of paired TCRα/β chain transcripts from single cells, but these methods are limited to activated human T-cells in vitro or antigen-specific mouse T-cells ex vivo [19][20][21].
TCR amplification methods fall into 2 groups based on the use of the 5′-rapid amplification of cDNA ends (RACE) method [22] or the multiplex PCR method [23]. The 5′-RACE method can exclude potential bias and provide full-length TCRα/β transcripts that are useful for subsequent studies such as TCRα/β transduction, which analyzes the specificity of an antigen. However, the specificity and efficiency of PCR amplification in the 5′-RACE method are low, especially when there is contamination by short fragments created by mRNA degradation or incomplete cDNA synthesis in the reverse transcription process. In contrast, the multiplex PCR method gives better specificity and efficiency of PCR than the 5′-RACE method but does not provide the full length of TCRα/β transcripts. In addition, it is potentially biased because of the multiple primers designed for each variable segment of TCRα/β chains.
Here, we report an unbiased method developed by the integration of 5′-RACE and multiplex PCR methods for amplification of the full-length and paired TCRα/β chain transcripts at the single-cell level in human CD8+ T-cell subsets. This method has wide applications and has allowed us to demonstrate chimeric rearrangements in TCRα/β chains, regulation of TCRα/β chain expression with multiple transcriptional initiation sites, and dynamics of the TCRα/β repertoire among different subsets of human CD8+ T cells.

Results
Amplification of full-length TCRα/β chain transcripts from single CD8+ T-cells
We applied the integrated 5′-RACE and multiplex PCR method to amplify the full length of both TCRα and TCRβ chain transcripts from single cells in CD8+ T-cell subsets, including naive (CD27high CD28+ CD45RA+ CCR7+), central memory (CD27+ CD28+ CD45RA− CCR7+), early effector memory (CD27+ CD28+ CD45RA− CCR7−), late effector memory (CD27low CD28− CD45RA+/− CCR7−), and effector (CD27− CD28− CD45RA+/− CCR7−) phenotypic populations obtained from the peripheral blood of 3 unrelated donors (Table 1) [24]. The PCR amplifications were successfully performed (Figure 1), and the overall PCR success rate was approximately 47% for TCRα and 62% for TCRβ chains. This result is consistent with a previous report indicating that the PCR success rate for TCRβ chains is slightly better than that for TCRα chains in mice [19]. Although the results of PCR amplification given by the 5′-RACE and multiplex PCR methods were almost identical, as shown in Figure 1B, a subset of TCRα/β chain transcripts was amplified by only one of the two methods. These results indicate that the integration of 5′-RACE and multiplex PCR methods could increase the PCR success rate. Indeed, we selected 1250 samples for analysis (974 samples from the 5′-RACE and 276 samples from the multiplex PCR method) for the TCRα chain and 1661 samples (1075 samples from the 5′-RACE and 586 samples from the multiplex PCR method) for the TCRβ chain (Table S1). From these data, we found that approximately 80% of TCRα and 87% of TCRβ chains were in-frame and that approximately 16% of TCRα chains and 6% of TCRβ chains were out of frame. Among the chains, 3.4% of TCRα chains and 5.8% of TCRβ chains were germline transcripts that lacked the variable segments and represented unproductive TCRs (Figure 2A and 2B) [25,26]. All germline transcripts were detected by 5′-RACE, but not by the multiplex PCR method.
A fraction of samples failed to identify the TCR because of unreadable sequences, which may indicate the existence of dual TCRs. Therefore, we performed PCR with a single gene-specific primer designed for the middle of the variable region in the TCRα/β chains, and the sequence analysis revealed that productive dual TCRα and TCRβ chains were expressed in approximately 1.9% and 0.8% of CD8+ T-cells, respectively (Figure 2C and 2D). Approximately 14% of CD8+ T-cells had only unproductive TCRα (Figure 2A and 2C), though a productive TCRβ chain was detected in half of these cells. We could not clarify whether these cells express only the TCRβ chain because the success rate for detecting the TCRα chain with this method is approximately 50%. Furthermore, consistent with a previous report [19], the most frequent dual TCRα and TCRβ were observed as a combination of a productive TCR and an unproductive TCR, the latter of which had either a stop codon or a frameshift.
Since CDR3 is the most diverse region, created by the deletion and insertion of nucleotides within the V-J and V-D-J junctions in TCRα and TCRβ chains, respectively, we analyzed the distribution of CDR3 length at the amino acid (aa) level in TCRα and TCRβ chains by using samples showing only productive TCR defined by the in-frame sequence (1,011 and 1,444 reads for TCRα and TCRβ chains, respectively). The analysis demonstrated that the length of CDR3α ranged from 4 to 18 aa, whereas the length of CDR3β was slightly longer, ranging from 5 to 20 aa (Figure 2E). Further analysis of paired CDR3α and CDR3β lengths with 736 samples identified the most frequent pair at the position intersecting 11 and 13 aa of CDR3α and CDR3β lengths, respectively (Figure 2F). These results suggest that human TCRα/β chains may have a preferential combination of CDR3α and CDR3β lengths for recognition of diverse pMHC complexes.
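The paired CDR3 length analysis described above amounts to tallying one (CDR3α length, CDR3β length) combination per cell and reporting the most frequent one. The following is a minimal Python sketch of that tally, not the procedure actually used in this study. The function name `cdr3_length_pairs` and the amino acid sequences are hypothetical and serve only as illustration, and the length is simply the number of residues in whatever CDR3 string is supplied.

```python
from collections import Counter

def cdr3_length_pairs(paired_cdr3):
    """Count (CDR3-alpha length, CDR3-beta length) combinations.

    paired_cdr3: iterable of (cdr3_alpha_aa, cdr3_beta_aa) strings, one pair per cell.
    """
    return Counter((len(alpha), len(beta)) for alpha, beta in paired_cdr3)

# Hypothetical amino acid sequences, for illustration only (not data from this study).
example_cells = [
    ("CAVRDSNYQLI", "CASSLGQAYEQYF"),   # 11 aa / 13 aa
    ("CAVMDSNYQLI", "CASSPGTGYEQYF"),   # 11 aa / 13 aa
    ("CAASGGSYIPTF", "CASSQETQYF"),     # 12 aa / 10 aa
]

pair_counts = cdr3_length_pairs(example_cells)
most_frequent_pair, n_cells = pair_counts.most_common(1)[0]
print(most_frequent_pair, n_cells)
```

With real data, the same counter can be rendered as a two-dimensional grid of CDR3α length against CDR3β length.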

Identification of transcriptional initiation sites (TISs) in genes of TCRα/β chains
Our integrated method included 5′-RACE, which is able to amplify the full length of transcripts and has been used to identify the TIS of genes. Therefore, we examined whether our sequence data obtained from single cells also had the power to identify the TISs of TCRα/β chains. First, we collected samples that showed the presence of a translational initiation site in TCRα/β chains, and then the sequences were aligned with the genomic sequences to measure the length of the 5′-UTR (5′-untranslated region), which gives the position of the TCRα/β TISs. With 773 and 930 reads for TCRα and TCRβ chains, respectively, the distribution of TISs in the genes of TCRα/β chains showed that the TISs of the TCRβ chain accumulated around 40 bp upstream of the translational initiation sites but that those of the TCRα chain were concentrated in 2 locations, i.e., 40 bp and 110 bp upstream of the translational initiation sites (Figure 3A). The TISs in individual TCRα/β variable segments (TRAV and TRBV) were also analyzed, and the results showed that subsets of TRAV and TRBV transcripts started from more than 2 positions (Figure 3B and 3C), suggesting that there are multiple binding sites for transcription factors that tightly control the expression of TCRα/β chains restricted to T lymphocytes [27,28].
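The 5′-UTR measurement can be illustrated with a short Python sketch. It rests on simplifying assumptions that are not taken from this study: the read is assumed to begin exactly at the transcription start (adaptor and homopolymer tail already trimmed), and the coding start is located by an exact string match rather than by alignment to the genome. The function name `utr5_length` and all sequences are hypothetical.

```python
from collections import Counter

def utr5_length(read, coding_start):
    """Return the number of bases that precede the coding start in a read.

    read: transcript sequence assumed to begin at the transcription start.
    coding_start: the first bases of the annotated coding sequence,
        beginning with the ATG start codon.
    Returns None when the coding start is not found in the read.
    """
    position = read.find(coding_start)
    return None if position < 0 else position

# Hypothetical reads, for illustration only (not data from this study).
coding_start = "ATGCTCCTG"
reads = [
    "ACGT" * 10 + "ATGCTCCTGCTG",   # 40 bases of 5'-UTR
    "ACGT" * 10 + "ATGCTCCTGCTG",   # 40 bases of 5'-UTR
    "ACGT" * 5 + "ATGCTCCTGCTG",    # 20 bases of 5'-UTR
    "ACGTACGT",                     # coding start absent
]

lengths = [utr5_length(read, coding_start) for read in reads]
# Distribution of 5'-UTR lengths over reads where the coding start was found.
distribution = Counter(n for n in lengths if n is not None)
print(distribution)
```

Peaks in such a distribution correspond to preferred transcription start positions relative to the start codon.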

Usage of TCRα/β variable segments in human CD8+ T-cell subset
To analyze the usage of TRAVs and TRBVs in human CD8+ T-cell subsets, we extracted from the whole sequence data set the samples that included variable segments, excluding those identified as germline transcripts. The assembly of the samples obtained from 1,207 and 1,540 reads for TCRα and TCRβ chains, respectively (Table S2), demonstrated that 42 out of 54 TRAVs and 47 out of 64 TRBVs were detected in the CD8+ T-cell subsets with different frequencies but that no pseudogene other than TRBV21-1 was detectable (Figure 4A and 4B). The IMGT database together with a previous report [15] has defined TRBV21-1 as a pseudogene based on the fact that it has a frameshift in the leader sequence, but about 30% of our samples carrying the TRBV21-1 rearrangement appeared as productive TCRβ chains encoding CDR1, CDR2, and CDR3 domains with in-frame sequences in the entire transcript (data not shown), suggesting that TRBV21-1 may function and be required for antigen recognition in vivo. There were several TRBV segments that were not detected in this analysis. Interestingly, we found TRDV-TRAJ rearrangement in a substantial number of TCRα transcripts (Figure S1). These results, along with the finding of TRBV21-1 utilization, show the need for the 5′-RACE method and the limitation of the multiplex PCR method when multiple primers are designed for variable segments in TCRα/β chains.
Using the same data set, we next analyzed the usage of TRAVs and TRBVs in each CD8+ subset among 3 unrelated donors (Figure 4C and 4D). The results indicated that the usage of TRAV1-2 in all of the donors was significantly higher in the early effector memory cells than in the other 3 subsets and that the usage of TRAV8-3 was significantly higher in the naïve subset than in the other 3 subsets (Figure 4C). The usage of TRBV12-3 was significantly lower in the early effector memory subset than in the late effector memory subset, while that of TRBV2 was significantly higher in the naïve subset than in the effector subset (Figure 4D). Furthermore, the detailed analysis of TRAV1-2 showed that most TRAV1-2 had rearranged with TRAJ33 and that CDR3α was highly conserved among samples and also among donors (Figure S2 and Table S3). In addition, the paired TCRα/β analysis showed that the TRBV6 subgroup was preferentially used for pairing with the TRAV1-2-TRAJ33 rearrangement in the TCRα chain.
Usage of joining and diversity segments of TCRα/β chains in human CD8+ T-cell subset
The analysis of usage of joining and diversity segments (TRAJ, TRBJ, and TRBD) was performed on the same data set as used for the TRAV and TRBV usage analysis. The result showed that 51 out of 61 TRAJs and 13 out of 14 TRBJs in TCRα/β chains were detected with different frequencies in the CD8+ T-cell subsets of the 3 donors (Figure 5A and 5B). Consistent with the observation made in the TRAV and TRBV usage analysis, we found that several joining segments defined as pseudogenes were not detectable in any of the CD8+ T-cell subsets. There was no significant difference in the usage of TRBD1 and TRBD2 among the 3 donors (Figure 5C). We observed that a subset of TCRα/β chain transcripts lacked TRAJ, TRBJ, or TRBD. This result suggests that the transcripts lacking TRAJ and TRBJ may have been due to splicing errors and that the lack of TRBD may have been a consequence of deletion of its nucleotides during the process of V-D-J recombination.
To analyze the usage of TRAJ and TRBJ in each CD8+ subset among the 3 unrelated donors, we used the same data set. The analysis demonstrated that the usage of TRBJ1-1 was significantly higher in the naïve subset than in the early effector memory and effector subsets (Figure 5E).

Identity and clonotype of TCRα/β chains in human CD8+ T-cell subset
Given that antigen-experienced CD8+ T-cells clonally proliferate after activation and sequentially differentiate into memory phenotypic CD8+ T-cells that circulate in the periphery in vivo, it is possible that the identity and clonotype of TCRα/β chains are present in different proportions among CD8+ T-cell subsets. To examine this possibility, we first analyzed the identity of TCRα and TCRβ chains individually in each CD8+ T-cell subset by using the same data set as used for the TRAV and TRBV usage analysis. The results showed that the identity of both TCRα and TCRβ chains gradually increased from naive cells to effector cells, which had been phenotypically classified beforehand (Figure 6A and 6B) [24]. In addition, approximately 60% of effector cells showed identical TCRα/β chains at least once, suggesting the occurrence of clonal expansion in vivo. Using a data set that provided the paired TCRα/β chain transcripts (901 paired sequence reads), we next analyzed the clonotype of TCRα and TCRβ chains by requiring the following for the different samples in each CD8+ T-cell subset of the 3 donors: (1) identical usage of variable, diversity, and joining segments in TCRα/β chains and (2) perfect matching of CDR3 length and sequence at the amino acid level in TCRα/β chains. The analysis revealed that the largest proportion of clonotypes was found in the effector subset, with none or a smaller proportion in the naive, central memory, early effector memory, or late effector memory subsets (Figure 6C). This result shows that the clonal size of CD8+ T-cells was associated with the effector phenotype, as had been previously described and classified by the degree of expression of 3 effector molecules [24].
Further analysis demonstrated that the number, type, and size of TCRα/β clonotypes were different among the donors having different HLA types, with the exception of the HLA-A*24:02 type in donor 2 and donor 3 (Figure 6D and Table S4), suggesting that the cells showing the TCRα/β clonotype were clonally expanded after activation by the recognition of different pMHC complexes in vivo.
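The two clonotype criteria stated above (identical V, D, and J segment usage, and identical CDR3 amino acid sequence, which implies identical length) can be expressed as a grouping key over paired reads. The Python sketch below shows one way to do this. The record type `PairedTCR`, the function `expanded_clonotypes`, the threshold `min_cells`, and the example records are all hypothetical and are not taken from this study.

```python
from collections import Counter
from typing import NamedTuple

class PairedTCR(NamedTuple):
    """One cell's paired chains; all fields together form the clonotype key."""
    trav: str
    traj: str
    cdr3a: str   # CDR3-alpha amino acid sequence
    trbv: str
    trbd: str
    trbj: str
    cdr3b: str   # CDR3-beta amino acid sequence

def expanded_clonotypes(cells, min_cells=2):
    """Return {clonotype: number of cells} for clonotypes seen in >= min_cells cells.

    Two cells share a clonotype only if every segment name and both CDR3
    amino acid sequences match exactly.
    """
    sizes = Counter(cells)
    return {clonotype: n for clonotype, n in sizes.items() if n >= min_cells}

# Hypothetical records, for illustration only (not data from this study).
clone_a = PairedTCR("TRAV1-2", "TRAJ33", "CAVRDSNYQLI",
                    "TRBV6-1", "TRBD1", "TRBJ1-1", "CASSLGQAYEQYF")
same_segments_other_cdr3 = clone_a._replace(cdr3b="CASSPGTGYEQYF")
cells = [clone_a, clone_a, clone_a, same_segments_other_cdr3]

expanded = expanded_clonotypes(cells)
print(len(expanded), expanded[clone_a])
```

Because the whole record is the key, a single amino acid difference in either CDR3 places two cells in different clonotypes even when their segment usage is identical.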

Discussion
Methods for TCR amplification have been previously reported and continuously improved by the development of recent technology [12][13][14][15][16][18][19][20][21]. However, we were unable to find any method that met our requirements, namely the amplification of full-length and paired TCRα/β chains from human CD8+ T-cells at the single-cell level with a high PCR success rate and no potential bias. Hence, we developed an integrated method originating from the 5′-RACE and multiplex PCR methods. We found that our method worked across human CD8+ T-cell subsets obtained from 3 unrelated individuals without a reduction in PCR success rate compared with that obtained with methods previously described [19,21]. In the comprehensive TCRα/β analysis, the necessity of our integrated method was emphasized by the following evidence: 1) a subset of TCRα/β chain transcripts was amplified with either the 5′-RACE or the multiplex PCR method, 2) the 5′-RACE method, but not the multiplex PCR method, could amplify germline transcripts and transcripts carrying the chimeric rearrangements of TCRδ-TCRα, and 3) the combination of multiplex PCR and sequential PCR with single TRAV or TRBV primers identified the existence of dual TCR expression in single CD8+ T-cells. In light of this evidence, we expect that the full-length paired TCRα/β chain transcripts amplified by the 5′-RACE method will be useful not only for the identification of transcription initiation sites, but also for simplifying the strategy of TCRα/β transduction to confirm the pairing and potential antigen recognition through direct cloning into lentivirus expression vectors, as described previously [29].
The regulation of mono- or bi-allelic expression at TCR and immunoglobulin (Ig) loci is known to involve changes in chromatin structure, methylation, and replication timing in 2 identical alleles [30][31][32][33]. In our data set, about 80% of CD8+ T-cells showed mono-allelic expression of TCRα and TCRβ chains, with expression of dual TCRα and dual TCRβ in approximately 3% of them. However, a previous study demonstrated that dual TCRα expression, but not dual TCRβ expression, was detected in 10–20% of influenza-specific peripheral CD8+ T-cells [19], suggesting that the difference in the frequency of dual TCR may depend on the type of cells targeted. Indeed, we found that 9.2% of early effector cells from donor 1 expressed dual TCRα chains with single TCRβ chains and that there was no association between CD8+ T-cell subsets and the frequency of dual TCRs among the 3 unrelated donors. The germline transcription coding for unproductive TCR and Ig composed of D-J-C or J-C segments has been found to occur before the V(D)J rearrangement, and is thought to be driven by a developmental stage-specific promoter that should be activated in immature cells [25,26,34]. A striking observation in our data was the expression of germline transcripts in peripheral CD8+ T-cells, where we found that most of these transcripts were detected together with a productive TCR transcript in a subset of CD8+ T-cells. This finding, together with the result that approximately 80% of the CD8+ T-cells analyzed in this study expressed a single TCR, supports the idea that these germline transcripts were expressed in immature T-cells by biallelic activation at the D or J locus, but that only 1 of the 2 alleles was inactivated in mature T-cells after V-(D)-J rearrangement through methylation and changes in chromatin structure. Some of the mature T-cells analyzed escaped in this manner.
TISs have been investigated in murine and human TCRβ chains, with the finding that the positions fall into a range of 19-40 bp and 26 bp upstream of the translation initiation site in most murine TRBVs and human TRBV7-2 (Vβ6.7), respectively [27,28]. These results are consistent with ours in that the positions of TISs were located quite close to the translational initiation site. However, our finding that there were multiple transcription initiation sites in a subset of TRBVs as well as TRAVs, and the fact that the promoters located in each TRBV did not have uniform transcriptional activity, suggest that the regulatory mechanism of TCRα/β expression may be more complicated and that these events may be implicated in the usage of variable segments in TCRα/β chains in CD8+ T-cell subsets. Chimeric TCRs created by the rearrangement of Vδ-Jα have been found in peripheral T-cells of mice and humans, and heterodimers of chimeric δα TCR and TCRβ chains can be expressed on the surface of CD8+ T-cells and recognize antigen presented by antigen-presenting cells [35][36][37]. We also found that a subset of CD8+ T-cells expressed the chimeric δα TCR chain together with the TCRβ chain. We could not determine the function of the chimeric δα TCR chain, owing to the limitations of our technology, but the finding of a clonotype showing the identical pair of chimeric δα TCR chain and TCRβ chain in the early effector memory subset suggests that these cells had some function in response to antigen stimulation in vivo. These findings suggest that the diversity of human TCRα/β genes may be greater than previously estimated [14].
The expression of TCRα/β chains varied among CD8+ T-cell subsets in the 3 unrelated donors having different HLA types (except for the HLA-A*24:02 type in donor 2 and donor 3), but the finding that about 10–20% of early effector memory cells in all 3 unrelated donors expressed a particular type of TCRα chain carrying a rearrangement of TRAV1-2 and TRAJ33 is interesting from an immunological point of view. Although further study with an increased number of donors will be necessary, a detailed analysis showing the highly conserved CDR3α in rearrangements of TRAV1-2 and TRAJ33 and the preferential usage of the TRBV6 subgroup as its partner may suggest that these early effector memory cells are distinct from the other subsets and recognize various pMHC complexes with some similarity at the level of protein conformation.
In summary, we have described an unbiased method for amplification of paired TCRα/β chains at the single-cell level. We believe that our method is novel and has the potential for a wide range of applications. Indeed, the application of the method to the characterization of TCRα/β chains in CD8+ T-cell subsets could provide the first evidence that the proportion of TCRα/β identity and clonotypes is associated with the effector function of CD8+ T-cells. We expect that our method, combined with phenotypic classification of CD8+ and CD4+ T-cells, will be a useful tool to identify the dynamics of TCRα/β genes in patients with various infectious diseases or tumors and may contribute to immunotherapy for these conditions.

Sample preparation
Human peripheral blood mononuclear cells (PBMCs) were prepared from heparinized peripheral blood from 3 unrelated donors (Table 1), using Ficoll-Paque PLUS (GE Healthcare, Uppsala, Sweden), and stored in liquid nitrogen. Before use, the PBMCs were rested overnight in culture medium (RPMI 1640 supplemented with 10% FCS, 100 U/ml MEM-NEAA, 100 U/ml sodium pyruvate, and 200 U/ml recombinant human IL-2). This study was approved by the Kumamoto University Ethical Committee, and written informed consent was obtained from all participants.

Cell staining and single-cell sorting
Surface staining of PBMCs and classification into CD8+ T-cell subsets were performed as described previously [24]. Single cells sorted from each of the CD8+ T-cell subsets by use of a FACSAria equipped with 405-, 488-, and 633-nm lasers and FACSDiva acquisition software (BD Biosciences) were plated into a 96-well plate containing cell lysis buffer (see below).

Single-cell RT-PCR with integrated 5′-RACE and multiplex PCR
The sorted single cells were plated into each well of a 96-well plate with 2 µl of cell lysis buffer containing 1.5 µl of resuspension buffer (Invitrogen), 0.1 µl of lysis enhancer solution (Invitrogen), 0.03 µl of 25 mM dNTPs, 0.1 µl of 40,000 U/ml RNase inhibitor, and 0.22 µl of a primer mixture (10 µM concentration of each of hTCR-CA-R2.2, hTCR-CB1-R3.2, and hTCR-CB2-R3 primer) (Table S5). The cells were incubated at 75 °C for 10 min and then put on ice immediately. cDNA was synthesized directly from cell lysates by using 6.0 µl of reverse-transcription (RT) solution consisting of 1.2 µl of 5x 1st-strand DNA buffer (Invitrogen), 0.19 µl of 0.1 M DTT (Invitrogen), 0.19 µl of RNase inhibitor (NEB), 0.19 µl of SuperScript III Reverse Transcriptase (Invitrogen), and 2.33 µl of DEPC-treated H2O (Invitrogen). RT reactions were performed at 54 °C for 60 minutes followed by incubation at 85 °C for 5 minutes. Template RNA was digested with 1 U of RNase H (Invitrogen) at 37 °C for 20 minutes. Extra primers and dNTPs were removed from RT samples by use of a Zymo-Spin™ I-96 Plate (ZYMO Research) according to the manufacturer's instructions. The purified cDNA was incubated at 94 °C for 3 minutes and subsequently on ice for at least 2 minutes, and then the tailing of the cDNA was performed with 2 µl of tailing solution consisting of 0.15 µl of 10 mM dGTP (Promega), 0.15 µl of 1 M P-K buffer (1 M K2HPO4, 1 M KH2PO4; pH 7.0), 0.48 µl of 25 mM MgCl2 (Promega), 40 U/ml TdT (Roche), and 1.12 µl of water. The tailing was performed at 37 °C for 60 minutes followed by incubation at 65 °C for 10 minutes.
For amplification of TCRα- and TCRβ-chain transcripts, touch-down PCR (first-round PCR) was performed in 25 µl of 2x PrimeSTAR GC buffer (TaKaRa), 4 µl of 2.5 mM dNTPs (TaKaRa), 1 µl of 10 µM Oligo-dc-adaptor2, 1 µl of 10 µM hTCR-CA-R7, 1 µl of 10 µM hTCR-CB1-R9, 0.625 U of PrimeSTAR (TaKaRa), and 5.75 µl of water, using the following conditions: 1) 96 °C for 2 minutes, 2) 3 cycles of 96 °C for 15 seconds and 72 °C for 2 minutes, 3) 3 cycles of 96 °C for 15 seconds, 69 °C for 15 seconds, and 72 °C for 1.5 minutes, 4) 3 cycles of 96 °C for 15 seconds, 66 °C for 15 seconds, and 72 °C for 1.5 minutes, 5) 26 cycles of 96 °C for 15 seconds, 63 °C for 15 seconds, and 72 °C for 1.5 minutes. Using 1 µl of a 1:20 dilution of the first-round PCR reactions, a nested second PCR was performed in a 20 µl reaction mixture consisting of 10 µl of 2x PrimeSTAR GC buffer (TaKaRa), 1.6 µl of 2.5 mM dNTPs (TaKaRa), 0.1 µl of 2.5 U/µl PrimeSTAR (TaKaRa), 6.66 µl of water, 0.32 µl of 10 µM AP2, and 0.32 µl of 10 µM reverse primer corresponding to the TCR constant region (hTCR-CA-R9 for TCRα chain and hTCR-CB1-R6 for TCRβ chain). The PCR conditions were as follows: 1) 96 °C for 2 minutes, 2) 35 cycles of 96 °C for 15 seconds, 58 °C for 30 seconds, and 72 °C for 1 minute, and 3) 72 °C for 3 minutes. Multiplex PCR was also performed with 2.5 µl of the first-round PCR product, 10 µl of Taq colorless buffer (Promega), 3 µl of 25 mM MgCl2 (Promega), 4 µl of 2.5 mM dNTPs (Promega), 50 U of Taq DNA polymerase (Promega), 0.5 µl of a 10 µM oligonucleotide mixture, containing either 1 of 54 TRAV forward primers or 1 of 65 TRBV forward primers, and 0.32 µl of the 10 µM reverse primer for the TCR constant region under the following conditions: 1) 96 °C for 2 minutes, 2) 35 cycles of 96 °C for 15 seconds, 57 °C for 30 seconds, and 72 °C for 1 minute, and 3) 72 °C for 3 minutes.
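The first-round touch-down program listed above can be written down as plain data, which makes it easy to check the total cycle count and the stepwise drop in annealing temperature. The Python sketch below is only an illustrative encoding. The variable `TOUCHDOWN` and the helper functions are hypothetical names, temperatures are in degrees Celsius, and the summed time covers hold steps only, ignoring ramping between temperatures.

```python
# Each stage: (number of cycles, [(temperature in deg C, hold time in seconds), ...]).
TOUCHDOWN = [
    (1, [(96, 120)]),                        # initial denaturation
    (3, [(96, 15), (72, 120)]),              # two-step cycles
    (3, [(96, 15), (69, 15), (72, 90)]),
    (3, [(96, 15), (66, 15), (72, 90)]),
    (26, [(96, 15), (63, 15), (72, 90)]),
]

def amplification_cycles(program):
    """Total cycles, skipping single-step stages such as the initial denaturation."""
    return sum(cycles for cycles, steps in program if len(steps) > 1)

def hold_seconds(program):
    """Sum of all hold times in seconds; ramp times are not included."""
    return sum(cycles * sum(seconds for _, seconds in steps)
               for cycles, steps in program)

def annealing_temperatures(program):
    """Annealing temperature (middle step) of each three-step stage, in order."""
    return [steps[1][0] for _, steps in program if len(steps) == 3]

print(amplification_cycles(TOUCHDOWN))
print(annealing_temperatures(TOUCHDOWN))
print(hold_seconds(TOUCHDOWN) / 60)
```

The annealing temperature falls by 3 degrees per stage before settling at the final value for the remaining cycles, which is what characterizes a touch-down program.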

Sequencing and data analysis
One microliter of the PCR products was treated with 0.2 µl of ExoSAP-IT (USB) at 37 °C for 15 minutes and subsequently at 80 °C for 15 minutes. Sequencing reactions were performed in 9 µl of reaction mixture consisting of 1 µl of the ExoSAP-IT-treated PCR products, 0.15 µl of 10 µM TCR reverse primer (hTCR-alpha-1st or hTCR-alpha-1st), 1.5 µl of 5x sequencing buffer, 1 µl of BigDye® Terminator v3.1, and 5.35 µl of water. The mixture was incubated at 96 °C for 1 minute followed by 25 cycles of 96 °C for 10 seconds and 62 °C for 1 minute. The sequences were determined with 3500 and 3500xL Genetic Analyzers (Applied Biosystems, USA). The repertoire of TCRα and TCRβ chains was analyzed by the IMGT/V-QUEST search tool (http://www.imgt.org/IMGT_vquest/vquest?livret=0&Option=humanTcR), and germline transcripts were identified by searching against the human genome sequences (BLAST search: http://blast.ncbi.nlm.nih.gov/). The presence of dual TCRs was detected by sequence analysis with Sequence Scanner v1.0 software (Applied Biosystems, USA). Individual TCRs were amplified by PCR with a single forward primer designed for each variable segment and the TCR reverse primer (hTCR-CA-R9 for TCRα chain or hTCR-CB1-R6 for TCRβ chain). Sequencing reactions and data analysis were performed as described above. If the PCR products showed the sequences of two alpha chains and one beta chain, or those of one alpha chain and two beta chains, the cell was considered to contain dual TCRs.

Supporting Information
Figure S1 Non-canonical rearrangement of the alpha chain. (TIF)
Figure S2 Conservation of CDR3α amino acid sequences created by TRAV1-2 and TRAJ31 rearrangements in early effector memory subset. CDR3α amino acid sequences created by TRAV1-2 and TRAJ31 rearrangements were identified by the IMGT/V-QUEST tool, and the conservation was analyzed by Multiple Align Show (http://www.bioinformatics.org/SMS/multi_align.html). Amino acid sequences having 100% identity and 50% similarity are shown in black and dark gray, respectively. (TIF)
Table S1 Nucleotide sequences of TCRα and β used in this study. (XLSX)