Early Relaxation Dynamics in the LC 13 T Cell Receptor in Reaction to 172 Altered Peptide Ligands: A Molecular Dynamics Simulation Study

The interaction between the T cell receptor and the major histocompatibility complex is one of the most important events in adaptive immunology. Although several different models for the activation process of the T cell via the T cell receptor have been proposed, it could not be shown that a structural mechanism, which discriminates between peptides of different immunogenicity levels, exists within the T cell receptor. In this study, we performed systematic molecular dynamics simulations of 172 closely related altered peptide ligands in the same T cell receptor/major histocompatibility complex system. Statistical evaluations yielded significant differences in the initial relaxation process between sets of peptides at four different immunogenicity levels.


Introduction
T cells (TC) do not recognize natural antigens but short peptides (called T cell epitopes) which are presented in the context of major histocompatibility complex (MHC) molecules on the surface of specialized antigen presenting cells (APCs) after uptake and processing. The interaction between T cell receptor (TCR), peptide, and MHC (TCRpMHC) is one of the most important processes in adaptive immunology. However, the detailed structural mechanism of T cell activation (TCA) is still unknown [1]. Several models for the mechanisms involved in TCR triggering have been proposed [2]: roughly they can be grouped in aggregation models, models based on conformational changes, and segregation models. It is known that T cells can recognize seemingly dissimilar epitopes [3] where sequence similarity alone is not sufficient to explain immunogenicity [4] suggesting that structural rearrangements [5], changes in heat capacity (which can be an indicator of conformational change or flexibility [6]), biochemical similarities [7], hydrophobicity, molecular weight and structural preferences [8] may play an additional important role in determining immunogenicity. Hence it is difficult to develop predictive methods for peptide immunogenicity and only few methods have been published. For example, POPI [9] and its extension POPISK [8] are based on physicochemical properties. Also the general alignment methods ALIGN [10] and PSI-BLAST [11] were used to align new peptides with known immunogenic ones [9]. In contrast, the binding affinity between peptide and MHC can be predicted with rather high accuracy [12] and therefore many predictive studies use the peptide binding affinity to MHC as an approximation for immunogenicity [13]. However, this does not provide the full picture: while binding affinity (usually ,500 nM [14]) is a prerequisite for immunogenicity, the magnitude of the pMHC affinity does not correlate well with the magnitude of immunogenicity ( [8] and therein references). The explanation for immunogenicity is rather to be found in a synergistic combination of TCRpMHC affinity, mean interaction time, and relative abundance of both complexes. However, to incorporate all these factors in a (structural) predictive model currently does not seem feasible.
Molecular dynamics (MD) [15] is a computational method to solve Newton's equations of motion for a given system of atoms. Various MD studies have been performed in relation to TCRpMHC interaction: Cuendet et al. investigated the dissociation of the TCR from the pMHC via a steered MD simulation and gave insight into the dissociation mechanism of two complexes, which differed by only a single amino acid mutation [16]. Yaneva et al. performed MD simulation studies of HLA-DR3 with and without invariant chain-associated peptide (CLIP) and found that larger conformational changes of alpha-helices flanking the MHC binding groove occur without CLIP [17]. Wan [27]. In this study they investigated the dissociation of the TCR on the basis of the reduction of hydrogen bonds, an increase of water in the interface and energetic changes. Narzi et al. used MD simulations to investigate the disease associated MHC alleles HLA-B*27:09 and HLA-B*27:05 with different viral and self-peptides [28]. In a previous study our group showed, via a combination of MD simulation, pMHC binding assays, and in vitro T cell activation assays, that an Nterminal peptide flanking region (PFR) of MHC class II can significantly influence the immunogenicity compared to the same peptide without the PFR [29]. Recently we could also show the molecular background of mug pollen Art v 1 [25][26][27][28][29][30][31][32][33][34][35][36] bound to HLA-DR1 and HLA-DR4 [30]. In another study we compared data of 3 altered peptides ligands (APLs) related to experimental allergic encephalomyelitis [31]. Since it is known that altered peptide ligands often induce alterations in the TCRpMHC interface [32] we extended our approach of investigating APLs via MD to a more systematic screening of 172 well described and strongly related TCRpMHC systems. For this purpose we performed a total of 192 MD simulations with a total length of 2 720 ns and found indications that more and less immunogenic complexes might have slightly different initial relaxation dynamics.

Experimental Data
We selected the protein data bank [33] identification (PDBid) 1mi5 as a structural basis for our study. It contains the crystal structure of LC13 TCR in combination with HLA-B*08:01 and the Epstein Barr Virus (EBV) peptide with the amino acid sequence FLRGRAYGL. We chose this system for two reasons. Firstly, the TCRpMHC structure has been determined and described as a whole (PDB-id 1mi5; [34]) as well as in its unliganded parts: TCR (PDB-id 1KGC; [35]) and pMHC (PDB-id 1M05; [36]). Secondly, Kjer-Nielsen et al. depict a systematic substitution study of all 9 amino acid positions in the peptide with the remaining 19 standard amino acids. For each of these mutations they provide the results [34] of a cytotoxicity assay [37] over a range of peptide concentrations. This yields 172 (20+8619) experimental immunogenicity values for APLs of the same TCR/MHC complex. This data set allows for a systematic and explorative comparison between effects induced by more and less immunogenic peptides.

Simplifications in the TCRpMHC Structure Avoided
Since the TCRpMHC is a large complex, many authors have only simulated the variable regions of the TCR, the epitope and the a1 and a2 domain for MHC class I, or the a1 and b1 domain for MHC class II. There is evidence that this reduction may be legitimate [38][39][40][41][42]. However, this view is not shared by everyone, see, for instance, [43]. In our study the aim is to track subtle changes in shape and dynamics. Hence we simulated the full TCRpMHC without any simplifications in the available TCRpMHC x-ray structure. This includes the constant regions of the TCR as well as the a3 region and b2 microglobulin of the MHC.

Construction of the Simulated Complexes
We modeled all 172 TCRpMHC complexes on the basis of PDB accession code 1mi5. These 172 complexes differ from each other only by one amino acid substitution in the peptide. We performed these amino acid side chain substitutions with SCWRL [44] and visually confirmed them with the substitution method of SPDBV [45], as this combination turned out to be the most appropriate way [46,47].

Molecular Dynamics Workflow
We performed MD simulations using Gromacs 4 [48] according to the following workflow. We immersed each modeled TCRpMHC structure into a three-dimensional explicit water cube with side length of 119 Å , allowing for a minimum distance of 20 Å between protein and box-boundary. Additionally, we applied periodic boundary conditions. Sub- sequently, we used a steepest descent method to minimize the energy of the system. In the next step we warmed the system up to 310 K. Finally, MD simulations were carried out for a simulation time of 10 ns for each complex (in total yielding a simulation time of 1720 ns) using bond constraints that allowed for an integration step of 3 fs [49,50]. Further simulation parameters were set to values derived in one of our previous studies [42].
To further investigate the behaviour of the simulations over a longer time period, we additionally performed 50 ns simulations of 20 TCRpMHC complexes. For this purpose we choose the complexes with mutations in position 7 of the peptide. This position is considered as pivotal for the recognition process by the TCR since the tyrosine present in the x-ray structure protrudes deep within a pocket created by CDR1alpha, CDR3alpha, and CDR3beta of the TCR [34]. Together with the 10 ns simulations this yields a total simulation time of 2 720 ns (106172+20650).

Regions of Interest within the TCR
The structure of the investigated TCRpMHC complex is shown in Figure 1. To systematically investigate the TCR we grouped all residues according to the secondary structure labelling and complementary determining regions (CDR) labelling from [35], as well as the secondary structure labelling provided by the program VMD [51]. In total we investigated 94 different residue groups (see Table 1 for a detailed list).

Root Mean Square Deviation (RMSD) Calculations
The RMSD is defined as where n is the number of atoms, i is the current atom, A is the target structure and B is the reference structure. We employed the g_rms function of Gromacs to calculate the backbone RMSD values for the regions of interest described in Table 1. Thereby we used two different fitting strategies for the simulation trajectories. Since we are interested in the deformations of the TCR we firstly fitted the model to pMHC to obtain insight into the overall movements of the TCR and, secondly, we fitted it to the TCR itself to investigate the relative deformations within the TCR. Note that these RMSD calculations yield the norm in Euclidean space between each single time step and a reference structure, which in our case was the configuration at time 0.

Statistical Evaluation
Although immunogenicity is a continuous variable and not a binary property, we had to create such a binary property, between more and less immunogenic peptides, by segregating the peptidesat four different concentration thresholds -into two classes: the more immunogenic peptides (''groupM''), and the less immunogenic peptides (''groupL'').
N Threshold 1 is determined at the upper limit of the experimental specificity assay. This means that all APLs which were able to induce 50% of maximum lysis at an arbitrary concentration within the range of the assay [34] are classified as groupM. This threshold is determined at 10 -5 M and yields 121 APLs (or 1210 ns of simulation time) in the groupM and 51 APLs (510 ns) in the groupL.    illustrated as a 1-bit image of significances (see result section). We conducted this procedure for the 4 thresholds determined above, both fitting strategies (see previous section) and p-value thresholds at a = 0.05, 0.01, and 0.001, respectively, leading to a total number of 7557600 unpaired t-tests.
It is obvious that p-values of this procedure cannot directly confirm statistical significance on the differences between the two classes due to the large number of tests. Therefore, we merely compute labels indicating whether p,a only, as an intermediate measure for differences. Statistical testing is relegated to one step further. Our new null hypothesis is that there are no consistent differences between groups and thus labels indicating p,a should occur only sporadically and about equally distributed over time and space. The alternative hypothesis is that differences show a systematic structure with a large number of adjacent labels, both in time and space.
We define three measures, applied to the entire map (image), expressing the extent to which the maps show such a systematic structure.
The first such systematicity measure is a reduction of ''salt and pepper noise'' while the edges of the image are preserved. For this purpose we used a 2-dimensional median filter employing a 363 window. This systematicity measure is defined as where I is the nR x nT image matrix of ''pixels'' (nR = number of regions, nT = number of time points), with each ''pixel'' being 1 if p,a in the t-test, 0 otherwise. The term medFilt is the median filter as described above. Note, that not all consecutive regions and residues are covalently bound. There are 6 exceptions at the end of the regions which are not adjacent to the beginning of the next region (region numbers 4, 12, 20, 39, 60, and 73). These regions cannot be expected to correlate with their non-adjacent neighbours in the region list of Table 1. Thus, we computed s 1 for 7 distinct sub-images of the 9463350 image and summed up the results. By this way only regions that show consistent adjacent differences in both space and time yield large values of s 1 . We also considered the aspect that the application to 7 sub-images may penalize boundary regions of the sub-images too strongly. Hence, we also implemented this method for the image as a whole. However, the results were almost identical (data not shown). Therefore we refer to the median-method as the method being applied to 7 subimages. The second, alternative, systematicity measure is defined as In other words, for each complete 363 window the sum of all ''pixels'' equal to 1 is squared, and the resulting values are summed up. This approach is also applied to the sub-images mentioned above. Thus, large values of s 2 are obtained only if adjacent regions show consistent differences in both space and time. This approach is referred to as the square-method.
In the third approach we did not apply any kind of filtering to the image and directly add up the ''pixels'' equal to 1: This approach is referred to as the direct-method. In order to finally test whether an observed systematicity value is significantly high, we approximate the distributions of s 1 , s 2 and s 3 under the null hypothesis ''there is no structure'' by taking 500 random permutation splits of all APLs for the thresholds 1 to 4 determined above. Note that this yields 500 permutations with 121 against 51 APLs, 500 permutations with 90 against 82 APLs, 500 permutations with 47 against 125 APLs, and 500 permutations with 33 against 139 APLs. These permutations were performed independent from position meaning that the APLs were chosen from any position and were not restricted to the grouping of their initial position in the peptide.
We then calculated the 95% percentile of the respective random distributions as critical values for the values of s 1 , s 2 and s 3 of the true splits. If that value is larger than the critical value then we can speak of a significant structure (in its entirety, not in any detail) and reject the null hypothesis.

Results and Discussion
Are Differences to be Found in Local Phenomena Rather than in Global Rearrangements?
As described in the methods section we first performed unpaired t-tests and created the corresponding difference images. The first major difference occurs between the 2 fitting strategies. It turns out that the fitting of the simulation trajectory to the TCR leads to a much sharper discrimination between the groups (data not shown) and therefore only this version is discussed in detail. In the TCR fitting version several regions are systematically highlighted as different between the groups (Figure 2). In particular, in several regions continuous ''lines'' of differences over a longer time period can be observed. This effect of fitting suggests that the spatial rearrangements during the very early relaxation process in reaction to different peptide immunogenicity classes consist of local phenomena within the TCR rather than of global rearrangements of the TCR. This is in agreement with the literature which proposes small shifts instead of major structural changes [32]. However, since the simulation time in MD simulations is limited (10 and 50 ns respectively) major structural changes occurring later cannot be disproven by this study.

''Lines'' of Differences Over Time Remain Visible Even for Smaller Alphas
We additionally applied p-value thresholds of a = 0.01 and a = 0.001 (Figure 2) to reduce the number of random hits. Even though the p-value threshold is reduced the rough positions of the ''lines'' over time remain visually (Figure 2 map plots), as well as in the sum over time for the different regions (Figure 2 bar plots). The same applies for the median-method (Figure 3).

Significance Testing: Evidence for Significant Difference between the TCRpMHC Groups for the Entirety of Regions
Following the methods described above in the methods section, we simulated the distribution of systematicity values s by creating 500 random permutation splits between the groups of TCRpMHC complexes. These distributions of the random splits are depicted as histogram in Figure 4, together with the 95% percentiles as critical values. At threshold 1 (10 -5 M) and using the direct-method only 4 out of 500 random permutations yielded a higher number of hits than the true split (p = 0.008). The results for the square-method were identical. For the median-method the result is marginally better (p = 0.006). At threshold 2 (10 -6 M) all 3 methods yield the true split as the one with the highest number of hits. Note that this does not necessarily mean that the true split has the most extreme differences of all possible combinations; it means that within these 500 random splits none was more extreme than the true split. At threshold 3 (10 -7 M) the results for the median and square-method were about equal (p = 0.016). The directmethods is marginally better in the discrimination (p = 0.014). At threshold 4 (10 -8 M) the direct-method narrowly reaches statistical significance (p = 0.046) while the median and square-method both narrowly fail the significance level (0.050 and 0.054 respectively). This could be caused by the relatively strong imbalance of 33 versus 139 TCRpMHC complexes for this threshold. However, one could still argue that there is a strong tendency. In Figure 5 the corresponding difference image of one such random permutation split is depicted for comparison with the apparent systematic structure observed for the true split. The images depict the structure before ( Figure 5AB) and after the median-method ( Figure 5CD) to illustrate the effect of noise reduction. The systematicity value for the true split is clearly above the critical value (see also Figure 4). Therefore, we can speak of the observed image pattern as being a significant one. In comparison, the random permutation split illustrated in Figure 5 yielded a systematicity value which corresponds to the median systematicity value of all 500 random permutation splits.

Most Frequently Different Regions
Among all 94 investigated regions in the TCR most regions with the largest number of differences over time form parts of the TCR beta chain. Out of the top 10 regions (the 10 regions with the highest amount of differences over time) for each of the 4 thresholds (i.e. 40 in total) 34 regions form parts of the TCR beta chain while 6 formparts of the TCR alpha chain ( Figure 3 and Table 1). Among the top 20 regions over all 4 thresholds (i.e. 80 in total) 67 regions formparts of the TCR beta chain while 13 form parts of the TCR alpha chain. These findings are in agreement with Armstrong et al. who suggested that the beta-chain may play an important role in ligand recognition [32]. In Figure 6 the spatial arrangement of the top 5 regions per threshold and method are illustrated.
If we further investigate clusters of differences, as for example visible in the bar plots of Figure 3, the results point in the same direction. Clear clusters with high amounts of differences over time are visible. In most combinations of methods and thresholds the major differences are in and around the area of the CDRbeta regions. Additional clusters of differences are visible in and around regions 76 and 90 which are both part of the constant TCR beta chain.
This finding is consistent with the known immunological background that the recognition of the pMHC is carried out with the CDRs [1]. For the LC13 TCR in complex with HLA-B*08 detailed investigations on the CDR region were previously reported by Borg et al. [52]. Here our findings partly disagree with their conclusion that both CDR3 regions are the hotspot for the 'energetic landscape' of the pMHC recognition and that changes in the CDR1 and 2 are mainly used as stabilizer for the ligated CDR3s. However, it was also shown by other authors for different TCRpMHC systems that CDR1 and 2 can directly contribute [53].
It was previously reported that the A-B loop (residues 129 to 136) of the constant domain of the TCR alpha chain represents a ''closed'' conformation in the unbound state and switches to an ''open'' conformation upon ligation [54]. It was further reported that antagonistic ligands have a differential ability to change the conformation of the A-B loop. Note, however, that our type of statistical testing only provides significance for the entirety of regions showing a systematic pattern (Figure 3) as defined in the direct-median-and square-methods. No immediate conclusions can be drawn for the statistical significance of single regions contained in this structure.

Evaluation of 20 Trajectories with a Simulation Length of 50 Ns
As mentioned in the methods section we performed an additional set of simulations for a real time of 50 ns each. In this test set we used all possible substitutions of position 7 in the peptide yielding 20 trajectories of the LC13 TCR in complex with HLA-B*08:01 presenting FLRGRAXGL (where X denotes for all 20 canonical amino acids).
Although there seems to be a tendency in the differences between the groups ( Figure 7) the application of our above described methodology for only 20 simulations (14 groupM against 6 groupL at a threshold of 10 -5 M) instead of 172 simulations did not yield statistically significant differences for the entirety of regions. However, this is an expectable result since the effect found in 172 trajectories was also not huge. Therefore it seems likely that 20 simulations are statistically underpowered for this type of analysis. Figure 7 shows the mean RMSD of the TCR alpha and beta chain for the 2 groups as well as the root mean square fluctuation (RMSF) of the CDRs. It can be seen that groupM and groupL differ in several time spans ( Figure 7AB) as well as in their flexibility ( Figure 7C-H).

How Structural Dynamics Could Influence T Cell Triggering
Given the above described differences one could ask how these differences could further trigger T cell activation: There is plethora of different hypotheses for T cell triggering available (reviewed in [2]). In some models mechanical forces and accompanying deformations play a central role. For example Ma et al. propose that the T cell is activated by pulling forces originating from the cytoskeleton that induce conformational changes in the TCR/ CD3 complex [55] [56]. In this model the weak binding between a TCR and a nonspecific pMHC can be ruptured without signalling while a specific pMHC bound to the same TCR triggers the T cell. By this way the TCR acts as an anisotropic mechanosensor [57]. The found differences in the spatial dynamics of the very early relaxation process between more and less immunogenic TCRpMHC complexes would be in line with the above described hypothesis that the TCR acts as a mechanosensor.

Conclusion
MD simulations could provide ultimate details concerning individual particle motions for many aspects of biomolecular functions [58]. They range from the role of solvent in protein dynamics [59] via the description of the structural pathway between the open and closed conformation of GroEL [60], the flexibility of a short peptide linker governing the activation of tyrosine kinases [61], to opening and closing of the long channel in acetylcholinesterase [62], to protein folding [63]. In this study we applied MD to 172 closely related APLs of a well described TCRpMHC system. This approach gave indications that the very early relaxation dynamics differ between groups of TCRpMHC complexes with different levels of peptide immunogenicity. On this basis we conclude that there is a correlation between peptide immunogenicity and the way TCRpMHC complexes start their relaxation process from the initial (perturbed) x-ray structure within the first 10 ns of simulation. It was already shown that different physicochemical properties are related to peptide immunogenicity [9] and that structural rearrangement during TCR binding contributes to T cell activation [6]. On this basis it is consequential that different physicochemical properties cause different structural dynamics and could be another level of T cell regulation. However, we also want to point out that the shortness of the simulations could have introduced a systematic bias, and large conformational changes taking place may not have been sampled by these simulations. Additionally an evaluation of a subset of 20 simulations with a length of 50 ns did show differences between the groups, however, they were not statistically significant, which might be caused by the too small sample size. Hence, further studies with several hundred of complexes with several hundred of nanoseconds each will be necessary to finally prove or disprove differences between more and less immunogenic TCRpMHC complexes.
It is known that the variable loops of the TCR undergo significant changes upon pMHC engagement leading to a loss of flexibility in the loops [64]. On this basis our study gives indications that these changes might differ on the basis of different peptide immunogenicity levels. Thereby our findings agree well with the proposed idea that the TCR functions as a mechanosensor [57] and the beta-chain may play an important role in ligand recognition [32]. On this basis our findings might have implications on the development of predictive methods, since the number of methods directly predicting immunogenicity of pMHC is limited (see Introduction).