Fig 1.
Statistical information on immunogenicity data.
(a) Schematic diagram of TCR–peptide-major histocompatibility complex (pMHC) recognition by CD8+ T-cells. (b) Uniform manifold approximation and projection (UMAP) projections of the predicted epitope-specific TCR clusters. (c) Sequence motifs of CDR3β representing the epitope-specific TCRs for the four protein epitopes. (d) Distribution of pMHC clonality in data from four donors using 10X Genomics. (e) Distribution of pMHC data from the VDJ database (VDJdb) and immune epitope database (IEDB). (f) 3D binding schematic of TCR-pHLA (from PDB ID: 6MTM). Created with BioRender.
Fig 2.
Workflow of THLANet for predicting T-cell receptor (TCR)-peptide-human leukocyte antigen (pHLA) interactions.
(a) The THLANet pipeline for predicting TCR-pHLA triad interactions. (b) Detailed architecture of THLANet: The training data exhibit a long-tail distribution. Protein sequences are processed through the ESM2 module and a convolutional neural network (CNN) to capture long-distance dependencies and local features. In parallel, the sequences are first encoded using the BLOSUM62 matrix and then embedded through a transformer encoder. The two feature matrices are fused via a bilinear attention network. In the prediction module, a multilayer perceptron (MLP)-based model predicts the interaction scores between TCR and pHLA. Created with BioRender.
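The fusion step in panel (b) can be illustrated with a minimal sketch of one common form of bilinear attention. This is not the THLANet implementation: the function name `bilinear_attention_fuse`, the single weight matrix `W`, and the softmax over all position pairs are assumptions made for illustration, and the caption does not specify the exact formulation used.

```python
import math

def bilinear_attention_fuse(X, Y, W):
    """Fuse two feature matrices with a toy bilinear attention map.

    X: n rows of length d (e.g. one branch's per-position features)
    Y: m rows of length d (e.g. the other branch's per-position features)
    W: d x d weight matrix (hypothetical; learned in a real model)
    Returns (attn, fused): an n x m attention map and a length-d vector.
    """
    d = len(W)
    # Bilinear score for every pair of positions: s_ij = x_i^T W y_j
    scores = [
        [sum(x[a] * W[a][b] * y[b] for a in range(d) for b in range(d)) for y in Y]
        for x in X
    ]
    # Softmax over all (i, j) pairs, shifted by the maximum for stability
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    attn = [[e / total for e in row] for row in exps]
    # Attention-weighted sum of elementwise products x_i * y_j
    fused = [
        sum(attn[i][j] * X[i][k] * Y[j][k]
            for i in range(len(X)) for j in range(len(Y)))
        for k in range(d)
    ]
    return attn, fused

# Toy inputs: two positions in X, one position in Y, identity weights
attn, fused = bilinear_attention_fuse(
    [[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]
)
```

In a trained model, `W` would be learned and the fused vector would feed the MLP prediction module described in the caption.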
Fig 3.
(a) ROC-AUC and PR-AUC on the test dataset including 10X Genomics data. (b) ROC-AUC and PR-AUC on the test dataset from public databases. (c) ROC-AUC and PR-AUC on the melanoma and gastrointestinal cancer datasets. (d) ROC-AUC and PR-AUC on the prolymphocytic leukemia dataset. (e) ROC-AUC and PR-AUC in the ablation experiment.
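For readers unfamiliar with the two metrics reported in these panels, the following is a minimal pure-Python sketch. It assumes binary labels (1 = binding, 0 = non-binding) and uses average precision as the PR-AUC estimate, which is a common convention but not necessarily the one used to produce the figure. The function names are illustrative.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outscores a
    random negative (ties count as one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Average precision: mean of the precision values at the rank of
    each positive. Tied scores are ranked in input order."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / n_pos

# Toy example with made-up scores (not data from the paper)
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(labels, scores))            # 0.75
print(average_precision(labels, scores))  # (1/1 + 2/3) / 2
```

PR-AUC is the more informative of the two when positives are rare, since ROC-AUC does not depend on the ratio of positives to negatives.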
Fig 4.
(a) ROC-AUC and PR-AUC for THLANet, PanPep, pMTnet, and TABR-BERT on the unseen COVID-19 dataset from ImmuneCODE (2876 TCR-pHLA pairs). (b) ROC-AUC and PR-AUC for THLANet and baseline models on the low-frequency HLA class I dataset. (c) Comparison of PR-AUC values obtained with PanPep, pMTnet, TABR-BERT, and THLANet for the 19 epitopes with more than ten binding T-cell receptors (TCRs) in the test dataset. A darker color and larger point size indicate a higher PR-AUC.
Fig 5.
Validation of the T-cell receptor–peptide-human leukocyte antigen (TCR-pHLA) network (THLANet) in identifying critical sites within the three-dimensional (3D) crystal structure.
(a) Alanine scanning mutagenesis showed that the complementarity-determining region 3 (CDR3) residues in the middle of the TCR sequence exhibited significant changes in the scores predicted by THLANet. The CDR3 sequence was divided into five equal-length segments for the alanine scanning analysis. (b-c) Predicted score changes for amino acid residues in the CDR3 of an example TCR–peptide-major histocompatibility complex (TCR-pMHC) structure (PDB ID: 8GON). The 3D structure of 8GON is illustrated: green, CDR3 of the TCRβ chain; magenta, TCRα chain; tints, other regions of the TCRβ chain; violet, antigen.
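The in silico alanine scan described in panel (a) can be sketched as follows. This is a hedged illustration, not the authors' code: `alanine_scan`, `segment_means`, and `toy_score` are hypothetical names, and `toy_score` is a made-up stand-in for a trained predictor such as THLANet that simply weights central positions more heavily. The example CDR3 string is illustrative and is not taken from 8GON.

```python
def alanine_scan(cdr3, score_fn):
    """Return, per residue, the drop in predicted score when that
    residue is replaced by alanine."""
    base = score_fn(cdr3)
    deltas = []
    for i, aa in enumerate(cdr3):
        if aa == "A":
            deltas.append(0.0)  # already alanine: the mutant is identical
            continue
        mutant = cdr3[:i] + "A" + cdr3[i + 1:]
        deltas.append(base - score_fn(mutant))
    return deltas

def segment_means(values, n_segments=5):
    """Average values over n_segments contiguous segments of (nearly)
    equal length; lengths differ by at most one when the sequence
    length is not divisible by n_segments."""
    n = len(values)
    means = []
    for s in range(n_segments):
        chunk = values[s * n // n_segments:(s + 1) * n // n_segments]
        means.append(sum(chunk) / len(chunk) if chunk else 0.0)
    return means

def toy_score(seq):
    """Hypothetical scorer: non-alanine residues contribute more the
    closer they are to the centre of the sequence."""
    centre = (len(seq) - 1) / 2
    return sum(1.0 / (1.0 + abs(i - centre))
               for i, aa in enumerate(seq) if aa != "A")

deltas = alanine_scan("CASSLGQGYEQYF", toy_score)
means = segment_means(deltas)
print(means.index(max(means)))  # index of the most sensitive segment
```

With a real model, `score_fn` would wrap the predictor with the peptide and HLA held fixed, so that each delta reflects the effect of a single CDR3 substitution.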