Figure 1.
Volume representation of the MHC H-2Kb peptide binding cleft derived from a crystal structure model (PDB-ID [63]: 1vac [64]) with the protein-bound epitope SIINFEKL.
Numbers within dashed boxes correspond to sequence positions in the respective epitope. Red boxes and the associated amino acid codes indicate anchor positions and preferred amino acid composition, respectively. The yellow box indicates the secondary anchor at position 3, according to the H-2Kb canonical sequence motif.
Figure 2.
The Immune Epitope Database (IEDB) served as the data source [14]. The murine H-2Kb allele and a peptide length of eight were selected because of good data availability and their character as a reference model. After removal of redundant and ambiguous entries, the core set exhibits similar proportions of positive (636) and negative (526) examples. The core set contained binders and non-binders that agree fully, partially, or not at all with the canonical sequence motif.
Figure 3.
Workflow for the development of the cascaded machine-learning model.
ANN: feed-forward artificial neural network, SVM: support vector machine. AAFREQ, BINAATYPE, BINPEP, PEPCATS, PPCA and PPCALI correspond to the peptide descriptors used (cf. Methods). TP, FP, FN and TN correspond to the entries of a confusion table: true positives, false positives, false negatives and true negatives.
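As an illustration of the confusion table entries named above, the following minimal Python sketch counts TP, FP, FN and TN for binary binder/non-binder labels. The function name `confusion_table` and the example labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def confusion_table(y_true, y_pred):
    """Count TP, FP, FN and TN for binary labels (1 = binder, 0 = non-binder)."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["TP"] += 1  # binder predicted as binder
        elif truth == 0 and pred == 1:
            counts["FP"] += 1  # non-binder predicted as binder
        elif truth == 1 and pred == 0:
            counts["FN"] += 1  # binder predicted as non-binder
        else:
            counts["TN"] += 1  # non-binder predicted as non-binder
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}


# Hypothetical labels, for illustration only.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
table = confusion_table(y_true, y_pred)
print(table)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```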
Figure 4.
Architecture of the best-performing cascaded machine-learning model, based on six first-stage classifiers derived from three different descriptor sets and two learning schemes (ANNs, SVMs), and a jury neural network containing three hidden neurons.
The model delivers a prediction score from the interval [0,1[, with high values indicating MHC-I H-2Kb binding.
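The jury stage described in this caption can be sketched as a small feed-forward pass: six first-stage scores enter a hidden layer of three neurons, and a single output neuron yields the prediction score. The sigmoid activations, all weights and biases, and the name `jury_score` are assumptions made for illustration; they are not the trained parameters or the implementation of the published model.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real input into the unit interval."""
    return 1.0 / (1.0 + math.exp(-x))


def jury_score(first_stage_scores, hidden_weights, hidden_biases,
               output_weights, output_bias):
    """Forward pass: six inputs -> three hidden neurons -> one output score."""
    hidden = [
        sigmoid(sum(w * s for w, s in zip(weights, first_stage_scores)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical parameters (NOT the trained model): three hidden neurons,
# each with one weight per first-stage classifier.
hidden_weights = [[0.5] * 6, [0.3] * 6, [0.2] * 6]
hidden_biases = [-1.0, -0.5, 0.0]
output_weights = [1.0, 1.0, 1.0]
output_bias = -1.5

# Hypothetical first-stage outputs for one likely binder and one likely non-binder.
high_scores = [0.9, 0.8, 0.85, 0.7, 0.95, 0.6]
low_scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.4]

score_high = jury_score(high_scores, hidden_weights, hidden_biases,
                        output_weights, output_bias)
score_low = jury_score(low_scores, hidden_weights, hidden_biases,
                       output_weights, output_bias)
print(score_high, score_low)
```

With these illustrative positive weights, higher first-stage scores lead to a higher jury score, mirroring the convention that high values indicate binding.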
Figure 5.
(a) Distribution of prediction scores for sets of n octapeptides containing a SIINFEKL fragment. High scores indicate MHC-I H-2Kb binding predictions, while lower scores indicate non-binding predictions. Scores were computed with the final jury prediction model. (b) Binding of synthesized SIINFEKL fragments to an MHC-I H-2Kb:IgG1 fusion protein relative to the average binding of the positive reference (SIINFEKL, 100%) and the unloaded fusion protein (NoLigand, 0%). Bars correspond to the arithmetic mean of quadruplicate measurements, with error bars depicting the standard deviation.
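The normalization in panel (b) can be sketched as a two-point rescaling in which the mean NoLigand signal maps to 0% and the mean SIINFEKL signal maps to 100%. The raw readings below are invented for illustration, the function name `relative_binding` is hypothetical, and the use of the sample standard deviation is an assumption, since the caption does not specify which estimator was used.

```python
from statistics import mean, stdev


def relative_binding(signal, positive_mean, negative_mean):
    """Rescale a raw signal so that NoLigand = 0% and SIINFEKL = 100%."""
    return 100.0 * (signal - negative_mean) / (positive_mean - negative_mean)


# Hypothetical quadruplicate raw readings (arbitrary units), not measured data.
siinfekl = [2.0, 2.1, 1.9, 2.0]   # positive reference
no_ligand = [0.5, 0.4, 0.6, 0.5]  # unloaded fusion protein
fragment = [1.1, 1.3, 1.2, 1.4]   # an example peptide fragment

pos_mean = mean(siinfekl)
neg_mean = mean(no_ligand)

# Normalize each replicate, then summarize as mean and standard deviation.
relative = [relative_binding(x, pos_mean, neg_mean) for x in fragment]
bar_height = mean(relative)
error_bar = stdev(relative)  # sample standard deviation (assumption)
print(f"{bar_height:.1f}% +/- {error_bar:.1f}%")
```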
Figure 6.
(a) Melting curves of peptide-MHC-I (H-2Kb:IgG fusion protein) complexes depicting the normalized fluorescence F in relative fluorescence units (RFU) for SIINFEKL and NoLigand as positive (red line) and negative (black line) controls, an exemplary epitope fragment (INFE) showing no melting point shift (grey line), and all epitope fragments (IINFEKL, SIINF, INFEKL, SIINFEK, SIINFE) leading to a significant melting point shift (green lines). (b) Analogously, the first derivative of F (dF) reveals the melting points as local minima, with Tm denoting the presumed MHC-I heavy chain melting point in the absence of peptide ligand (NoLigand).
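The readout in panel (b) can be sketched numerically: differentiate the melting curve with respect to temperature and report the temperatures at which the derivative has a local minimum. The synthetic sigmoidal curve, its midpoint of 55 °C, and the function names are assumptions for illustration only; real fluorescence data would typically need smoothing before differentiation.

```python
import math


def first_derivative(temps, signal):
    """Central-difference estimate of dF/dT at the interior points."""
    return [
        (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(temps) - 1)
    ]


def melting_points(temps, signal):
    """Temperatures at which dF/dT has a strict local minimum."""
    d = first_derivative(temps, signal)
    inner_temps = temps[1:-1]  # temperatures matching the derivative values
    return [
        inner_temps[i]
        for i in range(1, len(d) - 1)
        if d[i] < d[i - 1] and d[i] < d[i + 1]
    ]


# Synthetic melting curve (NOT measured data): fluorescence decreasing
# sigmoidally with a hypothetical midpoint at 55 degrees Celsius.
temps = [30.0 + 0.5 * i for i in range(101)]  # 30.0 ... 80.0
signal = [1.0 / (1.0 + math.exp((t - 55.0) / 2.0)) for t in temps]

tm_estimates = melting_points(temps, signal)
print(tm_estimates)
```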
Table 1.
Significance analysis examining quadruplicate binding measurements of SIINFEKL fragments.