Table 1.
Dataset Overview
Figure 1.
Comparability of the Binding Affinities between Assays
(A) Scatter plot comparing measured affinities for peptides to MHC recorded in the Buus (y-axis) and Sette (x-axis) assay systems.
(B) The agreement between experimental classifications of peptides as binders/nonbinders at different affinity thresholds (x-axis) is measured by the Matthews correlation coefficient (y-axis). The dashed lines indicate the IC50 = 500 nM cutoff commonly used for classifying peptides into binders and nonbinders, which is also used in the ROC analysis.
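The agreement measure in panel (B) can be illustrated with a minimal sketch. It assumes two parallel lists of IC50 values in nM for the same peptides, one per assay system. The function names and the convention that a peptide is a binder when its IC50 lies below the threshold are illustrative assumptions, not the authors' code.

```python
import math

def matthews_cc(labels_a, labels_b):
    """Matthews correlation coefficient between two binary label lists."""
    pairs = list(zip(labels_a, labels_b))
    tp = sum(1 for a, b in pairs if a and b)
    tn = sum(1 for a, b in pairs if not a and not b)
    fp = sum(1 for a, b in pairs if not a and b)
    fn = sum(1 for a, b in pairs if a and not b)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when one class is empty and the MCC is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def agreement_at_threshold(ic50_x, ic50_y, threshold_nm):
    """Classify each peptide in both assays at one threshold, then compare."""
    binders_x = [v < threshold_nm for v in ic50_x]  # assumed: binder if IC50 < threshold
    binders_y = [v < threshold_nm for v in ic50_y]
    return matthews_cc(binders_x, binders_y)
```

Calling `agreement_at_threshold` over a range of thresholds would yield a curve of the kind the panel describes.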
Figure 2.
ARB, SMM, and ANN Predictions for HLA-A*0201
The first three panels depict scatter plots of the predicted binding scores (x-axis) against the measured binding affinities (y-axis) of 3,089 9-mer peptides to HLA-A*0201. The predictions were obtained in five-fold cross-validation using the ARB, SMM, and ANN prediction methods, respectively. In each plot, a linear regression on a logarithmic scale was performed, and the corresponding regression equation and r² value are given. The bottom right panel contains an ROC analysis of the same data, evaluating how well the three methods can classify peptides into binders (IC50 < 500 nM) and nonbinders. The AUC, which summarizes prediction quality, is given for each method.
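As a rough illustration of the regression step, the sketch below fits a least-squares line to log-transformed values and reports r². It assumes that both predicted and measured values are positive and that both axes are log10-transformed; the caption does not specify the transformation, so that choice and the function name are assumptions.

```python
import math

def log_linear_fit(predicted, measured):
    """Least-squares fit of log10(measured) on log10(predicted).

    Returns (slope, intercept, r_squared). Assumes all values are > 0
    and that the predictions are not all identical.
    """
    xs = [math.log10(v) for v in predicted]
    ys = [math.log10(v) for v in measured]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, r^2 is the squared Pearson correlation.
    r_squared = (sxy ** 2) / (sxx * syy) if syy else 0.0
    return slope, intercept, r_squared
```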
Table 2.
Overview of Prediction Performance as Measured by AUC Values
Figure 3.
Prediction Performance as a Function of Training Set Size
For all datasets for which predictions with all three methods could be made, the AUC values obtained with the three prediction methods are included in the graph (y-axis). The x-axis gives the number of peptide affinities in each training set.
Figure 4.
Syfpeithi and Bimas Predictions for HLA-A*0201
The top two panels contain scatter plots of the predicted binding scores (x-axis) against the measured binding affinities (y-axis) for all 3,089 9-mer peptides binding to HLA-A*0201 in our database. Neither Bimas nor Syfpeithi predicts IC50 values; instead, both have output scales in which high scores indicate good binding candidates. Therefore, the regression curves are inverted. The bottom panel contains an ROC analysis of the same data with the classification cutoff of 500 nM.
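Because the score direction differs between tools, an ROC analysis has to be told which end of the scale indicates binding. The sketch below computes the AUC as the fraction of binder/nonbinder pairs that are ranked correctly, which is equivalent to the area under the ROC curve. The function name and the `higher_is_better` flag are illustrative assumptions.

```python
def roc_auc(measured_ic50, scores, cutoff_nm=500.0, higher_is_better=True):
    """AUC for separating binders (measured IC50 < cutoff) from nonbinders.

    Set higher_is_better=True when high scores indicate good binders,
    and False when the score is itself a predicted IC50 (lower is stronger).
    """
    pos = [s for m, s in zip(measured_ic50, scores) if m < cutoff_nm]
    neg = [s for m, s in zip(measured_ic50, scores) if m >= cutoff_nm]
    if not pos or not neg:
        raise ValueError("need at least one binder and one nonbinder")
    sign = 1.0 if higher_is_better else -1.0
    correct = 0.0
    for p in pos:
        for n in neg:
            if sign * p > sign * n:
                correct += 1.0   # binder ranked above nonbinder
            elif p == n:
                correct += 0.5   # ties count as half
    return correct / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of peptides; for a few thousand peptides this is still quick, and a rank-based formulation could replace it if needed.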
Table 3.
Prediction Quality of Tools Available Online
Figure 5.
Scheme to Integrate Prediction Methods
Shown is a prediction framework providing a common interface to different prediction methods to generate new tools and retrieve predictions from them. A prediction method has to accept a set of peptides with measured affinities with which it can train a new prediction tool. It returns the URI of the new tool to the evaluation server. Using the URI, the evaluation server can check the state of the new tool to see if training is still ongoing or if an error occurred during training. Once training is completed, the tool has to accept a set of peptide sequences and return predicted affinities for them. The format for the data exchanged in each of these steps is defined in an XML schema definition (.xsd file), available at http://mhcbindingpredictions.immuneepitope.org.
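The train, check-state, and predict sequence described above can be sketched in memory as follows. This is a toy under stated assumptions: the class and method names, the state strings ("training", "ready", "error"), and the example.org URI are hypothetical, the "model" simply predicts the mean training affinity, and the actual exchange format is the one defined in the .xsd file, which is not reproduced here.

```python
import itertools
import time

class ToyPredictionMethod:
    """Hypothetical stand-in for a prediction method behind the common interface."""

    def __init__(self, base_uri="http://example.org/tools/"):  # placeholder URI
        self._base_uri = base_uri
        self._ids = itertools.count(1)
        self._tools = {}

    def train(self, training_data):
        """Accept (peptide, affinity) pairs and return the URI of the new tool."""
        uri = f"{self._base_uri}{next(self._ids)}"
        if not training_data:
            self._tools[uri] = {"state": "error", "mean": None}
        else:
            mean = sum(aff for _, aff in training_data) / len(training_data)
            self._tools[uri] = {"state": "ready", "mean": mean}
        return uri

    def state(self, uri):
        """Report whether the tool is still training, ready, or failed."""
        return self._tools[uri]["state"]

    def predict(self, uri, peptides):
        """Return one predicted affinity per peptide (toy: the training mean)."""
        tool = self._tools[uri]
        if tool["state"] != "ready":
            raise RuntimeError(f"tool at {uri} is not ready")
        return [tool["mean"] for _ in peptides]

def run_evaluation(method, training_data, test_peptides, max_polls=10, wait_s=0.0):
    """Evaluation-server side: train, poll the tool state, then request predictions."""
    uri = method.train(training_data)
    state = method.state(uri)
    polls = 0
    while state == "training" and polls < max_polls:
        time.sleep(wait_s)
        state = method.state(uri)
        polls += 1
    if state != "ready":
        raise RuntimeError(f"tool at {uri} ended in state {state!r}")
    return method.predict(uri, test_peptides)
```

Keeping the tool addressable by URI lets the evaluation side treat every method identically, whatever runs behind it.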