PepLM-GNN: A graph neural network framework leveraging pre-trained language models for peptide-protein binding prediction

doi:10.1371/journal.pcbi.1014084


Fig 3

Comparison of PepPI predictions based on cluster-split datasets for predicting novel peptides, proteins, and peptide-protein pairs.

Error bars and all reported performance metrics show the mean ± standard deviation across the five folds of five-fold cross-validation. The cluster-split (cold-start) datasets are constructed with the CD-HIT clustering algorithm at four thresholds (0.6, 0.7, 0.8, 0.9), following the CAMP strategy: no entities from the same cluster appear in both the training and test sets. This yields three sub-datasets (“novel peptides”, “novel proteins”, “novel binding pairs”).

doi: https://doi.org/10.1371/journal.pcbi.1014084.g003
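The cluster-split construction described in the caption can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes cluster assignments have already been computed (for example by CD-HIT at one of the thresholds) and are available as plain dictionaries. The names `Pair`, `assign_folds`, and `cluster_split` are hypothetical, and dropping mixed pairs in the "novel pair" setting is one plausible reading of the cold-start rule rather than a confirmed detail of the CAMP strategy.

```python
import random
from collections import namedtuple

# Hypothetical record for one labelled peptide-protein pair.
Pair = namedtuple("Pair", ["peptide", "protein", "label"])


def assign_folds(cluster_ids, n_folds=5, seed=0):
    """Map each distinct cluster id to a fold index, so whole clusters stay together."""
    clusters = sorted(set(cluster_ids))
    random.Random(seed).shuffle(clusters)
    return {c: i % n_folds for i, c in enumerate(clusters)}


def cluster_split(pairs, pep_cluster, prot_cluster, setting, fold,
                  n_folds=5, seed=0):
    """Return (train, test) for one fold of a cluster-level (cold-start) split.

    pep_cluster / prot_cluster: dicts mapping sequence id -> cluster id
    (assumed to be precomputed by a clustering tool).
    setting: "novel_peptide", "novel_protein", or "novel_pair".
    """
    pep_fold = assign_folds(pep_cluster.values(), n_folds, seed)
    prot_fold = assign_folds(prot_cluster.values(), n_folds, seed + 1)
    train, test = [], []
    for p in pairs:
        pep_held_out = pep_fold[pep_cluster[p.peptide]] == fold
        prot_held_out = prot_fold[prot_cluster[p.protein]] == fold
        if setting == "novel_peptide":
            (test if pep_held_out else train).append(p)
        elif setting == "novel_protein":
            (test if prot_held_out else train).append(p)
        elif setting == "novel_pair":
            if pep_held_out and prot_held_out:
                test.append(p)
            elif not pep_held_out and not prot_held_out:
                train.append(p)
            # Mixed pairs (only one side held out) are dropped: an assumption
            # made here so that no cluster appears on both sides of the split.
        else:
            raise ValueError(f"unknown setting: {setting}")
    return train, test
```

Calling `cluster_split(pairs, pep_cluster, prot_cluster, "novel_peptide", fold)` for `fold` in `range(5)` gives five train/test partitions in which no peptide cluster is shared between the two sides. Per-fold metrics can then be summarized as mean ± standard deviation.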