Fig 1.
The computational framework of GPSuc.
Fig 2.
Sequence logos illustrating amino acid occurrence in the sequences surrounding succinylation sites (http://www.twosamplelogo.org/).
Nine species were used: H. sapiens, H. capsulatum, M. musculus, E. coli, M. tuberculosis, T. gondii, S. cerevisiae, S. lycopersicum, and T. aestivum.
Fig 3.
Distribution of AAF in the sequences surrounding succinylation (gray) and non-succinylation (red) sites for nine species.
The columns represent AAF, while the rows show each of the amino acid residues.
Table 1.
AUC values of different combinations of feature scores on the training and test datasets for the generic predictor.
Table 2.
Performance of generic and species-specific succinylation site prediction on the training dataset.
Table 3.
Performance of existing generic tools on the test dataset.
Fig 4.
Performance evaluation using each of the five single features and the ‘combined model’ for predicting succinylation sites in nine species.
Gray represents the AUC values for the training dataset, while red shows those for the test dataset. ‘Combined’ indicates the performance of the five encoding features combined. The final H. sapiens model was given as a linear combination of the five features (AAC, AAindex, binary, PSSM, and pCKSAAP) with LR coefficient values of 0.142, 1.566, 0.665, 0.342 and 0.667, respectively. In the same way, the combined models for H. capsulatum, M. musculus, E. coli, M. tuberculosis, S. cerevisiae, T. gondii, S. lycopersicum and T. aestivum were given with coefficients (0.102, 0.466, 0.462, 0.242 and 1.367), (0.155, 1.077, 0.575 and 0.761), (0.121, 0.473, 0.763, 0.230 and 1.214), (0.127, 0.358, 0.404, 0.109 and 1.066), (0.320, 0.391, 0.553, 0.182 and 1.122), (0.117, 0.331, 0.734, 0.139 and 1.014), (0.113, 0.417, 0.818, 0.103 and 1.172), and (0.112, 0.462, 0.723, 0.164 and 1.299), respectively. The LR constant terms for each species were set to zero.
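The linear combination described for the H. sapiens model can be illustrated with a minimal sketch. It assumes that each of the five encodings yields one numeric prediction score per candidate site; the function name `combined_score`, the dictionary layout, and the example scores are hypothetical and not taken from GPSuc. Only the coefficient values and the zero constant term come from the caption.

```python
# H. sapiens LR coefficients as listed in the caption, keyed by feature name.
H_SAPIENS_COEFFS = {
    "AAC": 0.142,
    "AAindex": 1.566,
    "binary": 0.665,
    "PSSM": 0.342,
    "pCKSAAP": 0.667,
}


def combined_score(feature_scores, coeffs=H_SAPIENS_COEFFS, intercept=0.0):
    """Return the weighted sum of per-feature scores (hypothetical helper).

    feature_scores: dict mapping each feature name to its score for one site.
    intercept: LR constant term; the caption states it was set to zero.
    """
    missing = set(coeffs) - set(feature_scores)
    if missing:
        raise KeyError(f"missing feature scores: {sorted(missing)}")
    return intercept + sum(coeffs[name] * feature_scores[name] for name in coeffs)


if __name__ == "__main__":
    # Made-up scores for a single candidate lysine site (illustration only).
    example = {"AAC": 0.4, "AAindex": 0.7, "binary": 0.5, "PSSM": 0.6, "pCKSAAP": 0.8}
    print(combined_score(example))
```

Whether the combined value is used directly or passed through a logistic function before thresholding is not stated in the caption, so the sketch stops at the weighted sum.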
Fig 5.
ROC curves of the nine species-specific predictors of GPSuc.
(A) Training dataset performance over 10-fold cross-validation. (B) Test dataset performance.
Table 4.
Performance comparison of species-specific predictors on the test dataset.