Figure 1.
The analytical flowchart of UbSite.
Table 1.
The statistics of non-homologous training data and independent test data for ubiquitylation and non-ubiquitylation sites.
Figure 2.
The detailed process of generating the position-specific scoring matrix (PSSM) and encoding an amino acid sequence fragment with the generated PSSM.
Figure 3.
The position-specific amino acid composition, accessible surface area, and secondary structure of ubiquitin-conjugated lysines and non-ubiquitin-conjugated lysines.
Figure 4.
The hypothetical model for identifying the distant sequence features for E3 recognition.
Figure 5.
The statistically significant amino acid composition at each position in the window from −20 to +20.
According to the F-score, positions −16, −10, −3, −1, +1, +5, +10, +13, and +17 have higher values and are significant for differentiating ubiquitylation sites from non-ubiquitylation sites.
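The captions do not define the F-score. As a minimal sketch, assuming the Fisher-style F-score commonly used for feature selection (between-class separation of the means divided by the sum of the within-class sample variances), the score of a single feature could be computed as follows. The function name `f_score` and the toy values are hypothetical and are not taken from the UbSite data.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """Fisher-style F-score of one feature (assumed definition, see lead-in).

    pos: feature values observed at ubiquitylation sites
    neg: feature values observed at non-ubiquitylation sites
    Each list needs at least two values for the sample variance.
    """
    overall = mean(pos + neg)
    # Numerator: how far each class mean lies from the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Denominator: sample variances (n - 1 in the divisor) of both classes.
    within = variance(pos) + variance(neg)
    return between / within if within > 0 else 0.0


# Toy feature, e.g. a 0/1 indicator for one residue at one window position.
pos_values = [1.0, 1.0, 0.0, 1.0]
neg_values = [0.0, 0.0, 1.0, 0.0]
score = f_score(pos_values, neg_values)
```

Under this assumed definition, a larger score means the feature separates the two classes better, so the features at each window position can be ranked by their scores.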
Figure 6.
The statistically significant evolutionary information of amino acids at each position in the window from −20 to +20.
According to the F-score, positions −19, −17, −15, −12, −10, −4, −1, +5, +9, +13, +15, and +18 have higher values and are significant for differentiating ubiquitylation sites from non-ubiquitylation sites.
Figure 7.
The predictive performance of models trained with different window lengths, from 11-mer to 41-mer.
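To illustrate what a window length means here, the sketch below extracts fixed-length fragments centered on each lysine (K) in a protein sequence: a half-width of 5 gives an 11-mer (positions −5 to +5) and a half-width of 20 gives a 41-mer (positions −20 to +20). The function name `lysine_windows`, the `-` padding character used near the sequence termini, and the example sequence are assumptions made for illustration; the captions do not say how terminal lysines are handled.

```python
def lysine_windows(sequence, half_width, pad="-"):
    """Return (1-based position, fragment) for every lysine in `sequence`.

    Each fragment has length 2 * half_width + 1 with the lysine at its center.
    Positions beyond the sequence ends are filled with `pad` (an assumption).
    """
    padded = pad * half_width + sequence + pad * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Index i in `sequence` sits at i + half_width in `padded`,
            # so the window starts at index i in `padded`.
            fragment = padded[i : i + 2 * half_width + 1]
            windows.append((i + 1, fragment))
    return windows


# Hypothetical short sequence, 5-mer windows (half_width = 2) for readability.
example = lysine_windows("MKTAYIAKQR", 2)
```

Varying `half_width` from 5 to 20 would produce the 11-mer to 41-mer fragments on which the compared models are trained.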
Table 2.
The cross-validation predictive performance of models trained with various features.
Table 3.
Comparison between our method (UbSite) and other ubiquitylation prediction tools.