MFPSP: Identification of fungal species-specific phosphorylation site using offspring competition-based genetic algorithm

doi:10.1371/journal.pcbi.1012607

Fig 1.

MFPSP workflow.

A: sequence collection and redundancy reduction. B: feature representation by physicochemical and embedding methods. C: feature selection using a genetic algorithm with offspring competition, followed by model construction.

Table 1.

Numbers of proteins and phosphorylation S/T/Y sites for each organism.

Fig 2.

Comparison of four feature selection strategies for S. cerevisiae S phosphorylation sites.

OriDi: original feature dimension.

Fig 3.

Accuracy of models constructed with embedding features for different k-mer lengths and window sizes.

Fig 4.

Performance comparison of feature selection strategies for S. cerevisiae S phosphorylation sites.

Table 2.

Prediction performance for fungal phosphorylation S/T/Y sites in seven organisms.

Fig 5.

Feature intersection and sequence patterns for S. cerevisiae S site of five fungal species.

The enrichment and depletion bias of amino acids was calculated with Two-Sample-Logos (http://www.twosamplelogo.org/).

Fig 6.

Feature importance, contribution and dependency analysis.

A: the 20 most important features. B: summary plot of feature value contributions. The x-axis shows the SHAP value, which measures the impact a feature had on the model’s output. C–H: SHAP dependence plots, which show the effect of a single feature on the model and its interaction effects with other features.

Table 3.

Performance comparison of MFPSP with existing predictors on independent test data.
