Fig 1.
The overall architecture of SuperEdgeGO.
Stage I. The input protein sequence is passed to the protein language model ESM-2 to generate the feature matrix, and to the protein structure model AlphaFold2 to predict the structure, which is then converted into the adjacency matrix. Stage II. The two matrices are fed into a model consisting of three graph attention layers, a pooling layer, and a fully connected classifier. In particular, the graph attention layer contains both unsupervised and supervised attention modules. Stage III. The model is optimized by minimizing two losses: the main task loss, arising from incorrect prediction of GO terms, and the self-supervised loss, arising from the deviation of attention scores from the binary labels indicating the presence of edges.
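The two-term objective of Stage III can be illustrated with a minimal sketch. It assumes that both terms are binary cross-entropy losses and that they are combined as a weighted sum; neither the loss form nor the weighting is stated in the caption. The function names and the `weight` parameter are hypothetical and chosen only for illustration.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0) for saturated predictions.
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def combined_loss(go_probs, go_labels, attn_scores, edge_labels, weight):
    """Main task loss plus a weighted self-supervised attention loss.

    go_probs / go_labels: predicted probabilities and 0/1 labels for GO terms.
    attn_scores / edge_labels: attention scores in [0, 1] for node pairs and
    0/1 labels marking whether an edge is present in the adjacency matrix.
    weight: hypothetical trade-off coefficient (an assumption of this sketch).
    """
    main_loss = binary_cross_entropy(go_probs, go_labels)
    attention_loss = binary_cross_entropy(attn_scores, edge_labels)
    return main_loss + weight * attention_loss
```

In this sketch, attention scores that match the edge labels contribute almost nothing to the total, so the second term only penalizes attention placed on absent edges or withheld from present ones.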
Table 1.
Hyperparameter ranges used in the comparison experiments.
Table 2.
Performance of SuperEdgeGO under the four supervised attention strategies.
Fig 2.
-
three-dimensional diagram.
Fig 3.
Model performance under different hyperparameter settings.
(a) The model achieves its optimal results when =0.01; (b) The model achieves its optimal results when the dropout rate is set to 0.2.
Fig 4.
The execution time of SuperEdgeGO and other baseline methods on the (a) MF-GO terms, (b) BP-GO terms, and (c) CC-GO terms of the Human dataset.
Note that the evaluation was conducted on an NVIDIA GeForce RTX 4090 GPU, and execution times may vary depending on the experimental settings.
Table 3.
Performance comparison of SuperEdgeGO and other baseline methods on the Human dataset.
Table 4.
Performance comparison of SuperEdgeGO and other baseline methods on the MF-GO category of the cross-species dataset.
Table 5.
Performance comparison of SuperEdgeGO and other baseline methods on the BP-GO category of the cross-species dataset.
Table 6.
Performance comparison of SuperEdgeGO and other baseline methods on the CC-GO category of the cross-species dataset.
Table 7.
Results of the ablation experiments on the human protein dataset.
Fig 5.
Four strategies to generate the supervised attention score.