Fig 1.

(A) The complete graph is constructed from the protein tertiary structure; its adjacency matrix is derived from the inter-residue distance matrix. (B) Raw node features consist of a distance-based feature xv and an angle-based feature xa.
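
As a rough illustration of panel (A), the sketch below builds a pairwise distance matrix from residue coordinates and turns it into a weighted adjacency matrix for a complete graph. The caption does not state how distances are mapped to edge weights, so the `scale / (scale + d)` weighting, the `scale` parameter, and both function names are assumptions made for this example only.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates (e.g. C-alpha atoms)."""
    return [[math.dist(p, q) for q in coords] for p in coords]

def adjacency_from_distances(dist, scale=4.0):
    """Hypothetical weighting: closer residues get larger edge weights.

    Every pair of distinct residues gets a nonzero weight (complete graph);
    the diagonal is set to zero so there are no self-loops.
    """
    n = len(dist)
    return [
        [0.0 if i == j else scale / (scale + dist[i][j]) for j in range(n)]
        for i in range(n)
    ]

# Three toy residues; the coordinates are made up for illustration.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
dist = distance_matrix(coords)
adj = adjacency_from_distances(dist)
```

Because the graph is complete, the adjacency matrix is dense, and the weighting function alone decides how strongly distant residues influence each other.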

Fig 2.

Architecture of GNN-based encoder.

The BiLSTM module extracts low-level node features from the primary structures of proteins. The graph convolutional module extracts high-level node features based on the adjacency matrices. The readout module transforms node features into descriptors through a global max pooling layer. Each residual block (ResBlock) in the graph convolutional module consists of two graph convolutional (GC) layers.
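
The residual block and the readout described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the encoder from the figure: the layer form ReLU(A·H·W), the placement of the skip connection, and all function names are assumptions, and the BiLSTM stage is omitted.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def gc_layer(adj, h, w):
    """One graph convolution in an assumed form: ReLU(A @ H @ W)."""
    return relu(matmul(matmul(adj, h), w))

def res_block(adj, h, w1, w2):
    """Two GC layers plus a skip connection; w1 and w2 must keep the feature width."""
    out = gc_layer(adj, gc_layer(adj, h, w1), w2)
    return [[a + b for a, b in zip(r_in, r_out)] for r_in, r_out in zip(h, out)]

def global_max_pool(h):
    """Readout: column-wise max over all nodes gives a fixed-length descriptor."""
    return [max(col) for col in zip(*h)]

# Toy graph with two nodes and two features per node.
adj = [[0.0, 1.0], [1.0, 0.0]]
h = [[1.0, 2.0], [3.0, -1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

h_out = res_block(adj, h, identity, identity)
descriptor = global_max_pool(h_out)
```

Max pooling over nodes makes the descriptor length independent of the number of residues, which is what allows proteins of different sizes to be compared directly.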

Fig 3.

The contrastive learning framework for protein structure representation learning.

At each iteration, raw features Xq and Xk are extracted from the query protein structure and the key protein structure, respectively. The descriptors yq and yk are then produced by the query GNN encoder and the key GNN encoder, respectively. The value of the loss function guides the optimization of the query-encoder parameters θq, while the key-encoder parameters θk are updated based on θq. At the end of the current iteration, yk is enqueued as a negative sample for the next iteration.
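
One iteration of such a framework might look like the sketch below. The caption only says that θk is updated based on θq; the momentum rule, the InfoNCE-style loss, the coefficient `m`, the temperature `tau`, and the toy vectors are all assumptions borrowed from common queue-based contrastive setups, not details confirmed by the source.

```python
import math
from collections import deque

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce(y_q, y_k, negatives, tau=0.07):
    """InfoNCE-style loss (assumed): y_k is the positive, queued descriptors are negatives."""
    logits = [dot(y_q, y_k) / tau] + [dot(y_q, n) / tau for n in negatives]
    top = max(logits)  # subtract the max for numerical stability
    log_sum = top + math.log(sum(math.exp(v - top) for v in logits))
    return log_sum - logits[0]

def momentum_update(theta_k, theta_q, m=0.999):
    """Assumed update rule: theta_k moves slowly toward theta_q."""
    return [m * k + (1.0 - m) * q for k, q in zip(theta_k, theta_q)]

# Fixed-size queue of earlier key descriptors that serve as negatives.
queue = deque([[0.0, 1.0], [-1.0, 0.0]], maxlen=2)
theta_q, theta_k = [0.2, -0.4], [0.0, 0.0]

# Toy descriptors standing in for the encoder outputs y_q and y_k.
y_q, y_k = [1.0, 0.0], [0.8, 0.6]

loss = info_nce(y_q, y_k, list(queue))
# A real training loop would take a gradient step on theta_q here.
theta_k = momentum_update(theta_k, theta_q, m=0.5)
queue.append(y_k)  # the oldest negative is dropped automatically
```

The `maxlen` argument of `deque` keeps the pool of negatives at a constant size, so the newest key descriptor replaces the oldest one on every iteration.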

Table 1.

Ablation studies of length-scaling cosine distance, the dynamic training data partition strategy and the GNN-based encoder on SCOPe v2.07 and ind_PDB.

Table 2.

Ranking performance of GraSR and other baseline methods.

Fig 4.

Correlation between distance derived from the representations learned by GraSR/DeepFold and TM-score on (A) SCOPe v2.07 and (B) ind_PDB.

The Pearson correlation coefficient (PCC) is calculated for quantitative assessment.
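
For reference, the PCC between two paired samples can be computed as in the sketch below. The function name and the numbers are illustrative only and are not taken from the figure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

# Hypothetical descriptor distances and TM-scores for four structure pairs.
distances = [0.10, 0.35, 0.60, 0.90]
tm_scores = [0.92, 0.71, 0.48, 0.25]
pcc = pearson(distances, tm_scores)
```

A strongly negative value is the expected direction in this setting: a smaller distance between two descriptors should go with a higher TM-score.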

Fig 5.

F1-score of GraSR and other baseline methods on each SCOPe class.

a: All alpha proteins; b: All beta proteins; c: Alpha and beta proteins (a/b); d: Alpha and beta proteins (a+b); e: Multi-domain proteins (alpha and beta); f: Membrane and cell surface proteins and peptides; g: Small proteins.

Table 3.

Multi-class classification performance of GraSR and other methods.

Table 4.

Time cost of GraSR and other methods for protein structure retrieval from ind_PDB.

Fig 6.

Visualization of descriptors learned from GraSR and other methods by t-SNE.

a: All alpha proteins; b: All beta proteins; c: Alpha and beta proteins (a/b); d: Alpha and beta proteins (a+b); e: Multi-domain proteins (alpha and beta); f: Membrane and cell surface proteins and peptides; g: Small proteins.

Fig 7.

Protein structure superposition derived from the residue-level descriptors of GraSR.

(A) SCOPe-sid: d1v59a2 (red) and d1h6va2 (blue). (B) SCOPe-sid: d5dqpa_ (red) and d1ezwa_ (blue).
