E(3) equivariant graph neural networks for robust and accurate protein-protein interaction site prediction

Artificial intelligence-powered protein structure prediction methods have led to a paradigm shift in computational structural biology, yet contemporary approaches for predicting the interfacial residues (i.e., sites) of protein-protein interaction (PPI) still rely on experimental structures. Recent studies have demonstrated the benefits of employing graph convolution for PPI site prediction, but they ignore symmetries naturally occurring in 3-dimensional space and act only on experimental coordinates. Here we present EquiPPIS, an E(3) equivariant graph neural network approach for PPI site prediction. EquiPPIS employs symmetry-aware graph convolutions that transform equivariantly with translation, rotation, and reflection in 3D space, providing richer representations for molecular data compared to invariant convolutions. EquiPPIS substantially outperforms state-of-the-art approaches based on the same experimental input, and exhibits remarkable robustness by attaining better accuracy with predicted structural models from AlphaFold2 than what existing methods can achieve even with experimental structures. Freely available at https://github.com/Bhattacharya-Lab/EquiPPIS, EquiPPIS enables accurate PPI site prediction at scale.


Introduction
Protein-protein interactions (PPI) underpin numerous biological processes [1,2]. Despite their importance, experimental characterization of PPI remains challenging due to the costly and time-consuming nature of the experimental assays [3]. Computational methods offer a cheaper and high-throughput alternative by predicting the bound complex structures of interacting proteins from the sequences and/or the unbound structures of individual protein chains. A closely related problem, and the one addressed in this study, is the prediction of PPI sites, which are the interfacial residues of the interacting protein chains.
Accurately predicting the interface of interacting proteins and identifying the PPI sites remain challenging even after decades of research [4][5][6]. Various methods have been proposed, but with limited success. Partner-independent PPI site prediction [7][8][9][10][11][12], which involves predicting putative interaction sites based only upon the surface of an isolated protein, without any knowledge of the partner or complex, is even more challenging than partner-aware PPI site prediction [13][14][15][16][17] due to the absence of any information about the partner protein and auxiliary information on the complex interfaces. In this work, we focus on partner-independent PPI site prediction.
Predicting how proteins interact, and in particular predicting the PPI sites, has a long history [12,15,[18][19][20][21][22][23][24][25]. While initial models focused on feature engineering with machine learning [7,19,26,27], subsequent work sought to capture more complex patterns using deep learning [8,10,13,14,17]. The vast majority of existing methods rely on readily available protein sequence information, but their predictive accuracies are often quite limited [28]. Structure-based methods that integrate known structural information from the Protein Data Bank (PDB [29]) are usually more accurate. However, these approaches are limited by the paucity of experimentally solved protein structures in the PDB. In the 14th edition of the Critical Assessment of Structure Prediction (CASP14) experiment, AlphaFold2 [30] attained an unprecedented performance level, enabling highly accurate prediction of single-chain protein structural models at proteome-wide scale [31,32]. Given this recent progress, a natural question arises: can we leverage the structural information predicted by AlphaFold2 for accurate partner-independent PPI site prediction at scale?
In the recent past, representation learning on graph-structured data has become prevalent across a range of applications. In particular, graph neural networks (GNNs) have emerged as the major choice for deep graph learning [33][34][35]. GNNs are permutation-equivariant networks that operate on graph-structured data, with numerous applications ranging from dynamical systems to conformational energy estimation [36,37]. However, off-the-shelf GNNs do not take into account symmetries naturally occurring in 3-dimensional space. That is, they ignore the effects of invariance and equivariance with respect to the E(3) symmetry group, i.e., the group of rotations, reflections, and translations in 3D space. The recent E(n) equivariant graph neural networks [38] address this problem by being translation, rotation, and reflection equivariant in 3D space, and they can be scaled to higher-dimensional spaces (E(n)) while preserving permutation equivariance. SE(3) equivariant neural networks [39] are another recent class of graph-based models that can deal with absolute coordinate systems in 3D space, but SE(3) equivariant models do not commute with reflections of the input. E(3) equivariant neural networks, on the other hand, transform equivariantly with translation, rotation, and reflection, which makes them suitable for molecular data where chirality is often important, such as proteins, particularly when predicted protein structures that may contain mirror images are used as input. As such, E(3) equivariant graph neural networks offer an elegant choice for partner-independent PPI site prediction, where the input consists of the 3D structure of an isolated protein known to be involved in PPIs, but where the structure of the partner or complex is not known. Being designed from geometric first principles, symmetry-aware models such as E(3) equivariant graph neural networks are highly suitable for 3D molecular data, providing richer representations while avoiding expensive data augmentation strategies.
The contribution of the present work is the introduction of a symmetry-aware PPI site prediction method, EquiPPIS, built on E(3) equivariant graph neural networks, that yields state-of-the-art accuracy by substantially outperforming existing approaches based on the same experimental input. What is more striking is that EquiPPIS attains better accuracy with AlphaFold2-predicted structural models as input than what existing methods can achieve even with experimental input. We directly verify that the performance gains are connected to the unique E(3) equivariant architecture of EquiPPIS. The robustness and performance resilience of our method enable large-scale PPI site prediction without compromising on accuracy.

Results
Overview of the E(3) equivariant graph neural network architecture for protein-protein interaction site prediction

The first module of EquiPPIS converts the input protein monomer into an undirected graph G = (V, E), with V denoting the residues (nodes) and E denoting the interactions between nonsequential residue pairs according to their pairwise spatial proximity (edges). The spatial proximity between nonsequential residue pairs (i.e., having sequence separation greater than or equal to 6) is determined by calculating the Euclidean distances between the C α atoms of all residue pairs and then setting a cutoff distance of 14Å to obtain the interacting pairs, a cutoff determined through ablation experiments using an independent validation set. The sequence- and structure-based node and edge features (see the Methods section) are then fed into the second module together with the C α coordinates extracted from the input monomer. The second module (Fig 1B) is a deep E(3) equivariant graph neural network that conducts a series of transformations of its input through a stack of equivariant graph convolution layers (EGCL) [38], each updating the coordinate and node embeddings using the edge information and the coordinate and node embeddings from the previous layer. Finally, a sigmoidal function is applied to the node embedding of the last EGCL to predict the probability of every residue in the input monomer being a PPI site, thereby converting PPI site prediction into a graph node classification task (Fig 1C).

Experiments
For training and performance evaluation, we use a combination of three widely used and publicly available benchmark datasets: Dset_186 [7], Dset_72 [7], and Dset_164 [40], named by the number of proteins in each dataset. We follow the same train and test splits as the recent PPI site prediction method GraphPPIS [10], which combines the aforementioned three datasets and subsequently removes redundancy, resulting in 335 targets as the training set (Train_335) and 60 targets as the test set (Test_60). Details of our training procedure are provided in the Methods section. We evaluate our proposed method on a diverse series of challenging test scenarios using standard performance evaluation metrics including accuracy, precision, recall, F1-score (F1), Matthews correlation coefficient (MCC), area under the receiver operating characteristic curve (ROC-AUC), and area under the precision-recall curve (PR-AUC). First, we demonstrate that EquiPPIS improves upon state-of-the-art accuracy on the Test_60 dataset by comparing it directly against a wide variety of existing approaches including PSIVER [7] (based on Naïve Bayes), ProNA2020 [41] (based on neural networks), SCRIBER [42] (based on two layers of logistic regression), DLPred [43] (based on long short-term memory), DELPHI [9] (based on an ensemble of convolutional and recurrent neural networks), DeepPPISP [8] (based on convolutional neural networks), SPPIDER [11] (based on support vector machine, neural network, and linear discriminant analysis), MaSIF-site [44] (based on geometric deep learning), and GraphPPIS (based on deep graph convolutional networks). Among the competing methods, PSIVER, ProNA2020, SCRIBER, DLPred, and DELPHI are purely sequence-based, whereas DeepPPISP, SPPIDER, MaSIF-site, GraphPPIS, and our new method EquiPPIS additionally integrate structural information. Next, we examine the reasons for the high performance attained by EquiPPIS and verify that it is indeed connected to the equivariant nature of the model used. To broaden the applicability of our method beyond predicting PPI sites based only on experimentally solved input monomers extracted from bound complex structures, we explore a number of extensions, including using unbound experimental structures and computationally predicted structural models as input. Specifically, using a subset of 31 proteins from the Test_60 dataset with known unbound monomeric structures in the PDB as an additional unbound test set (hereafter called UBtest_31), we show that EquiPPIS exhibits remarkable robustness and performance resilience compared to the existing approaches. Furthermore, by replacing the experimental input monomers with AlphaFold2-predicted structural models for the Test_60 dataset, we demonstrate that EquiPPIS attains state-of-the-art accuracy that is better than what the top-performing competing method can achieve even with experimental structures. The superior performance of EquiPPIS even when using predicted structural models as input dramatically enhances the scalability of partner-independent PPI site prediction without compromising on accuracy. Finally, we examine the relative importance of each feature we adopted by conducting feature ablation experiments using an independent validation set consisting of 42 targets (hereafter called Validation_42) collected from the Test_315 dataset of the published work of GraphPPIS after filtering out proteins with >25% pairwise sequence identity with our test sets. We also use this validation set for hyperparameter selection.

Test set performance
We compare EquiPPIS with five sequence-based (PSIVER, ProNA2020, SCRIBER, DLPred, and DELPHI) and four structure-based (DeepPPISP, SPPIDER, MaSIF-site, and GraphPPIS) predictors on the Test_60 set. As shown in Table 1, in addition to outperforming the sequence-based methods (PR-AUC ranging from 0.190 to 0.319) by a large margin, EquiPPIS significantly improves upon state-of-the-art accuracy by outperforming the structure-based methods. Remarkably, EquiPPIS is the only method attaining an ROC-AUC of more than 0.8, which is noticeably better than the closest competing method GraphPPIS. Interestingly, the published work of GraphPPIS sets the goal of achieving an ROC-AUC of 0.8 as a motivation for future work, while acknowledging it as one of the current impediments. In summary, EquiPPIS is a leap forward for partner-independent PPI site prediction. EquiPPIS attains reasonably accurate predictive performance, with Precision, Recall, F1, and MCC of 0.595, 0.564, 0.579, and 0.53, respectively. In both cases, EquiPPIS predictions are strikingly similar to the experimentally observed PPI sites.

Analyzing the importance of equivariance
In the above experiments, EquiPPIS exhibits significantly improved performance. To gain insight into the reasons behind this high performance and to verify that it is connected to the equivariant nature of the model, we perform a series of experiments by gradually isolating the effect of the equivariant graph convolutions used in EquiPPIS. In particular, we train several baseline models and compare them head-to-head with the full-fledged version of EquiPPIS. First, we train a baseline network by turning off the coordinate updates of the equivariant graph convolution layers, thus making it an invariant network (hereafter called 'EquiPPIS invariant'). Since the full-fledged version of EquiPPIS employs attention operations on the aggregated embedding as part of the equivariant message passing, we train another baseline network where the attention operation is turned off during equivariant message passing, resulting in an equivariant network but without attention (hereafter called 'EquiPPIS w/o attention'). Additionally, we train two off-the-shelf GNNs for PPI site prediction: a graph convolution network (GCN) [35] and a graph attention network (GAT) [45]. All baseline networks are trained on the same Train_335 dataset using the same set of input features and hyperparameters as the full-fledged version of EquiPPIS (see the Methods section).
In addition to prediction accuracy, robustness of the model is another key aspect to consider. While experimentally solved bound complex structures are used during EquiPPIS training, protein-protein binding often leads to conformational changes through an "induced fit" mechanism (binding first) or "conformational selection" (conformational change first) [46]. To evaluate the robustness of EquiPPIS and the effect of conformational changes, we examine the impact on accuracy when unbound structures are used during prediction instead of their bound states, for EquiPPIS as well as the other structure-based PPI site predictors (DeepPPISP, SPPIDER, MaSIF-site, and GraphPPIS), using the unbound test set (UBtest_31) of 31 proteins. As shown in Fig 3E and 3F, EquiPPIS outperforms all other methods by a large margin, while having the least impact on accuracy when unbound structures are used during prediction. For example, the closest competing methods MaSIF-site and GraphPPIS suffer significant accuracy drops both in terms of MCC (35% and 14.6% drops, respectively) and PR-AUC (24.7% and 18.2% drops, respectively), whereas EquiPPIS experiences only 4.6% and 5.5% drops in MCC and PR-AUC, respectively. What is most striking is that the accuracy gap between EquiPPIS and the competing methods is so large that EquiPPIS using unbound structures attains much better accuracy than the competing methods using bound structures. That is, EquiPPIS exhibits remarkable robustness and performance resilience compared to existing approaches.

Beyond experimental input: state-of-the-art performance with AlphaFold2
EquiPPIS achieves state-of-the-art accuracy with experimental structures as input in both bound and unbound states. A natural question to ask is whether we can achieve similar predictive accuracy when computationally predicted structural models are used as input instead of experimental structures. Given the exceptional performance of AlphaFold2 in the CASP14 experiment and the open availability of the AlphaFold2 protocol, it is now possible to predict single-chain protein structural models from the amino acid sequence with a high degree of accuracy. In principle, a robust method such as EquiPPIS should be able to generalize when predicted structural models are used, without a significant drop in accuracy, thereby broadening its applicability beyond experimental input. Motivated by this prospect, we examine the impact on accuracy of replacing the experimental input with AlphaFold2-predicted structural models for the Test_60 dataset. Table 2 shows the performance of EquiPPIS compared to the closest competing method GraphPPIS. Remarkably, EquiPPIS using AlphaFold2-predicted structural models attains better accuracy (PR-AUC = 0.451) than GraphPPIS using experimental structures (PR-AUC = 0.429), let alone GraphPPIS using predicted structural models (PR-AUC = 0.399). While there is a performance decline for both methods when switching from experimental input to prediction, the performance drop for EquiPPIS is lower (ΔPR-AUC = 0.016) than that of GraphPPIS (ΔPR-AUC = 0.03). These results demonstrate the generalizability of EquiPPIS, thus opening the possibility of large-scale PPI site prediction by utilizing high-throughput computational prediction without compromising on accuracy.

Significance test
To investigate whether our performance improvement is significant or due to chance, we perform significance tests following the procedures of previous studies [47][48][49]. Specifically, we randomly sample 70% of the test set (Test_60) and calculate the F1, MCC, ROC-AUC, and PR-AUC scores of EquiPPIS and the closest competing method GraphPPIS with both experimental and AlphaFold2-predicted structures, where the same structures are used as inputs to both prediction methods. We repeat this 10 times to obtain 10 pairs of scores. If the measurements are normally distributed, as determined by the Anderson-Darling test [50], a paired t-test is used to calculate the significance of the difference. If the measurements are not normal, we use the Wilcoxon rank sum test [51]. As reported in Table 3, EquiPPIS is statistically significantly better than GraphPPIS on both experimental and predicted structures at the 95% confidence level, with p-values < 0.05 for all four metrics.
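The paired t-test arm of this procedure can be sketched as follows. This is an illustrative sketch, not the authors' code: the score values below are placeholders (not results from the paper), and the hard-coded critical value assumes a two-sided test at the 5% level with df = 9 (10 resamples).

```python
import numpy as np

def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic over matched resampled test-set scores."""
    d = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    n = len(d)
    return d.mean() / (d.std(ddof=1) / np.sqrt(n))

# 10 paired scores (one per 70% resample of Test_60); values are illustrative only
a = np.array([0.45, 0.44, 0.46, 0.45, 0.47, 0.44, 0.45, 0.46, 0.44, 0.45])
b = np.array([0.42, 0.41, 0.43, 0.42, 0.43, 0.41, 0.42, 0.43, 0.41, 0.42])

t = paired_t_statistic(a, b)
T_CRIT = 2.262  # two-sided 5% critical value of Student's t for df = 9
significant = abs(t) > T_CRIT
```

In practice one would first check normality of the paired differences (e.g., with an Anderson-Darling test) and fall back to a rank-based test otherwise, as described above.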

Impact of secondary structure content on prediction accuracy
To examine the impact of physical characteristics of the input structures on the prediction accuracy of EquiPPIS, we analyze the secondary structure content of the proteins for the

Running time analysis
We analyze the running time of EquiPPIS and the closest competing method GraphPPIS for all targets in the Test_60 set on the same Linux machine with an identical hardware environment.

Ablation studies and choice of hyperparameters
To examine the relative importance of the features adopted in EquiPPIS, we conduct feature ablation experiments on the independent Validation_42 set. The PSSM and protein language model-based ESM2 features contribute more than the residue type features alone, and we notice a significant performance drop when all three sequence-based features are removed. We also use the Validation_42 set to select the hyperparameters. Based on the results of the grid search shown in Fig 5B to 5G, we select a 10-layer EGCL framework with 256 hidden units, and the cutoff distance used to obtain the interacting residue pairs is set to 14Å. The hyperparameters selected on the independent validation set are used during training and testing.

Discussion
This work introduces EquiPPIS, a symmetry-aware deep graph learning model for protein-protein interaction site prediction based on E(3) equivariant graph neural networks. We demonstrate that EquiPPIS outperforms existing methods and, despite being trained on experimental structures, generalizes extremely well to predicted structural models from AlphaFold2, to the extent that EquiPPIS attains better accuracy with predicted structural models than what existing approaches can achieve even with experimental structures. Through controlled experiments, we verify the importance of equivariance as one of the major driving forces behind the improved performance. In addition to questions around the effect of equivariance on accuracy, our ablation study on an independent validation set confirms the contribution of the various features adopted in EquiPPIS. Our study leads to a series of interesting questions to consider. Of particular interest is the possibility of broadening the applicability of our method beyond experimental input for large-scale PPI site prediction with high accuracy by utilizing rapid computational prediction. In this regard, considering the diversity of the predictive modeling ensemble and accounting for the conformational states of interacting proteins with multi-state conformational dynamics may help broaden the horizon of computational PPI site prediction. Further, a promising direction for future work is to investigate the potential benefits of explicitly including multiple sequence alignment (MSA) information and to measure the extent to which it may influence the accuracy. While an MSA-free method such as EquiPPIS offers some unique advantages, being broadly applicable even to proteins that lack homologous sequences in current sequence databases and bypassing the computational overhead of MSA searching, MSA may still provide a rich source of additional information for further improving the accuracy of PPI site prediction that might be worth exploring. Finally, while we find that EquiPPIS exhibits excellent predictive accuracy and remarkable robustness, an open challenge that remains is the interpretability of our deep learning model. The evolutionary and functional significance of the residues predicted to be PPI sites by means of the latent representation underlying the neural architecture of EquiPPIS still needs to be systematically explored. We expect our proposed method can be easily extended to other biomolecular interaction site prediction tasks, including predicting protein binding sites for other molecules such as DNA, RNA, and small ligands, as well as predicting gene-gene interactions and gene networks, with improved accuracy and robustness.

Materials and Methods
Graph representation and featurization.

Graph representation
We represent the input protein monomer as a graph G = (V, E), in which a node v ∈ V represents a residue and an edge e ∈ E represents an interacting residue pair. We consider two residues to be interacting if their C α Euclidean distance is no more than 14Å. The 14Å cutoff is chosen using an independent validation set, as presented in Fig 5. To focus on longer-range interactions, we only consider interacting residue pairs having a minimum sequence separation of 6.
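The graph construction described above can be sketched as follows. This is a minimal illustrative implementation (the function name and the brute-force pairwise loop are ours, not from the EquiPPIS codebase):

```python
import numpy as np

def build_graph_edges(ca_coords, cutoff=14.0, min_seq_sep=6):
    """Build an undirected edge list: residue pairs whose C-alpha atoms are
    within `cutoff` angstroms and at least `min_seq_sep` apart in sequence."""
    coords = np.asarray(ca_coords, dtype=float)
    n = len(coords)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if j - i >= min_seq_sep and np.linalg.norm(coords[i] - coords[j]) <= cutoff:
                edges.append((i, j))
    return edges

# toy example: 10 residues spaced 2 angstroms apart along the x-axis
coords = [[2.0 * i, 0.0, 0.0] for i in range(10)]
edges = build_graph_edges(coords)
```

For real proteins a vectorized distance matrix (or a KD-tree) would replace the double loop, but the selection criterion is the same.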

Node features
We use three types of sequence-based node features: (1) one-hot encoding of the residue (i.e., a binary vector of 20 entries indicating its amino acid type), resulting in an L×20 feature set, where L is the length of the input monomer; (2) a position-specific scoring matrix (PSSM) obtained by running PSI-BLAST [52], yielding an L×20 feature set from the first 20 columns of the PSSM after normalizing the values with a sigmoidal function; and (3) features from ESM2 [53], a recent protein language model with 15 billion parameters, leading to an L×33 feature set after normalizing the values using a sigmoidal function.
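The one-hot and sigmoid-normalization steps can be sketched as follows; the amino acid ordering below is one possible convention (an assumption, not necessarily the one used by EquiPPIS):

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed alphabetical one-letter ordering

def one_hot_sequence(seq):
    """L x 20 one-hot encoding of amino acid types."""
    idx = {aa: k for k, aa in enumerate(AMINO_ACIDS)}
    feats = np.zeros((len(seq), 20))
    for i, aa in enumerate(seq):
        feats[i, idx[aa]] = 1.0
    return feats

def sigmoid_normalize(scores):
    """Squash raw log-odds scores (e.g., PSSM columns) into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(scores, dtype=float)))

oh = one_hot_sequence("ACD")        # 3 x 20
p = sigmoid_normalize([[0.0] * 20])  # a zero score maps to 0.5
```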
Additionally, we extract a total of L×45 structure-based node features from the structure of the input monomer, either by calculating various structural properties directly from the 3D coordinates or by running the DSSP [54] program. We describe them below.
Secondary structure and relative solvent accessibility. We use one-hot encodings of both 3-state and 8-state secondary structures (SS), leading to L×3 and L×8 feature sets, respectively. Additionally, we use a one-hot encoding of 2-state relative solvent accessibility (RSA) with an RSA cutoff of 50 (L×2 feature set), and a finer-grained RSA binning discretized into 8 bins as 0-30, 30-60, 60-90, 90-120, 120-150, 150-180, 180-210, and >210, represented by one-hot encoding (L×8 feature set). The rationale for using multiple discretizations of SS (e.g., 3-state and 8-state) and RSA (e.g., 2-state and 8-state) is to combine multi-level information, and ablation studies guiding these decisions are presented in Fig 5.
Local geometry. We use a total of L×11 features derived from the local geometry by calculating various planar and torsion angles of the polypeptide chain, including (1) the cosine angle between the C=O bonds of consecutive residues; (2) the sine and cosine of the virtual bond and torsion angles formed between consecutive C α atoms; and (3) normalized values of the backbone torsion angles.
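The two-level RSA encoding described above can be sketched as follows. This is an illustrative sketch; the handling of values falling exactly on a bin boundary is our assumption (half-open bins), as the paper does not specify it:

```python
import numpy as np

def rsa_features(rsa_values):
    """2-state (cutoff 50) plus 8-bin one-hot encoding of RSA values."""
    rsa = np.asarray(rsa_values, dtype=float)
    L = len(rsa)
    # 2-state: below vs. at-or-above the cutoff of 50
    two_state = np.zeros((L, 2))
    two_state[np.arange(L), (rsa >= 50).astype(int)] = 1.0
    # 8 bins: 0-30, 30-60, ..., 180-210, >210 (half-open, an assumption)
    edges = [30, 60, 90, 120, 150, 180, 210]
    bins = np.digitize(rsa, edges)  # bin index in 0..7
    eight_bin = np.zeros((L, 8))
    eight_bin[np.arange(L), bins] = 1.0
    return np.concatenate([two_state, eight_bin], axis=1)  # L x 10

f = rsa_features([10, 55, 215])
```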
Relative residue positioning. To capture the relative positional information of each residue, we extract two types of features: (1) for the i-th residue, we use the inverse of i to capture the relative sequence position (L×1 feature set); and (2) we use the inverse of the Euclidean distance from the centroid of the input protein monomer to the C α atom of the i-th residue to capture the spatial positioning of a residue relative to the overall structure (L×1 feature set).
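These two positional features can be sketched as follows; the guard against a zero centroid distance is our addition, not stated in the paper:

```python
import numpy as np

def relative_position_features(ca_coords):
    """Inverse 1-based sequence index and inverse distance to the centroid."""
    coords = np.asarray(ca_coords, dtype=float)
    L = len(coords)
    inv_index = 1.0 / np.arange(1, L + 1)           # 1/i for the i-th residue
    centroid = coords.mean(axis=0)
    dist = np.linalg.norm(coords - centroid, axis=1)
    inv_dist = 1.0 / np.maximum(dist, 1e-8)         # guard against zero distance
    return np.stack([inv_index, inv_dist], axis=1)  # L x 2

f = relative_position_features([[0, 0, 0], [4, 0, 0]])
```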
Residue orientation. To define the orientation of each amino acid residue, we adopted features from a recent work [55].
Residue virtual surface area. An amino acid residue can be perceived as a virtual convex hull constructed by its atoms. We calculate the virtual surface area of the convex hull and use its inverse as a feature (L×1 feature set).
Contact count. If the Euclidean distance between the C β atoms of a residue pair is within a cutoff of 8Å, the two residues are considered to be in contact. We calculate the contact count of a given residue as the number of its spatial neighbors (i.e., residues in contact with it) and use the normalized contact count per residue as a feature (L×1 feature set).
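The contact-count feature can be sketched as follows. The paper does not specify the normalization, so dividing by the maximum count is our assumption for illustration:

```python
import numpy as np

def contact_count_feature(cb_coords, cutoff=8.0):
    """Per-residue C-beta contact count within `cutoff` angstroms,
    normalized by the maximum count (normalization choice is an assumption)."""
    coords = np.asarray(cb_coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    contacts = (dist <= cutoff) & (dist > 0.0)  # exclude self-contacts
    counts = contacts.sum(axis=1).astype(float)
    return counts / max(counts.max(), 1.0)

f = contact_count_feature([[0, 0, 0], [5, 0, 0], [100, 0, 0]])
```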
The sequence-based amino acid (L×20), PSSM (L×20), and ESM2 (L×33) features are concatenated with the structure-based node features (L×45), leading to a total of L×118 features, which serve as the input to our E(3) equivariant graph neural network model (S2 Fig).

Edge features
As the edge feature for the graph G = (V, E), we calculate the ratio of the logarithmic sequence separation of two residues (i.e., the logarithm of the absolute difference between the two residue indices) corresponding to two nodes in the graph and the Euclidean distance between them, defined as

e_{ij} = \frac{\log(|i - j|)}{\|x_i - x_j\|},

where i and j are the residue indices and x_i and x_j are the corresponding C α coordinates. Here, the numerator captures how the two residues are separated in the primary sequence while the denominator captures their spatial interaction.
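A direct sketch of this edge feature (the function name is ours):

```python
import numpy as np

def edge_feature(i, j, ca_coords):
    """log(sequence separation) divided by the C-alpha Euclidean distance."""
    coords = np.asarray(ca_coords, dtype=float)
    seq_sep = abs(i - j)
    dist = np.linalg.norm(coords[i] - coords[j])
    return np.log(seq_sep) / dist

# residues 0 and 6, whose C-alpha atoms are 10 angstroms apart
coords = [[0, 0, 0]] * 6 + [[10, 0, 0]]
v = edge_feature(0, 6, coords)
```

Note that the minimum sequence separation of 6 guarantees the numerator is positive and well defined for every edge in the graph.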

Network architecture
We formulate PPI site prediction as a graph node classification task and predict the probability of every residue in the input monomer being a PPI site using a deep E(3) equivariant graph neural network. The network architecture consists of a stack of equivariant graph convolution layers (EGCL) [38], performing a series of transformations of the input by updating the coordinate and node embeddings using the edge information and the coordinate and node embeddings from the previous layer. The EGCL operation attains equivariance primarily by changing the standard message passing m_{ij} = \phi_e(h_i^l, h_j^l, a_{ij}) [56] to equivariant message passing and by introducing coordinate updates in the graph neural network, as follows:

m_{ij} = \phi_e\left(h_i^l, h_j^l, \|x_i^l - x_j^l\|^2, a_{ij}\right),

x_i^{l+1} = x_i^l + C \sum_{j \neq i} \left(x_i^l - x_j^l\right) \phi_x(m_{ij}),

where a_{ij} denotes edge features; h_i^l and h_j^l are the node embeddings at layer l for nodes i and j, respectively; and \phi_e, \phi_h, and \phi_x are multilayer perceptrons (MLPs) for the edge, node, and coordinate operations, respectively. Equivariant message passing for an edge (i, j) is attained by including the squared distance between nodes i and j, \|x_i^l - x_j^l\|^2, in the edge operation. The coordinate update for node i is obtained as the weighted sum of the coordinate embedding differences from the previous layer, normalized by a factor C = 1/(M−1), where M is the number of nodes in the graph. The weights for the sum are generated by the coordinate-operation MLP \phi_x applied to the equivariant message m_{ij} for each edge (i, j).
Unlike off-the-shelf graph neural networks that aggregate messages only from neighboring nodes, equivariant graph neural networks aggregate messages from the whole graph:

m_i = \sum_{j \neq i} m_{ij}.

Additionally, an attention embedding m_a can be employed through an attention operation \phi_a, which is a linear transformation on the aggregated message embedding m_i followed by a sigmoidal non-linear transformation. The 'attended' aggregated message embedding is subsequently obtained through a scalar multiplication with the attention embedding. The node embedding is updated by applying an MLP \phi_h to the aggregated message and the node embedding of the previous layer, as follows:

h_i^{l+1} = \phi_h\left(h_i^l, m_i\right).

Finally, a linear transformation \phi_0 is applied to squeeze the hidden dimension of the last EGCL embedding h_i^L, followed by a sigmoidal function to obtain the node-level classification p_i for PPI site prediction:

p_i = \sigma\left(\phi_0(h_i^L)\right).

The network architecture of EquiPPIS consists of 10 layers of EGCL with a hidden dimension of 256, where the hyperparameters are chosen on an independent validation set, and empirical results guiding these decisions are presented in Fig 5. EquiPPIS is implemented in PyTorch 1.12.0 [57] and the Deep Graph Library (DGL) 0.9.0 [58]. We use a binary cross-entropy loss between the node-level predictions and the ground truth, a cosine annealing scheduler from SGDR [59], and the ADAM optimizer [60] with a learning rate of 1e-4 and weight decay of 1e-16. Training runs for at most 50 epochs on an NVIDIA A40 GPU. In addition to the full-fledged version of EquiPPIS, we train several baseline models on the same Train_335 dataset using the same set of input features and hyperparameters, including an off-the-shelf graph convolution network (GCN) [35] and graph attention network (GAT) [45], both implemented using DGL [58], as well as two variants of EquiPPIS: (1) 'EquiPPIS invariant', an invariant network with the coordinate updates of the equivariant graph convolution layers turned off; and (2) 'EquiPPIS w/o attention', an equivariant network with the attention operation turned off during equivariant message passing.
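The equivariance property of a single EGCL can be illustrated with a toy numpy sketch. This is not the EquiPPIS implementation: the MLPs \phi_e, \phi_x, and \phi_h are replaced by single random linear maps with tanh, and attention is omitted. Because messages depend on coordinates only through squared distances, node embeddings are E(3)-invariant, while the coordinate update is E(3)-equivariant:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # toy hidden dimension

# random fixed weights standing in for the MLPs phi_e, phi_x, phi_h
W_e = rng.standard_normal((2 * D + 1, D)) * 0.1
W_x = rng.standard_normal((D, 1)) * 0.1
W_h = rng.standard_normal((2 * D, D)) * 0.1

def egcl(h, x):
    """One simplified equivariant graph convolution layer (after [38])."""
    n = len(h)
    m = np.zeros((n, D))
    x_new = x.copy()
    C = 1.0 / (n - 1)  # coordinate-update normalization
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d2 = np.sum((x[i] - x[j]) ** 2)                           # invariant input
            m_ij = np.tanh(np.concatenate([h[i], h[j], [d2]]) @ W_e)  # phi_e
            x_new[i] += C * (x[i] - x[j]) * (m_ij @ W_x)              # phi_x update
            m[i] += m_ij                                              # whole-graph aggregation
    h_new = np.tanh(np.concatenate([h, m], axis=1) @ W_h)             # phi_h
    return h_new, x_new

# check: rotating or translating the input coordinates transforms the output
# coordinates identically, and leaves the node embeddings unchanged
h0 = rng.standard_normal((5, D))
x0 = rng.standard_normal((5, 3))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, -2.0, 3.0])
h1, x1 = egcl(h0, x0)
h2, x2 = egcl(h0, x0 @ R.T)  # rotated input
h3, x3 = egcl(h0, x0 + t)    # translated input
```

Turning off the `x_new` update (as in the 'EquiPPIS invariant' baseline) collapses this layer to an invariant message-passing network.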

Datasets, benchmarking, and performance evaluation
We use a combination of three widely used and publicly available benchmark datasets: Dset_186 [7], Dset_72 [7], and Dset_164 [40]. Dset_72 is created based on the protein-protein benchmark version 3.0 [61]; Dset_186 is constructed through a six-step filtering process that excludes structures containing over 30% missing residues, identical UniProtKB/Swiss-Prot accessions, interface polarity and buried surface accessibility below specific thresholds, as well as oligomeric, transmembrane, and redundant protein structures, with 186 targets collected from known protein complexes; and Dset_164 consists of 164 targets obtained from known heterodimers. While Dset_186, Dset_72, and Dset_164 are each nonredundant individually, the combined dataset is further reduced to 395 targets by filtering out redundant chains among the datasets. Following the same train-test split as GraphPPIS [10], we use a training set (Train_335) having 10,374 interacting and 55,992 non-interacting residues, and a test set (Test_60) having 2,075 interacting and 11,069 non-interacting residues. In the Train_335 set, the average protein length is ~198 residues, ranging from 44 to 869 residues. In the Test_60 set, the average protein length is ~219 residues, ranging from 52 to 766 residues, with no homodimeric protein-protein interaction present within this set. To assess the robustness of EquiPPIS and examine the effect of conformational changes on its performance, we analyze a subset of 31 proteins from the Test_60 set with known unbound monomeric conformations. This additional unbound test set (UBtest_31), having 841 interacting and 5,813 non-interacting residues, is adopted from the published work on GraphPPIS. Additionally, we adopt a dataset named Test_315 from the published work on GraphPPIS, consisting of newly solved protein complexes that are non-redundant with the training set. We filter out 42 targets from the Test_315 set by discarding protein chains having more than 25% pairwise sequence identity with our test sets and create an independent validation set (Validation_42) to perform feature ablation and hyperparameter selection.
During prediction, we use both experimentally solved structures and AlphaFold2-predicted structural models as input. We run AlphaFold2 with default parameter settings by locally installing the officially released version [30] to generate five predicted structural models, and then select the model with the highest pLDDT confidence score. For target 4cdgA, which failed during the MSA generation stage of the AlphaFold2 pipeline, we run ColabFold [62], which uses MMseqs2 [63] for MSA generation and subsequently employs the AlphaFold2 protocol for structure prediction.
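As a hedged illustration of this model-selection step, the sketch below averages the B-factor column over Cα atoms of each predicted model (AlphaFold2 stores per-residue pLDDT in the B-factor field of its output PDB files) and keeps the highest-scoring model. The parsing is a minimal sketch under that assumption, not the AlphaFold2 pipeline itself, and the function names are illustrative.

```python
from pathlib import Path


def mean_plddt(pdb_path):
    """Average the B-factor (per-residue pLDDT in AlphaFold2 output)
    over the C-alpha atoms of a PDB file."""
    scores = []
    for line in Path(pdb_path).read_text().splitlines():
        # PDB fixed columns: atom name in cols 13-16, B-factor in cols 61-66
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return sum(scores) / len(scores)


def select_best_model(model_paths):
    """Pick the predicted model with the highest mean pLDDT."""
    return max(model_paths, key=mean_plddt)
```

In practice the same ranking can also be read from AlphaFold2's `ranking_debug.json`, but averaging the B-factor column works on the PDB files alone.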
EquiPPIS is compared against both sequence-based (PSIVER [7], ProNA2020 [41], SCRIBER [42], DLPred [43], and DELPHI [9]) and structure-aware (DeepPPISP [8], SPPIDER [11], MaSIF-site [44], and GraphPPIS [10]) PPI site prediction methods. PSIVER employs a Naïve Bayes classifier along with kernel density estimation by utilizing sequence-based features. ProNA2020 combines homology modeling with a neural network for residue-level PPI site prediction. SCRIBER employs two layers of logistic regression, where the first layer utilizes sequence-based features while the second layer combines the output from the first layer for the final prediction. DLPred employs a simplified long short-term memory model for PPI site prediction. DELPHI uses an ensemble of convolutional and recurrent neural network architectures with a large feature set. DeepPPISP combines local contextual features with global features and employs convolutional neural networks to predict PPI sites. SPPIDER leverages support vector machines, neural networks, and linear discriminant analysis with an extensive feature search and extraction process. MaSIF-site predicts PPI sites by learning protein structural fingerprints through geometric deep learning. GraphPPIS employs structure-aware deep residual neural networks for PPI site prediction.
For benchmarking and performance assessment, we use standard performance evaluation metrics including accuracy, precision, recall, F1-score (F1), and Matthews correlation coefficient (MCC), defined as:

$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$

$\mathrm{Precision} = \frac{TP}{TP + FP}$

$\mathrm{Recall} = \frac{TP}{TP + FN}$

$\mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$

$\mathrm{MCC} = \frac{TP \times TN - FN \times FP}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$

where TP denotes the number of true PPI site residues that are correctly predicted, FP denotes the number of non-PPI site residues that are incorrectly predicted to be in PPI sites, TN denotes the number of non-PPI site residues that are correctly predicted, and FN denotes the number of PPI site residues that are incorrectly predicted as non-PPI sites. We additionally use the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision-recall curve (PR-AUC) for performance evaluation.
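These metrics follow directly from the confusion-matrix counts; the minimal Python sketch below mirrors the definitions above.

```python
import math


def precision(tp, fp):
    """Fraction of predicted PPI sites that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of true PPI sites that are recovered."""
    return tp / (tp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient over all four counts."""
    num = tp * tn - fn * fp
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den
```

Unlike F1, MCC accounts for true negatives as well, which matters here because non-interacting residues outnumber interacting ones by roughly five to one in these datasets.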

Fig 1. Illustration of the EquiPPIS method for protein-protein interaction site prediction. (a) The input protein monomer is converted into an undirected graph. (b) Equivariant graph convolutions are then employed on the input graph. (c) PPI sites are finally predicted as a graph node classification task. https://doi.org/10.1371/journal.pcbi.1011435.g001
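Step (a) of Fig 1 can be sketched as follows: residues become graph nodes, and an undirected edge connects any pair whose Cα atoms lie within a distance cutoff. The 10 Å cutoff here is an illustrative assumption, not necessarily the value used by EquiPPIS.

```python
import numpy as np


def build_residue_graph(ca_coords, cutoff=10.0):
    """Connect residues whose C-alpha atoms are within `cutoff` angstroms.

    ca_coords: (L, 3) array of C-alpha coordinates.
    Returns a set of undirected edges (i, j) with i < j.
    """
    coords = np.asarray(ca_coords, dtype=float)
    # Pairwise Euclidean distances via broadcasting: (L, L) matrix
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Keep only the upper triangle so each edge appears once
    upper = np.triu(np.ones_like(dist, dtype=bool), k=1)
    i, j = np.where((dist < cutoff) & upper)
    return set(zip(i.tolist(), j.tolist()))
```

Node features (sequence- and structure-based, as described later) and edge features would then be attached to this graph before the equivariant convolutions of step (b).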

Fig 2 presents two representative examples from the Test_60 dataset comparing the PPI site predictions of EquiPPIS and GraphPPIS. For the first example, a sugar binding protein of Trichosanthes kirilowii (PDB ID: 1GGP, chain A) of length 234 (Fig 2A), EquiPPIS correctly predicts the majority of the observed PPI sites, attaining Precision, Recall, F1, and MCC of 0.8, 0.545, 0.649, and 0.601, respectively; whereas GraphPPIS fails to predict any correct PPI sites, with Precision, Recall, F1, and MCC of 0, 0, 0, and -0.185, respectively. The second example is a hydrolase inhibitor of Triticum aestivum in complex with Bacillus subtilis (PDB ID: 2B42, chain A) of length 364 (Fig 2B), where GraphPPIS predicts many false positive PPI sites, resulting in low Precision, Recall, F1, and MCC of 0.105, 0.231, 0.144, and -0.004, respectively.
Fig 3A to 3D show the performance of EquiPPIS compared to the baseline networks on the Test_60 set. The results demonstrate that the full-fledged version of EquiPPIS outperforms all baseline models. For example, the full-fledged version of EquiPPIS attains an ROC-AUC of more than 0.8, the best accuracy among all models. The 'EquiPPIS invariant' baseline, however, falls short of achieving an ROC-AUC of 0.8, suggesting that it is the equivariant nature of EquiPPIS that is responsible for the accuracy gain. Turning off the attention operation as done

Fig 4 presents a target-wise running time comparison between EquiPPIS and GraphPPIS. Not surprisingly, being free from time-consuming MSA searching, EquiPPIS demonstrates noticeably lower running time than GraphPPIS. Overall, EquiPPIS is considerably more efficient in terms of running time.
We conduct feature ablation experiments by gradually isolating the contribution of individual features or groups of features during model training and evaluating the accuracy on the independent validation set Validation_42. Fig 5A shows the accuracy decline, measured in terms of ΔPR-AUC and ΔROC-AUC, when various features are isolated from the full-fledged version of EquiPPIS. The results demonstrate that all features contribute to the overall accuracy achieved by EquiPPIS. For example, we notice an accuracy decline when we isolate the sequence-based features one by one, including amino acid residue type (No AA), position-specific scoring matrix (No PSSM), and the protein language model ESM2 (No ESM2). Not surprisingly, the evolutionary feature

Fig 4. The running time of EquiPPIS and GraphPPIS on the Test_60 set. For each target, input protein length versus runtime (in seconds) of EquiPPIS (blue) and GraphPPIS (red) is shown. https://doi.org/10.1371/journal.pcbi.1011435.g004

The local geometry features include: (1) the forward and reverse unit vectors in the directions of Cα(i+1) − Cα(i) and Cα(i−1) − Cα(i), respectively (L×6 feature set); and (2) the unit vector in the imputed direction of Cβ(i) − Cα(i), computed by assuming tetrahedral geometry and normalization (L×3 feature set).
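These local geometry features can be sketched as below: the forward/reverse unit vectors come directly from consecutive Cα positions, while the Cβ direction uses the standard virtual-Cβ construction from backbone N, Cα, and C atoms (the constants are the widely used ideal-geometry coefficients, an assumption rather than EquiPPIS's exact implementation).

```python
import numpy as np


def unit(v):
    """Normalize vectors along the last axis, guarding against zero norm."""
    n = np.linalg.norm(v, axis=-1, keepdims=True)
    return v / np.clip(n, 1e-8, None)


def ca_direction_features(ca):
    """ca: (L, 3) C-alpha coordinates.
    Returns (L, 6): forward and reverse unit vectors along the C-alpha
    trace, zero-padded at the chain termini."""
    ca = np.asarray(ca, dtype=float)
    L = ca.shape[0]
    fwd = np.zeros((L, 3))
    rev = np.zeros((L, 3))
    fwd[:-1] = unit(ca[1:] - ca[:-1])   # direction of Ca(i+1) - Ca(i)
    rev[1:] = unit(ca[:-1] - ca[1:])    # direction of Ca(i-1) - Ca(i)
    return np.concatenate([fwd, rev], axis=1)


def imputed_cb_direction(n, ca, c):
    """Unit vector from C-alpha toward an imputed C-beta, assuming ideal
    tetrahedral geometry (standard virtual C-beta construction)."""
    b = ca - n
    cvec = c - ca
    a = np.cross(b, cvec)
    cb = -0.58273431 * a + 0.56802827 * b - 0.54067466 * cvec
    return unit(cb)
```

Because these features are vectors rather than scalars, they rotate with the input structure, which is exactly the kind of geometric information the equivariant convolutions are designed to exploit.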

Table 1. PPI site prediction performance on the Test_60 dataset for various methods.
Note: Except for EquiPPIS, results for the other methods are obtained directly from the published work of GraphPPIS; values in bold represent the best performance. https://doi.org/10.1371/journal.pcbi.1011435.t001

Table 2. Performance comparison between EquiPPIS and GraphPPIS with experimental input and AlphaFold2-predicted structural models for the Test_60 dataset.
Results obtained directly from the published work of GraphPPIS; values in bold represent the best performance.
The targets in the Test_60 set are divided into three groups based on secondary structure content: (1) primarily helices (referred to as 'Primarily helix'), (2) primarily beta strands (referred to as 'Primarily beta'), and (3) a mixture of helices and beta strands (referred to as 'Mix'). We calculate the ROC-AUC scores for each group and compare them with the results obtained on the full set. S1 Fig presents a comparison of our prediction accuracy in terms of ROC-AUC scores across the groups with different physical characteristics based on secondary structure. EquiPPIS achieves a higher ROC-AUC of 0.855 in the 'Primarily helix' group compared to the full set's ROC-AUC of 0.805. In the 'Primarily beta' group, however, EquiPPIS attains a somewhat lower ROC-AUC of 0.783. For the 'Mix' group, EquiPPIS achieves an ROC-AUC of 0.799, comparable to that obtained on the full set. These results demonstrate that secondary structure content has a minor impact on the prediction accuracy, with primarily helical proteins yielding the best performance, while there is still room for improvement for proteins primarily made of beta strands.
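A hedged sketch of how targets might be grouped by secondary structure content is shown below; the 40% thresholds are illustrative assumptions, not the paper's exact criteria.

```python
def classify_ss_content(ss_string, threshold=0.4):
    """Classify a protein by its 3-state secondary structure string
    ('H' = helix, 'E' = strand, 'C' = coil) into one of three groups.

    The threshold is an illustrative assumption for this sketch.
    """
    L = len(ss_string)
    helix_frac = ss_string.count("H") / L
    beta_frac = ss_string.count("E") / L
    if helix_frac >= threshold and beta_frac < threshold:
        return "Primarily helix"
    if beta_frac >= threshold and helix_frac < threshold:
        return "Primarily beta"
    return "Mix"
```

The per-residue secondary structure string would come from a standard assignment tool such as DSSP, reduced to three states.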

Table 3. Significance test between EquiPPIS and GraphPPIS using experimental and AlphaFold2-predicted structures.
For all four metrics, the two numbers reported are the mean and variance, respectively; values in bold represent the best performance in terms of mean. https://doi.org/10.1371/journal.pcbi.1011435.t003

Similarly, we notice a consistent accuracy decline when we discard the structure-based features individually, including secondary structure (No ss), relative solvent accessibility (No rsa), local geometry (No local geom), residue orientation (No orient), contact count (No contact count), as well as relative residue positioning and residue virtual surface area (No res pos + area). Because we use multi-level discretization of secondary structure (e.g., 3-state and 8-state) and relative solvent accessibility (e.g., 2-state and 8-state), as well as backbone torsion angles, which are closely related to secondary structure, we conduct a feature ablation experiment discarding the 8-state secondary structure, 8-state relative solvent accessibility, and backbone torsion angles together. The resulting model (No multi-level) shows an accuracy decline compared to the full-fledged version of EquiPPIS, indicating the effectiveness of combining multi-granular information. Finally, we also notice an accuracy drop when we isolate the edge feature (No edge) that takes into account the contributions of sequence separation and spatial interaction.