Predicting protein complexes using a supervised learning method combined with local structural information

The existing protein complex detection methods can be broadly divided into two categories: unsupervised and supervised learning methods. Most of the unsupervised learning methods assume that protein complexes are in dense regions of protein-protein interaction (PPI) networks even though many true complexes are not dense subgraphs. Supervised learning methods utilize the informative properties of known complexes; they often extract features from existing complexes and then use the features to train a classification model. The trained model is used to guide the search process for new complexes. However, insufficient extracted features, noise in the PPI data and the incompleteness of complex data make the classification model imprecise. Consequently, the classification model is not sufficient for guiding the detection of complexes. Therefore, we propose a new robust score function that combines the classification model with local structural information. Based on the score function, we provide a search method that works both forwards and backwards. The results from experiments on six benchmark PPI datasets and three protein complex datasets show that our approach can achieve better performance compared with the state-of-the-art supervised, semi-supervised and unsupervised methods for protein complex detection, occasionally significantly outperforming such methods.


Introduction
A group of proteins that interact with one another for specific biological activities is called a protein complex [1]. Predicting protein complexes is helpful for understanding the principles of cellular tissue [2,3], predicting protein functions [4], identifying disease genes [5] and discovering drug-disease associations [6]. Modern experimental techniques have revealed a large amount of protein interactions, thereby enabling protein complexes to be predicted from protein-protein interaction (PPI) networks.
In recent years, automatic computational approaches have increasingly been proposed for detecting protein complexes from PPI networks [7]. A PPI network can be represented as an undirected graph, where the nodes denote the proteins and the edges denote the interactions PLOS ONE | https://doi.org/10.1371/journal.pone.0194124 March 19, 2018 1 / 23 a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 [8]. Existing protein complex detection approaches can be broadly grouped into two categories: unsupervised and supervised learning methods. The majority of the unsupervised methods detect protein complexes by discovering the densely connected subgraphs in the PPI network using predefined rules. The Markov clustering method (MCL) partitions the PPI network into densely connected subgraphs by simulating random walks within the graph [9]. The molecular complex detection (MCODE) method isolates the dense regions by growing the local weighted seeds [10]. The restricted neighbourhood search clustering (RNSC) method partitions networks into clusters based on a cost function, which is assigned to each partitioning [11]. The CFinder method discovers clusters by combining adjacent k-cliques identified via clique percolation [12,13]. The clustering based on maximal cliques (CMC) [2] method is also a clique-based method that detects complexes by removing or merging cliques based on their inter-connectivity. The repeated random walks (RRW) [14] method implicitly utilizes network topology, edge weights and long-range interactions by repeated random walks to identify protein complexes. The clustering with overlapping neighbourhood expansion (Clus-terONE) [15] method finds subgraphs with high cohesiveness by greedy adding or removing nodes starting from seed nodes. The majority of the unsupervised methods are simply based on the topological structure of the PPI network and do not utilize the information of the existing true complexes [7]. These methods assume that protein complexes are in dense regions of PPI networks, but in fact, many true complexes are in sparse regions. Fig 1 shows two sparse complexes from the Munich Information Center for Protein Sequences (MIPS) complex catalogue database [16]. Therefore, using topological attributes alone is not sufficient for efficiently detecting protein complexes [8].
In recent years, supervised learning methods have been developed to detect complexes by utilizing the informative properties of known complexes. These types of methods consist of three main steps: 1) extract useful features from the known complexes and denote them as vectors, 2) train a supervised classification model or score function to distinguish the true protein complexes from random subgraphs based on the extracted features, and 3) search for protein complexes from PPI networks using the trained classification model or score function as a guide. For example, SCI-BN [17] is a supervised method that trains a probabilistic Bayesian network to score the subgraphs. RM [7] trains a regression model to score the subgraphs. NN [18] is a semi-supervised method, and it trains a neural network model on the training sets and uses the trained model to detect new protein complexes. Then, NN uses the new predicted complexes to adjust the parameters of the model recursively until the model converges. The final converged neural network model is used to guide the search process for detecting protein complexes. ClusterEPs [8] defines an integrative score of emerging patterns (EPs) to measure the likelihood of a subgraph being a complex.
The supervised methods extract features from true complexes and learn a prediction model, and then they use the model as a guide in the protein complex search process. However, the PPI data contain considerable amounts of noise, and many of the benchmark clusters are incomplete; thus, the trained prediction model is inaccurate. The existing supervised methods only use the prediction model to guide the search process for detecting protein complexes. In this paper, we first define a new score function that combines a supervised model with unsupervised structural information. Based on this score function, we propose a search algorithm that works both forwards and backwards to identify protein complexes from PPI networks. We use a neural network as the classification model, and we adjust the output of the neural network at each step using the local structural information. Our method is named Clus-terSS (clustering with supervised and structural information).
To assess the performance of ClusterSS, we compared ClusterSS with supervised, semisupervised and unsupervised learning methods. First, we compared ClusterSS with three supervised learning methods, namely, ClusterEPs [8], SCI-BN [17] and RM [7], and with the semi-supervised learning method NN [18]. The results showed that ClusterSS achieved considerably better performance (precision, recall and F1) on the commonly used DIP PPI network [19]. We then compared ClusterSS with seven unsupervised learning methods: MCL [9], MCODE [10], RNSC [11], CFinder [12,13], CMC [2], RRW [14] and ClusterONE [15]. The PPI datasets are five large-scale yeast PPIs, including Collins, Krogan core, Krogan extended, Gavin and BioGRID. The two protein complex datasets are the MIPS complex catalogue database [16] and the Saccharomyces Genome Database (SGD) [20]. Comparative experiments showed that ClusterSS achieved the highest fraction score and maximum matching ratio (MMR) score among all seven literature methods on all five PPI datasets and a higher composite score than the other methods.
In case studies, we analyzed the prediction results of ClusterSS, ClusterEPs and ClusterONE on five protein complexes. The results indicated that only ClusterSS could detect the origin recognition complex (ORC) and the Pwp2p-containing subcomplex of 90S preribosome complex completely and correctly. From the gene ontology (GO) analysis, we obtained four predicted clusters that have not previously been identified as complexes. However, their low pvalues suggest that these clusters are very likely complexes in the biological sense.
The algorithm has been implemented in Python, and both the software and source code are available from the authors.

Methods
A PPI network can be represented as an undirected graph G = (V, E, W), where V denotes the set of nodes (proteins), E denotes the set of edges (interactions), and W denotes the weights of the edges. Let S = (V S , E S , W S ) be a subgraph of G, and let N ext (G, S) be the external neighbours of S of G, which is defined as follows: Opposite to N ext (G, S), we define N int (G, S) as follows: that is, the nodes in N int (G, S) have edges with the nodes in V − V S . Illustrations of N ext (G, S) and N int (G, S) are shown in Fig 2. Our ClusterSS method includes three main steps: 1) extracting features, 2) determining the score function, and 3) searching for complexes in PPI networks. These steps are described in detail in the following subsections.

Extracting features
To measure the similarity between complexes, we represent each subgraph as a feature vector. First, we select 24 features, which are divided into 9 groups, as follows: 1) node size (the number of nodes in subgraph S), 2) graph density (the density of subgraph S), 3) degree statistics, 4) edge weight, 5) degree correlation statistics, 6) clustering coefficient statistics, 7) topological coefficients, 8) first eigenvalues, and 9) protein weight/size statistics (see S1 Table for details). Second, by performing a sequential backward feature selection, we remove two feature groups: degree correlation and protein weight/size. Thus, the feature vector contains 7 groups, which include 17 features. Finally, we extract all 17 features from the subgraphs, mapping from the true complexes in the training set, and we denote them as the positive instances. For each true complex in the training set, we produce 20 complex-unlikely random subgraphs with the same size, and we extract features from these subgraphs as the negative instances. Therefore, the negative instances are 20 times the number of positive instances and obey the same distribution. We place the positive and negative instances together and denote it as dataset D for training the supervised neural network model.

Determining the score function
For dataset D derived from the input PPI network, a neural network model is trained to fit the probability of subgraph S belonging to true complexes (as shown in Fig 3). We choose a threelayer fully connected neural network. The input layer contains 17 nodes according to the number of features in D, the hidden layer contains 9 nodes, and the output layer contains two nodes, O 1 and O 2 (the details of the parameters are shown in S1 Text). Given a subgraph S of G, we can calculate the two outputs O 1 (G, S) and O 2 (G, S). By normalizing O 1 (G, S), we obtain the supervised score, where O 1 (G, S) denotes the probability of subgraph S belonging to true complexes and O 2 (G, S) denotes the probability of subgraph S belonging to false complexes. The higher the supervised score is for subgraph S, the higher is the probability that it belongs to true complexes.
To improve the supervised model, we introduce the structural score, which was used by ClusterONE [15] to make the score function more robust and accurate. The structural score is defined as follows: where w in (G, S) denotes the total weight of edges of subgraph S of G and w out (G, S) denotes the total weight of edges between node sets V s and V − V s in G. The higher the structural score is for subgraph S, the higher is the probability that it belongs to true complexes. The final clustering score f(G, S) used to guide the search process is defined as follows: The values of f(G, S) range from 0 to 1. A larger f(G, S) suggests that S is more likely to be a protein complex.

Searching for new complexes
The clustering score f(G, S) is used as a heuristic function in the process of searching for new complexes in graph G. First, we need to determine the start nodes of the search process, and we consider them to be the initial clusters. Then, these clusters are updated according to the heuristic function f(G, S).
We choose a both forwards and backwards strategy in the complex search process. At each search step, the current candidate cluster is denoted as C. We go through all the nodes in N ext (G, C) to find a node u that maximizes the score function f(G, C[{u}) and go through all the nodes in N int (G, C) to find a node v that maximizes the score function f(G, C − {v}).
is higher than f(G, C − {v}), and v is deleted from C otherwise. The asymptotic time complexity of this process is approximately O(n 3 ); this process is very time consuming, particularly when the scale of G is relatively large. Thus, we design a trick called top-k to accelerate the search process. Let BS be the bipartite subgraph of G induced by node sets N ext (G, C) (see Eq 1) and N int (G, C) (see Eq 2), as shown in Fig 2. We sort the nodes of N ext (G, C) and N int (G, C) in descending order according to their degrees in BS, and we take the first k nodes in each of the two sets as the candidate node sets, denoted as N k ext ðG; CÞ and N k int ðG; CÞ, respectively. At each search step, the candidate nodes are selected from N k ext ðG; CÞ (or N k int ðG; CÞ) rather than N ext (G, C) (or N int (G, C)); thus, the search process becomes quicker. In fact, we take k = 5 in our experiments (S6 Table presents the running time comparison of our top-k trick). We also design a hyper-parameter α to control the growth scale of candidate complexes. The search process will stop when the new score function is less than α times the old function. A larger value of α will cause the search process to complete earlier. We set the value of α to 1.02 in all the experiments (S2 and S3 Tables present the performance comparison of ClusterSS with different values of α). The details of the algorithm are shown in Algorithm 1.
The final step is to merge the highly overlapping clusters as in [15]. We also merge each pair of cluster with an overlapping score ω [10] that is no less than the threshold of 0.9. The overlapping score of two clusters A and B is defined as follows: If we find two clusters in which their overlapping score is not less than the threshold, we merge them and add them to the protein complex candidates. This process is performed iteratively until there is no pair of clusters with an overlapping score that satisfies the threshold.

Algorithm 1 The algorithm of ClusterSS
Input: G: the PPI network; T: the training set containing known complexes; α: a hyper-parameter that controls the growth scale of candidate complexes; Output: P: the set of predicted complexes; 1: for each cluster 2 T do 2: Extract 17 features from cluster and treat them as positive instances and add them to instance set D; 3: Generate 20 subgraphs of G randomly with the same size as cluster; extract 17 features from each of them and treat them as negative instances and add them to instance set D; 4: end for 5: Train a neural network model on dataset D; then, determine the supervised score function based on Eq 3, and then obtain the adjusted score function f(G, S) based on Eq 5; 6: Find the nodes in G with a degree of greater than 1 as the start nodes and denote it as ST; 7: for each v 0 2 ST do 8: Initialize candidate cluster C = {v 0 }, and calculate the score function f(G, C); 9: repeat 10: αf(G, C)) 21: Add candidate cluster C to set P; 22: end for 23: for each pair of clusters c i and c j in P do 24: if w(c i , c j ) > 0.9 according to Eq 6 then 25: Merge c i and c j and add it to P; 26: end if 27: end for 28: return P

Results and discussion
This section consists of four parts. We first compare the performance of ClusterSS with those of supervised and semi-supervised methods. Then, we present our results of the comparison with the unsupervised methods. In the third part, we analyze two examples of detected protein complexes. Finally, we present the GO analysis on the novel protein complexes predicted by our method.

Evaluation measures and datasets
We used six PPI datasets and three benchmark protein complex datasets in all the experiments. The PPI datasets include the DIP dataset [19], the Gavin dataset [21], the Krogan core dataset [22], the Krogan extended dataset [22], the Collins dataset [23] and the Bio-GRID dataset [24].
The detailed properties of the PPI datasets are shown in Table 1. The benchmark protein complex datasets include the TAP06 [21] dataset, the MIPS dataset [16] and the SGD [20] dataset.
Similar to ClusterEPs, we use three measures, namely, precision, recall and F1-measure, to evaluate the performance of the supervised learning methods. Recall measures the ratio of complexes in the benchmark datasets that match at least one complex in the predicted protein complex datasets, and precision measures the ratio of complexes in the predicted protein complex datasets that match at least one of the complexes in the benchmark datasets. F1-measure is the harmonic mean of precision and recall. Let denote the benchmark complex datasets, and let P = {p 1 , p 2 , Á Á Á, p j , Á Á Á, p n } denote the protein complex sets predicted by a method, where b i and p j represent the i th and j th complexes in B and P, respectively, and m and n represent the number of complexes in B and P, respectively. For two protein complexes b i and p j , if the overlapping score ω(b i , p j ) [25] as defined in Eq 6 is greater than or equal to 0.25, then b i and p j are considered to be matching. Let N bp be the number of the benchmark complexes that match at least one predicted complex, and let N pb be the number of the predicted complexes that match at least one of the benchmark complexes; then, N bp and N pb are defined as follows: N bp ¼ jfbjb 2 B; 9p 2 P; oðb; pÞ ! 0:25gj; ð7Þ The precision, recall and F1-measure are defined as follows: To compare with unsupervised methods, we use three measures: fraction (Frac), geometric accuracy (ACC) and MMR [8]. The definition of fraction is the same as that of recall. ACC is the geometric mean of clustering-wise sensitivity (Sn) and clustering-wise positive predictive value (PPV) [3]. Let T be an n × m matrix, and let T ij represent the number of proteins found in both b i and p j . Then, Sn(B, P), PPV(B, P) and ACC(B, P) are calculated as follows: PPVðB; PÞ ¼ AccðB; PÞ ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi SnðB; PÞ Â PPVðB; PÞ where |b i | represents the number of proteins in complex b i . The MMR [15] is a measure that is based on the maximal one-to-one mapping between B and P, and it explicitly penalizes cases where a benchmark complex is split into two or more parts in the predicted set because only one part is allowed to match the benchmark complexes [15]. The MMR is calculated as follows: 1) construct a bipartite graph BG between B and P, in which each cluster is represented as a node; 2) for each cluster b i in B and each cluster p j in P, connect b i and p j by an edge with a weight of ω(b i , p j ) if ω(b i , p j ) > 0; 3) select disjoint edges from BG to maximize the sum of their weights; and 4) the MMR is the total weights of the selected edges divided by |B|.

Comparison with supervised and semi-supervised learning methods
In this part, we first compare the prediction performance of ClusterSS with three existing supervised methods, namely, SCI-BN [17], RM [7] and ClusterEPs [8], and with the semisupervised method NN [18] on the DIP [19] dataset, which follows the approach used by Clus-terEPs. Considering that ClusterEPs is the most recent supervised method, we subsequently compared it with ClusterSS in detail on the other five datasets, including the Gavin dataset [21], the Krogan core dataset [22], the Krogan extended dataset [22], the Collins dataset [23] and the BioGRID dataset [24].
Because the programs of SCI-BN and RM are not available, ClusterEPs compared them based on their published results; therefore, we also compared with their published results. The PPI dataset for the test is the DIP [19] dataset. SCI-BN used an SVM-based method to filter out the interactions that have a score below 1.0. RM used a GO-based method to filter out the interactions that have a GO score of less than 0.9. ClusterEPs preprocessed the PPI network using the topological clustering semantic similarity (TCSS) [26] method and filtered out the interactions that have a biological process (BP) score of less than 0.5. ClusterSS employed the same processing method as ClusterEPs.
The true protein complex datasets for the test are the two independent datasets MIPS [16] and TAP06 [21]. We removed the complexes composed of a single or pair of proteins from the two datasets. There are 195 complexes remaining in MIPS and 193 complexes remaining in TAP06 after preprocessing. There are a total of 1579 proteins in the MIPS and TAP06 complex datasets, and we extracted a PPI subgraph of these proteins from DIP. Then, we tested Clus-terSS on this PPI graph.
To assess the protein complex identification performance, we performed the experiments using MIPS as the positive training set and TAP06 as the test set and vice versa. We chose three measures, namely, precision, recall and F1, to evaluate the performance. The results are presented in Table 2. As shown in this table, when MIPS was considered as the training set and TAP06 as the test set, ClusterSS achieved the highest scores on all three measures. Specifically, the F1 measure of ClusterSS was 12.0 percentage points higher than that of ClusterEPs, 61.2 percentage points higher than that of SCI-BN and 43.1 percentage points higher than that of RM. When TAP06 was used as the training set and MIPS as the test set, the F1 measure of ClusterSS was slightly higher than that of ClusterEPs. Clus-terEPs has a higher precision score; however, ClusterSS has a considerably higher recall score. Both of these methods have higher scores compared with SCI-BN and RM on all three measures.
As a semi-supervised learning model, NN [18] was evaluated using MIPS as both the training set and test set; thus, we tested ClusterSS under the same settings. The results are presented in Table 3. As shown in this table, ClusterSS has considerably higher scores compared with NN and other supervised methods on all three measures. Specifically, the F1 measure of Clus-terSS was 90.4 percentage points higher than that of NN, 8.8 percentage points higher than that of ClusterEPs, and substantially better than those of SCI-BN and RM.
In the following, we conducted further comparisons between ClusterEPs and ClusterSS on the other five PPI datasets. ClusterEPs trained models on the training sets and then searched for complexes on the subgraphs of PPI networks. The subgraphs only consist of those proteins that exist in the training set or in the test set. We compared the performances of ClusterSS and ClusterEPs under the same conditions and same measures for a fair comparison. The measures include fraction, accuracy and MMR, and the sum of these three measures is denoted as the composite score. Because ClusterSS and ClusterEPs need negative instances selected randomly in the training process, we ran ClusterEPs and ClusterSS 20 times to calculate the average performance. We first conducted the experiment using MIPS as the positive training set and using SGD as the test set. The results are presented in Table 4. As shown in this table, ClusterSS outperformed ClusterEPs on all five datasets. We then conducted the experiment using SGD as the positive training set and using MIPS as the test set. The results are presented in Table 5.
As shown in this table, ClusterSS achieved a higher fraction score, accuracy score and composite score than ClusterEPs on all five datasets. Except for Gavin, ClusterSS achieved a higher MMR score on the other four datasets.

Comparison with unsupervised learning methods
In this part, we compare the performance of ClusterSS with seven representative unsupervised approaches: MCL [9], MCODE [10], RNSC [11], CFinder [12,13], CMC [2], RRW [14] and ClusterONE [15]. The experiment was conducted on five large-scale yeast PPI networks, including the Gavin dataset [21], the Krogan core dataset [22], the Krogan extended dataset [22], the Collins dataset [23] and the BioGRID dataset [24]. The benchmark complex sets are the MIPS dataset [16] and the SGD dataset [20]. The three evaluation measures are the Frac, the Acc and the MMR, and we denote the sum of the three measures as the composite score. For a fair comparison, all parameters of the other seven methods on every PPI dataset were the same as those used in ClusterONE. To compare with the unsupervised methods, ClusterSS searched for complexes on the entire PPI networks rather than their subgraphs. We first conducted the experiment using MIPS as the positive training set and using SGD as the test set. The results are presented in Fig 4. As shown in this figure, ClusterSS achieved the highest fraction, MMR and composite score on all five PPI datasets. We then conducted the experiment using SGD as the positive training set and using MIPS as the test set. The Detect protein complexes combining supervised method and local structural information results are presented in Fig 5. As shown in this figure, ClusterSS achieved the highest fraction score and MMR score on all five datasets. ClusterSS did not achieve the highest accuracy and composite score on the Collins and Gavin datasets, but the scores are close to the highest score and are significantly higher than those of the other six methods. We do not provide the results Detect protein complexes combining supervised method and local structural information of CFinder and CMC on the BioGRID dataset because CFinder did not provide any results within 24 hours and CMC predicted an exorbitantly large number of clusters (more than 6000) [8]. In addition, the composite scores of ClusterSS in Figs 4 and 5 are slightly lower than those in Tables 4 and 5. The main reason for this result is that ClusterSS searched for complexes on the entire PPI networks in this section, whereas it searched for complexes on the subgraphs in the previous section.

Case study
ClusterEPs and ClusterONE are the latest supervised and unsupervised protein complex detection methods; thus, we present a detailed case study of ClusterSS, ClusterEPs and ClusterONE on three non-overlapping complexes and a pair of overlapping complexes. The three nonoverlapping complexes include the retromer complex, the Pwp2p-containing subcomplex of 90S preribosome and the DASH complex. The pair of non-overlapping complexes are the RSC and the SWI/SWF complexes.
The retromer complex is a central component for eukaryotic DNA replication, and it remains bound to chromatin at replication origins throughout the cell cycle [27] and contains 6 proteins. The Krogan extended PPI dataset contains the subgraph of this complex. Figs 6-8 show the predicted subgraphs of this complex by ClusterSS, ClusterEPs and ClusterONE, respectively. As shown, ClusterSS could detect the retromer complex completely and correctly. ClusterEPs missed one protein and added three unrelated proteins. Although Cluster-ONE found all the proteins of ORC, three unrelated proteins were added in the detection result.
The Pwp2p-containing subcomplex of 90S preribosome contains 6 proteins, and the Collins PPI dataset contains the subgraph of this complex. Figs 9-11 show the predicted subgraphs of this complex by ClusterSS, ClusterEPs and ClusterONE, respectively. As shown, ClusterSS could detect the Pwp2p-containing subcomplex of 90S preribosome completely and correctly. ClusterEPs missed one protein and added fifteen unrelated proteins. Although ClusterONE found all the proteins of ORC, thirty-seven unrelated proteins were added in the detection result.
The DASH complex has been taken as a case study for ClusterONE and ClusterEPs, and the complex was embedded in the Krogan extended PPI dataset. Both ClusterONE and Clus-terEPs can detect the complex correctly and clearly. ClusterSS can also detect this complex but adds an additional protein into the prediction result. The detail prediction results are shown in S1-S3 Figs. Detect protein complexes combining supervised method and local structural information The RSC and the SWI/SNF complexes were contained in the Collins PPI dataset and were also examined as a case study using ClusterONE and ClusterEPs. All three methods, Cluster-ONE, ClusterEPs and ClusterSS, can obtain prediction results close to the true complexes. The detailed detection results are shown in S4-S6 Figs. Table 6 presents the GO analysis results of four complexes identified by ClusterSS. The match score represents the maximum overlapping score of predicted complexes with the MIPS and SGD complex datasets, which is calculated using Eq 6. The fourth column presents the minimum p-value of the matched GO terms, and the fifth column presents the corresponding descriptions.

GO analysis of the new predicted complexes
The first complex and the second complex are embedded in the Collins and Krogan extended PPI datasets, respectively, and they have no overlap with existing complexes in MIPS and SGD. The third complex and the fourth complex are embedded in the Gavin and Krogan core PPI datasets, respectively, and they have low overlapping scores with existing complexes. Table 6 presents the GO analysis results for the four predicted complexes obtained using BINGO [28]. All 6 proteins of complex-1 are enriched in 25 GO terms that are mostly related to rRNA processing, rRNA metabolic process or ncRNA processing (with a p-value < 6.88 Ã 10 −3 , and the minimum p-value is 4.22 Ã 10 −9 ). All 11 proteins of complex-2 are enriched in 6 GO terms that are mostly related to ribosome biogenesis, ribonucleoprotein complex biogenesis or cellular component biogenesis (with a pvalue < 5.25 Ã 10 −3 , and the minimum p-value is 3.11 Ã 10 −14 ). All 12 proteins of complex-3 are enriched in 30 GO terms that are mostly related to small nuclear ribonucleoprotein complex, spliceosomal complex, RNA splicing or mRNA metabolic process (with a pvalue < 7.43 Ã 10 −3 , and the minimum p-value is 3.19 Ã 10 −25 ). All 10 proteins in complex-4 are enriched in 23 GO terms that are mostly related to spliceosomal complex, RNA splicing, mRNA processing or mRNA metabolic process (with a p-value < 8.47 Ã 10 −3 , and the minimum p-value is 6.19 Ã 10 −20 ).
From the above GO analysis results, we observe that the proteins of each subgraph have close relationships according to the enriched GO terms. Although these subgraphs have not yet been characterized as complexes, they are very likely complexes in the biological sense. S8-S11 Tables provide detailed results of these GO enrichment analyses.

Conclusion
The existing protein complex detection methods can be divided into two groups: unsupervised clustering methods and supervised search methods. Unsupervised clustering methods divide the PPI network into groups based on its topological structure, and most of these Detect protein complexes combining supervised method and local structural information types of methods are sensitive to the density of the PPI network. The complexes that are too sparse or only a small part of highly dense subgraphs in the PPI network are difficult to detect using density-sensitive types of methods. The supervised search methods can learn a prediction model using the true complexes, but the noise in the PPI data and incomplete benchmark may cause the trained model to be inaccurate, which may misguide the protein complex search process.
In this paper, we provide a protein complex detection method that integrates these two types of methods by designing a score function that combines a classification model and structural information. We train a supervised neural network model on known protein complexes to obtain the supervised score, and we use a local structural score function to adjust the output of the neural network on each step of the protein complex search process. Based on the score function, we design a search method that works both forwards and backwards to detect the protein complexes. We conduct several comparative experiments on six benchmark PPI datasets and three complex datasets. Compared with the latest supervised method ClusterEPs, our method, ClusterSS, achieves a higher fraction score and composite score on all the PPI datasets under the same conditions. ClusterSS outperforms the semi-supervised method NN on the measures of precision, recall and F-measure. Compared with the unsupervised method, Clus-terSS achieves the highest fraction, MMR and composite scores on all five PPI datasets when using SGD as the test set. Finally, we provide four examples of new predicted complexes, and Detect protein complexes combining supervised method and local structural information the GO enrichment analysis shows that these complexes are very likely true complexes in the biological sense.
In future studies, we will integrate additional information, such as subcellular localization information and gene expression data, to make the classification model more accurate for detecting protein complexes.