A unified framework for link prediction based on non-negative matrix factorization with coupling multivariate information

Many link prediction methods have been developed to infer unobserved links or predict missing links from the observed network structure, which is always incomplete and subject to noise. The performance of existing methods is usually limited because their computation depends only on the input graph structure and does not consider external information. The effects of social influence and homophily suggest that both network structure and node attribute information should help to resolve the task of link prediction. This work proposes SASNMF, a unified link prediction framework based on non-negative matrix factorization that considers not only graph structure but also internal and external auxiliary information, that is, the structural latent feature information extracted from the network and the node attributes. Furthermore, three different combinations of internal and external information are proposed and input into the framework to solve the link prediction problem. Extensive experimental results on thirteen real networks (five networks with node attributes and eight without) show that the proposed framework has competitive performance compared with benchmark methods and state-of-the-art methods, indicating the superiority of the presented algorithm.


Introduction
As a very important research direction in complex networks, link prediction is attracting a large number of researchers from different disciplines, including computer science, biology, physics and sociology, because of its wide application. It aims to infer the likelihood that a link exists between two unconnected nodes by means of the known structural information in the network [1][2][3]. Link prediction can be used to explore the evolution mechanism of the network [4,5], recommend trusted partners in business trade [6], recommend travel hotspots [7,8], mine suspects in counterterrorism networks [9][10][11], analyse criminal networks [12,13] and so on.
In recent years, with the development of complex network research, many methods have been proposed to predict links for specific networks in different fields from various perspectives [14][15][16]. In simple terms, the existing methods for link prediction can be divided into three categories: unsupervised, supervised and other mixed methods. i) The unsupervised methods compute similarity scores between two nodes based on the known topological structure of the network. They are among the most widely used methods in recent years, and indices such as Common Neighbour (CN), the Adamic-Adar index (AA) and the Resource Allocation index (RA) have become the baselines for judging new methods [1]. This kind of method depends only on the known topological structure of the network. Therefore, its prediction results are easily affected by network data sparsity (the number of edges known to be present is often significantly less than the number of edges known to be absent). In fact, this is still the biggest challenge in current research on link prediction. ii) The supervised approaches, on the other hand, attempt to be directly predictive of link behaviour. They generally need to find the characteristics of the node interaction and learn latent features from the topological structure of the network [17][18][19].
Our work uses this kind of method to fuse multiple attributes and improve prediction performance. iii) The mixed methods include many approaches, such as those mainly based on probability models, perturbation-based frameworks and matrix completion. Probability models are inherently expensive in computational complexity, so their application is limited [20,21]. In addition, structural perturbation-based and matrix completion methods are the most recently proposed state-of-the-art approaches. Lü LY et al. [22] assumed that the regularity of a network is reflected in the consistency of structural features before and after a random removal of a small set of links. Based on the perturbation of the adjacency matrix, they proposed a universal structural consistency index that is free of prior knowledge of the network organisation. Furthermore, Xu XY [23] and Wang WJ et al. [24] proposed a perturbation framework based on matrix decomposition for link prediction. On the other hand, Pech Ratha et al. [25] proposed a method for link prediction based on matrix completion. Although these methods can achieve prediction tasks, they still suffer, to some extent, from a shortage of useful information. Moreover, they are always challenged by high computational costs, data sparsity and network noise. In addition, as the data scale increases, whether a proposed method is scalable, portable and robust on large-scale networks becomes a basis for evaluating the algorithm. Therefore, how to mine the network features, solve the above challenges and improve the performance of link prediction are the main concerns of this paper.
In fact, a complex network is an abstraction of the real world, in which the nodes represent entities that have very rich attribute information in the real environment. For example, individuals in online social networks have sociological characteristics such as gender, age, religious belief, educational background and hobbies. The principles of social influence and homophily show that users with similar attributes, or in some cases antithetical attributes, are likely to link to one another [26][27][28], motivating the use of attribute information for link prediction. Additionally, some previous studies have empirically demonstrated that non-topological information such as node attributes has a certain impact on the formation and evolution of social networks [29][30][31][32]. Therefore, both network structure and node attribute information can be considered when predicting links.
In recent years, with the development of other fields related to complex networks, some methods of link prediction have been proposed based on the attribute information of nodes [33,34]. These methods, such as relational learning [35][36][37], semantic mining [16,33,38], random walk [39,40] and matrix factorization [41], leverage attribute information for link prediction. However, because of the diversity and heterogeneity of the information and the differences among fusion methods, the overall effect of these algorithms is insufficient. Therefore, the algorithmic question of how to simultaneously incorporate these two sources of information remains largely unanswered. More recently, Gong N Z et al. [39] proposed an approach based on a random walk algorithm to predict links as well as to infer node attributes, but it suffers from scalability issues. Backstrom and Leskovec [42] presented a supervised random walk algorithm for link prediction, but this approach only incorporates node information for neighbouring nodes. Taking these limitations into account, we would like to consider the following questions: Can this external information about the nodes help to infer an interaction relationship between the nodes? What is the role of this external auxiliary information in predicting the interaction of nodes? How much dependency exists between external information and internal interaction? What methods of fusion are the most effective?
Because non-negative matrix factorization (NMF) [43,44] has the advantages of non-negativity, extensibility and physical interpretability, it has been widely used in the study of complex networks [45][46][47]. For example, Yang et al. [48] designed a probabilistic latent variable model that combined NMF and the block structure of matrices for link prediction, but they did not use the node attribute information. Chen BL et al. [41] proposed a non-negative matrix factorization for link prediction that combines network structure and node attribute information, but this approach does not explore in depth how structure and attribute information can be combined, and its complexity is high. As previous studies have shown, node sociological information can assist prediction, and NMF not only has the advantages of non-negativity and interpretability but can also easily integrate heterogeneous information and make multiple sources of information work together. Inspired by these advantages, in this work we use NMF to fuse heterogeneous multi-source information for the link prediction problem.
In this paper, we propose a unified framework, SASNMF, for link prediction of coupled multivariate information based on NMF. The framework combines the local information of node attributes with the global information of the topological structure to solve the link prediction problem from a new macro/micro-level perspective. Furthermore, the effects of different combinations of multivariate information on the prediction results are verified under the same framework. Experimental results on 13 real-world network datasets show that the proposed framework has competitive performance compared with baseline and several state-of-the-art algorithms, indicating the superiority of our algorithm. Specifically, this paper makes the following contributions.
First, we develop a prediction framework based on NMF, and auxiliary information from two different levels of macroscopic and microscopic information is coupled to realize the purpose of node relationship prediction.
Second, two kinds of auxiliary information are mined and used to alleviate the problem that the structural information cannot be fully utilized due to data sparsity and reduce the effect of the noise in the forecast.
Third, several different combination modes of auxiliary information are proposed, and the performance is compared and analysed separately under the same framework for the datasets with and without attributes.

Preliminaries
In this section, we first describe the problem of link prediction. In addition, we review the conventional NMF method.
Problem description. A social network can be represented as an undirected graph $G = (V, E)$, where $V = \{v_1, v_2, \cdots, v_n\}$ is the set of users (nodes) and $E \subseteq V \times V$ is the set of existing relations (edges) between users. The interaction relation between nodes is formally recorded as an adjacency matrix $A_{n \times n}$ for a network with $n$ vertices. The element in the $i$th row and the $j$th column of the matrix corresponds to the link between nodes $i$ and $j$ in the network, where $A_{ij} = 1$ if there is a link between $i$ and $j$ and $A_{ij} = 0$ otherwise. Generally, the adjacency matrix A represents the macro-relations of the network topology. The problem of link prediction is to infer the probability that a link exists between nodes x and y based on the known information in the network, and the probability is expressed as a score $P_{xy}$. The score can be viewed as the similarity of nodes x and y. The higher $P_{xy}$ is, the more similar x is to y. According to the score, all nonexistent links in the network can be sorted in descending order. The links at the top are the most likely to exist. In this paper, we compute the score $P_{xy}$ based on NMF.
To test the algorithm's accuracy, the observed links, E, are randomly divided into two parts: the training set, E train, is treated as known information, while the probe set, E test, is withheld and used only for testing in the prediction experiment. The proportion of links placed in the training set ranges from 90% down to 20%. Thus, when the training set consists of 90% of the links, the remaining 10% of the links constitute the probe set. Furthermore, in the experiments, we ran SASNMF 100 times for each network and report only the average values in this paper.
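The random division of observed links can be sketched as follows. This is a minimal illustration rather than the authors' code; the function name `split_links`, the toy edge list and the seed are hypothetical.

```python
import random


def split_links(edges, train_fraction=0.9, seed=None):
    """Randomly split the observed links into a training set and a probe set."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Toy undirected network with 10 observed links (hypothetical data).
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
         (2, 4), (3, 4), (3, 5), (4, 5), (0, 5)]

e_train, e_test = split_links(edges, train_fraction=0.9, seed=0)
```

Repeating the split with different seeds and averaging the resulting scores mirrors the repeated runs described above.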
NMF review. Given a matrix $V \in \mathbb{R}_{+}^{n \times m}$, NMF aims to find two nonnegative factor matrices $W \in \mathbb{R}_{+}^{n \times k}$ and $H \in \mathbb{R}_{+}^{k \times m}$ such that $V \approx V' = WH$. In general, $k$, with $(m + n)k \ll mn$, is the number of latent features or the inner rank of V. The matrix W is called the basis matrix, and H is the coefficient matrix. Each column vector of the original matrix V is approximated by a weighted sum of all column vectors of W, where the weights are the elements of the corresponding column vector of H.
The optimization problem of NMF is a nonconvex optimization problem [49]. Because of its NP-hardness and the lack of appropriate convex formulations, nonconvex formulations that are relatively easy to solve are generally adopted, and only local minima are achievable in a reasonable computational time. Hence, the classic and more practical approach is to perform alternating minimization of a suitable cost function that measures the similarity between V and the product WH [44]. In this paper, our goal is to find $V'$ as an approximation of V to implement the task of link prediction. Then, the problem of link prediction in networks can be cast as the following NMF problem:
$$\min_{W \ge 0,\ H \ge 0} \ell(V, WH) \qquad (1)$$
where $\ell(\cdot, \cdot)$ is a general loss function. Generally speaking, the Euclidean distance is commonly used as this function. Given two matrices X and Y, according to the definition of the Euclidean distance, this loss function can be written in the following form:
$$\ell(X, Y) = \|X - Y\|_F^2 \qquad (2)$$
In this work, we also make use of this Euclidean loss. Then, our problem of link prediction is to solve the following optimization problem:
$$\min_{W, H} \|V - WH\|_F^2 \quad \text{s.t. } W \ge 0,\ H \ge 0 \qquad (3)$$
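To make the alternating minimization of the Euclidean loss concrete, the sketch below uses the classic multiplicative update rules for NMF. It is an illustrative example only: the function name `nmf`, the iteration count, the small constant `eps` and the toy matrix are assumptions, not settings taken from this paper.

```python
import numpy as np


def nmf(V, k, n_iter=500, eps=1e-10, seed=0):
    """Approximate V by W @ H with nonnegative factors (Euclidean loss)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k))  # nonnegative random initialisation
    H = rng.random((k, m))
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative;
        # eps only guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy nonnegative matrix (hypothetical data).
V = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

W, H = nmf(V, k=2)
error = np.linalg.norm(V - W @ H) ** 2  # squared Frobenius reconstruction error
```

Each update rescales the current factor elementwise, so the iteration can only reach a local minimum, in line with the nonconvexity discussed above.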

Prediction framework: SASNMF
Because of data sparsity, and because the observed links are only a small proportion of all possible links, methods that rely solely on network structural information suffer from low prediction accuracy. As discussed in the introduction, the influence of data sparsity can be alleviated and the link prediction accuracy can be improved by using the auxiliary information of the network. Therefore, in this paper, we attempt to fully integrate the auxiliary information to make up for the incomplete topology information so that the prediction performance is improved. Following the NMF algorithm, we use the adjacency matrix $A_{n \times n}$, which represents the macroscopic information of the network topology, and the auxiliary attribute similarity matrix $S_{n \times n}$, which represents the microscopic information, to create the NMF framework. Here, we need to find two nonnegative factor matrices W and H that satisfy $V \approx WH$. Thus, the matrix A is decomposed into
$$A \approx W_1 H_1$$
In the same way, the similarity matrix S is decomposed into
$$S \approx W_2 H_2$$
Then, we map these two pieces of information into two low-rank approximation spaces, in which $W_1$ and $W_2$ represent the bases of their latent spaces. According to formula (3), we have
$$\min_{W_1 \ge 0,\ H_1 \ge 0} \|A - W_1 H_1\|_F^2 \qquad (4)$$
$$\min_{W_2 \ge 0,\ H_2 \ge 0} \|S - W_2 H_2\|_F^2 \qquad (5)$$
However, our goal is to develop an indicator that can couple multivariate information to help improve the accuracy of link prediction. Therefore, formulas (4) and (5) are combined into the following new form:
$$\min_{W_1, W_2, H_1, H_2 \ge 0} \|A - W_1 H_1\|_F^2 + \|S - W_2 H_2\|_F^2 \qquad (6)$$
The information in formula (6) is only a simple combination of the topological structure and the auxiliary attributes, and the two are not fully integrated into the same feature space. Therefore, we need to find a common factor matrix W that combines this information and can then guide the processing of the link prediction problem.
That is, we develop a framework for link prediction that employs a low-rank latent feature space representation to predict the network structure and to supplement the information missing from the network. Furthermore, we let $W = W_1 = W_2$ to indicate that the two pieces of information in the network are mapped into the same feature space. At the same time, to avoid overfitting and to balance the influence of the topology information and the auxiliary attribute information on the link prediction results, we constrain the framework with two parameters. Finally, the objective function Q is given by formula (7), in which α is an equilibrium parameter that mediates the effect of structure and attributes, and β is a regularization parameter that prevents overfitting.
Although it is difficult to obtain the global optimal solution of Q, a local optimum can be obtained by a multiplicative iteration method. To solve (7), we introduce the Lagrangian multipliers ψ, φ and ϕ for the nonnegativity of W, $H_1$ and $H_2$ and obtain the loss function L without constraints. Then, we take the partial derivatives of L with respect to W, $H_1$ and $H_2$. Using the Karush-Kuhn-Tucker (KKT) complementary slackness conditions $\psi W = 0$, $\varphi H_1 = 0$ and $\phi H_2 = 0$ (understood elementwise), and letting $\partial L / \partial W = 0$, $\partial L / \partial H_1 = 0$ and $\partial L / \partial H_2 = 0$, we can derive multiplicative updating rules for W, $H_1$ and $H_2$, in which .* and ./ represent elementwise multiplication and division, respectively. The score between nodes can be obtained from W and $H_1$. Then, we can predict the edges.
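As a worked illustration of this derivation, the sketch below assumes that the objective couples the two factorizations through the shared W and uses Frobenius-norm regularization on all three factors. The exact regularizer and constants are assumptions and may differ from formula (7); $\odot$ and the fraction bars denote elementwise multiplication and division.

```latex
\begin{aligned}
Q &= \|A - WH_1\|_F^2 + \alpha \|S - WH_2\|_F^2
     + \beta \left( \|W\|_F^2 + \|H_1\|_F^2 + \|H_2\|_F^2 \right),
     \qquad W, H_1, H_2 \ge 0 \\[4pt]
\frac{\partial Q}{\partial W} &= -2AH_1^{T} + 2WH_1H_1^{T}
     - 2\alpha SH_2^{T} + 2\alpha WH_2H_2^{T} + 2\beta W \\
\frac{\partial Q}{\partial H_1} &= -2W^{T}A + 2W^{T}WH_1 + 2\beta H_1 \\
\frac{\partial Q}{\partial H_2} &= -2\alpha W^{T}S + 2\alpha W^{T}WH_2 + 2\beta H_2 \\[4pt]
W &\leftarrow W \odot
     \frac{AH_1^{T} + \alpha SH_2^{T}}{WH_1H_1^{T} + \alpha WH_2H_2^{T} + \beta W} \\
H_1 &\leftarrow H_1 \odot \frac{W^{T}A}{W^{T}WH_1 + \beta H_1} \\
H_2 &\leftarrow H_2 \odot \frac{\alpha W^{T}S}{\alpha W^{T}WH_2 + \beta H_2}
\end{aligned}
```

Under this assumed form, each update places the negative gradient terms in the numerator and the positive ones in the denominator, which is what the complementary slackness conditions yield when the multipliers are eliminated.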
To sum up, the pseudo code of the proposed link prediction algorithm based on NMF with coupling multivariate information is described as follows:
Algorithm Name: SASNMF
Input: A: the adjacency matrix of the given network; S: the auxiliary information matrix; k: number of features; α and β: parameters.
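A runnable sketch of such a procedure is given below. It is not the authors' implementation: it assumes the objective ||A - W H1||_F^2 + α ||S - W H2||_F^2 + β (||W||_F^2 + ||H1||_F^2 + ||H2||_F^2), and the function names, default parameter values, stopping rule details and toy matrices are all hypothetical.

```python
import numpy as np


def objective(A, S, W, H1, H2, alpha, beta):
    """Assumed objective: fit to A, alpha-weighted fit to S, Frobenius penalty."""
    fit = np.linalg.norm(A - W @ H1) ** 2 + alpha * np.linalg.norm(S - W @ H2) ** 2
    reg = np.linalg.norm(W) ** 2 + np.linalg.norm(H1) ** 2 + np.linalg.norm(H2) ** 2
    return fit + beta * reg


def sasnmf(A, S, k, alpha=1.0, beta=0.1, max_iter=500, tol=1e-6, eps=1e-10, seed=0):
    """Multiplicative updates with a basis W shared by A and S (sketch)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    W = rng.random((n, k))
    H1 = rng.random((k, n))
    H2 = rng.random((k, n))
    prev = objective(A, S, W, H1, H2, alpha, beta)
    for _ in range(max_iter):
        W *= (A @ H1.T + alpha * (S @ H2.T)) / (
            W @ (H1 @ H1.T + alpha * (H2 @ H2.T)) + beta * W + eps)
        H1 *= (W.T @ A) / (W.T @ W @ H1 + beta * H1 + eps)
        H2 *= (alpha * (W.T @ S)) / (alpha * (W.T @ W @ H2) + beta * H2 + eps)
        cur = objective(A, S, W, H1, H2, alpha, beta)
        # Stop when the relative change of the objective falls below tol.
        if abs(prev - cur) <= tol * max(prev, eps):
            break
        prev = cur
    return W, H1, H2


# Toy network: two triangles joined by one link (hypothetical data).
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
# Toy auxiliary similarity: nodes in the same triangle are fully similar.
S = np.kron(np.eye(2), np.ones((3, 3)))

W, H1, H2 = sasnmf(A, S, k=2)
P = W @ H1  # score matrix; P[x, y] ranks the candidate link (x, y)
```

The demo values of α and β are chosen for the toy matrices only; suitable values depend on how A and S are scaled.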

Computational complexity analysis
The computational complexity of the SASNMF algorithm mainly comes from two parts. One is the extraction of auxiliary information, including external auxiliary information from node sociological attributes and internal auxiliary information extracted from the topological structure. The other is the iterative update of the matrices W, $H_1$ and $H_2$.
Given an attributed network with n nodes and m attributes, the attribute similarity matrix $S_{n \times n}$ is obtained by applying cosine similarity to the attribute vectors of the nodes. Treating the number of attributes m as a constant, the time complexity is $O(n^2)$. Similarly, the time complexity of extracting the internal auxiliary information from the topological structure is also $O(n^2)$.
When updating W, $H_1$ and $H_2$, to reduce the time overhead, we use the relative error of the objective as the stopping criterion, with a threshold of $10^{-6}$ in the experiments. In addition, because the factor matrices have k latent dimensions, each update takes $O(n^2 k)$ time. So the total time cost of the algorithm is $O(n^2 + n^2 + n^2 k)$. Since k can be treated as a constant, the complexity of this step is $O(n^2)$. To sum up, the computational cost of our approach is close to $O(n^2)$.
Of course, the algorithm can also be adapted for parallel computing according to the relevant literature [50] to improve its performance. We leave this for future work.

Auxiliary information preprocessing
Here, we propose that the auxiliary information can be derived not only from external data but also from internal network structure information. SASNMF allows us to model such information directly in the framework to enhance the prediction performance. To distinguish the sources of multivariate auxiliary information, we call the information extracted from the network structure internal auxiliary information and the attributes of nodes external auxiliary information.
An essential part of our work is the preprocessing of the external auxiliary information, that is, the node properties. To protect the privacy of users, this information has been anonymized. During preprocessing, some attribute values, such as age, are used directly as measured. Others, such as religious belief, are assigned a value within a specified numerical range. In addition, the values 0 and 1 are used to express attributes with two states. We use a vector $Z_m$ to denote the m attributes of a node. All of the node attribute information in network G is represented as a matrix $Z_{n \times m}$, whose element $Z_{ij}$ is the value of the $j$th attribute of the $i$th node. However, owing to the heterogeneity of node attributes, a simple linear combination of the raw attributes cannot bring out their indicative effect on the prediction results. Therefore, all of the attributes are normalized column by column in the attribute matrix. Even after this processing, the attribute matrix on its own is still of little use for prediction. Therefore, it is necessary to calculate the similarity between the attribute vectors of each pair of nodes and to form the attribute similarity matrix before it can be applied to the prediction framework. The Euclidean distance, cosine similarity or the Pearson correlation can be used to compute the similarity between attributes. These three common similarity measures were tested and analysed. Finally, we use the cosine-based measure of similarity, $S_{ij} = \frac{Z_i \cdot Z_j}{\|Z_i\| \, \|Z_j\|}$, where $Z_i$ is the normalized attribute vector of node i.
The internal auxiliary information is the latent feature of a node, that is, the local structure information of the nodes themselves, which needs to be extracted from the input network by unsupervised structural similarity methods. In this work, to analyse the influence of node latent features on the prediction performance, we employ seven similarity indices to compute the structural similarity score, Sim, between any two nodes as the internal auxiliary information. Furthermore, the prediction performance obtained with node attributes is compared with that obtained with structure information.
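The attribute preprocessing can be sketched as below. Because the normalization formula is not shown here, the example assumes a simple max-abs scaling of each column before cosine similarity is applied; this choice, the function name `attribute_similarity` and the toy attribute matrix are assumptions for illustration only.

```python
import numpy as np


def attribute_similarity(Z, eps=1e-12):
    """Column-normalise an n x m attribute matrix, then return cosine similarity."""
    Z = np.asarray(Z, dtype=float)
    # Assumed normalisation: divide each column by its largest absolute value.
    col_max = np.abs(Z).max(axis=0)
    Zn = Z / np.where(col_max > 0, col_max, 1.0)
    # Cosine similarity between the normalised attribute vectors (rows).
    norms = np.linalg.norm(Zn, axis=1, keepdims=True)
    unit = Zn / np.maximum(norms, eps)
    return unit @ unit.T


# Toy attributes per node: age, and two binary indicators (hypothetical data).
Z = [[30, 1, 0],
     [60, 1, 0],
     [30, 0, 1]]

S = attribute_similarity(Z)  # S[i, j] is the similarity of nodes i and j
```

With nonnegative attributes, the resulting entries lie between 0 and 1, so S can be passed to a nonnegative factorization without further shifting.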

Multivariate information combination mode
To test the effectiveness of different coupling modes of auxiliary information and to analyse their influence on prediction, we propose the following combination methods.
i. A+S mode: the adjacency matrix A and the external auxiliary information S are combined as input to the proposed framework. This method is directly marked as SASNMF.
ii. A+Sim mode: the adjacency matrix A and the internal auxiliary information Sim are combined as input to the proposed framework. Sim takes the place of the matrix S in the framework. Thus, this method is marked as *+SASNMF, where * represents any similarity method.
iii. Sim+S mode: the adjacency matrix A is replaced by the internal auxiliary information Sim. This method is marked as A(=*)+SASNMF, where * represents any similarity method.
For the two types of network datasets, the second combination method, ii), is the only one used for networks without node attributes, while all of the methods are used for networks with real-world node attributes. Our experiments show that both types of auxiliary information can increase the performance of link prediction.

Datasets description
We consider the following 13 real-world networks drawn from disparate fields. Among them, five contain external attributes, and we generate internal attributes for all of them.
The five networks with external attribute information: i) Lazega-lawyers [51]: The network is a social network between 71 partners and associates in some New England law firms. In addition, each entity in the network is described by features such as gender, office location, age and years employed. We preprocessed the features (for example, binarizing features such as age and years employed) and then constructed a kernel matrix of pairwise similarities. In this article, we use seven attributes in the calculation. ii) Facebook [52]: The network is extracted from the Facebook online social network. A user can provide profile information (e.g., age, gender, education and information). By selecting some informative attributes from this profile information, we create a feature vector for each user. iii) WebKB [53]: The network consists of 4 subnetworks (Cornell, Texas, Washington and Wisconsin) gathered from 4 universities. Each node represents a webpage that is annotated with 1703-dimensional binary-valued word attributes. The first three subnetworks are used in our experiments.
The eight networks without external attribute information: i) Karate [54]: a social network of friendships between 34 members of a karate club at a US university in the 1970s; ii) Jazz [55]: a jazz musician network in which a link denotes that two musicians played together in the same band; iii) USAir
The basic topological features of these networks are summarized in Table 1. N and E are the total numbers of nodes and links, respectively. <K> is the average degree, <d> is the mean shortest distance, C is the clustering coefficient, and #attributes is the number of node attributes.

Evaluation metrics
Like many existing prediction studies [1], our work also adopts the most frequently used metric, AUC (area under the ROC curve), to measure the performance of link prediction [60]. This metric is viewed as a robust measure in the presence of data imbalance [19].
The AUC can be interpreted as the probability that a randomly chosen missing link (a link in E test) is given a higher score than a randomly chosen nonexistent link (a link in U\E, where U denotes the universal set). In the implementation, among n independent comparisons, if there are $n'$ occurrences of the missing link having a higher score and $n''$ occurrences of the missing link and the nonexistent link having the same score, we define the accuracy as:
$$AUC = \frac{n' + 0.5\,n''}{n}$$
If all the scores are generated from an independent and identical distribution, the accuracy should be approximately 0.5. Therefore, the degree to which the accuracy exceeds 0.5 indicates how much better the algorithm performs than pure chance.
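The comparison procedure can be sketched as follows, using the usual form of this measure in which a win counts 1 and a tie counts 0.5 out of n comparisons. The function name `auc_by_sampling`, the dictionary-based score lookup and the toy scores are assumptions for illustration.

```python
import random


def auc_by_sampling(score, probe_links, nonexistent_links, n=10000, seed=0):
    """Estimate AUC from n random (missing link, nonexistent link) comparisons."""
    rng = random.Random(seed)
    n_higher = 0  # comparisons in which the missing link scores higher
    n_equal = 0   # comparisons in which both links score the same
    for _ in range(n):
        missing = rng.choice(probe_links)
        absent = rng.choice(nonexistent_links)
        if score[missing] > score[absent]:
            n_higher += 1
        elif score[missing] == score[absent]:
            n_equal += 1
    return (n_higher + 0.5 * n_equal) / n


# Toy scores (hypothetical): both probe links outrank both nonexistent links.
score = {(0, 1): 0.9, (2, 3): 0.8, (0, 2): 0.1, (1, 3): 0.2}
probe = [(0, 1), (2, 3)]
absent = [(0, 2), (1, 3)]

auc = auc_by_sampling(score, probe, absent)  # 1.0 for this perfect ranking
```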
In addition, we adopt the Precision metric, which is also one of the most popular evaluation indices for link prediction [61]. Given the ranking of the non-observed links in decreasing order of their scores, the precision is defined as the ratio of relevant items selected to the number of items selected. That is to say, if we take the top-L links as the predicted ones, among which $\ell$ links are right, then
$$Precision = \frac{\ell}{L}$$
Clearly, a higher value of precision means a higher prediction accuracy. Although the result for a single algorithm depends on the value of L, the same value of L can be used for all comparison algorithms to ensure fairness. This value does not affect the final comparison. Therefore, in our work, for the convenience of comparison, all the algorithms use L = 100.
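A minimal sketch of the top-L precision computation is shown below. The function name `precision_at_L` and the toy scores are hypothetical; the score dictionary is assumed to cover all non-observed links, that is, probe links and nonexistent links together.

```python
def precision_at_L(score, probe_links, L=100):
    """Fraction of the L highest-scored non-observed links that are probe links."""
    ranked = sorted(score, key=score.get, reverse=True)[:L]
    probe = set(probe_links)
    hits = sum(1 for link in ranked if link in probe)
    return hits / L


# Toy scores for four non-observed links (hypothetical data).
score = {(0, 1): 0.9, (0, 2): 0.8, (1, 3): 0.4, (2, 3): 0.3}
probe = [(0, 1), (2, 3)]

precision = precision_at_L(score, probe, L=2)  # one of the top two is a probe link
```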

Comparison methods
In this section, we mainly evaluate the performance of our algorithm. According to the coupling mode of the multivariate information, our methods are denoted SASNMF and *+SASNMF. More specifically, there are three types of coupling mode for auxiliary information in our framework, namely: i) global network structure information coupled with external auxiliary information from node attributes (A+S); ii) global network structure information coupled with internal auxiliary information from local structural latent features (A+Sim); iii) internal auxiliary information from local structural latent features fused with external auxiliary information from node attributes (Sim+S).
To analyse the performance of the proposed algorithm, we adopt two kinds of comparison methods. The first kind is baseline algorithms, such as CN and AA, which are often used as benchmarks for evaluating new approaches. We use seven of them here. In this work, they are also used to extract local structural latent features of nodes that act as internal auxiliary information.
The second kind is several state-of-the-art methods. These are divided into two categories: methods that adopt both structural information and node attribute information, and methods that use only structural information.

Baseline methods
We list four types of link prediction methods as the baseline methods, including five local algorithms based on the number of common neighbours between pairs of nodes (CN, AA, RA, Salton and Jaccard), a global random walk method (ACT), a path-based method (Katz) and the NMF method based on matrix factorization with the Frobenius norm. The mathematical expressions of these methods are shown in Table 2. Their detailed definitions can be found in refs. [1][2][3] and [43].

Common neighbour (CN): $S_{xy} = |\Gamma(x) \cap \Gamma(y)|$, where $\Gamma(x)$ denotes the set of neighbours of node x, $|\cdot|$ is the cardinality of a set, and $k(x)$ is the degree of node x.
For ACT, $l^{+}_{xy}$ represents the elements of matrix $L^{+}$, the pseudoinverse of the Laplacian matrix.
Link prediction based on NMF with multivariate attributes
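The five common-neighbour-based indices can be computed from the adjacency matrix as sketched below. Only the CN expression appears in the text here; the AA, RA, Salton and Jaccard expressions used in the sketch are the standard textbook definitions, and the function name `local_similarity` and the toy graph are hypothetical.

```python
import numpy as np


def local_similarity(A, index="CN"):
    """Pairwise local similarity scores from a symmetric 0/1 adjacency matrix."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)  # node degrees
    cn = A @ A         # cn[x, y] = number of common neighbours of x and y
    if index == "CN":
        return cn
    if index == "AA":  # each common neighbour z contributes 1 / log k(z)
        w = np.zeros_like(k)
        w[k > 1] = 1.0 / np.log(k[k > 1])
        return (A * w) @ A
    if index == "RA":  # each common neighbour z contributes 1 / k(z)
        w = np.zeros_like(k)
        w[k > 0] = 1.0 / k[k > 0]
        return (A * w) @ A
    if index == "Salton":     # |CN| / sqrt(k(x) k(y))
        denom = np.sqrt(np.outer(k, k))
    elif index == "Jaccard":  # |CN| / |union of the two neighbourhoods|
        denom = k[:, None] + k[None, :] - cn
    else:
        raise ValueError(f"unknown index: {index}")
    return np.divide(cn, denom, out=np.zeros_like(cn), where=denom > 0)


# Toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

sim = {name: local_similarity(A, name)
       for name in ("CN", "AA", "RA", "Salton", "Jaccard")}
```

Any of these matrices can serve as the Sim input described above; Katz and ACT need a matrix inverse or pseudoinverse and are omitted from the sketch.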

State-of-the-art methods
Apart from the baseline methods, we also compare the performance of the proposed SASNMF method with three state-of-the-art competitive algorithms. The structural perturbation method (SPM) based on nonnegative matrix factorization [24] is based on the perturbation of the adjacency matrix and assumes that the regularity of a network is reflected in the consistency of structural features before and after a random removal of a small set of links. In particular, it has been reported to outperform state-of-the-art link prediction methods in both accuracy and robustness [22,23]. For SPM, we use the NMF-D1 variant with random deletion perturbation; the perturbation ratio is 0.04, and the default number of perturbations is 20.
Matrix completion (MC) [25] is a global information-based prediction algorithm that relies on the low-rank and sparse properties of the adjacency matrix. It employs the robust principal component analysis method, minimizing the nuclear norm of the matrix that fits the training data, to reconstruct a network that is close to the original network and thereby identify the missing links. For MC, in addition to the values of the parameter λ provided in the literature, we also tune the parameter and select the best value. The parameter values of this method are given in the S1 File.
In addition, Chen BL et al. [41] proposed a link prediction method based on NMF (NMF-LP), which adopted node attributes. Therefore, we compare this method with our framework.

Experiments results
Parameter setting: To achieve good prediction results, we analysed the sensitivity of the model parameters α and β before the main experiments. We set the proportion of the training set to 0.9 and varied each of the two parameters from 1 to 100, using the widely used evaluation indices AUC and Precision for link prediction as evidence. The values of AUC and precision were calculated on the 13 networks and compared with each other, and the optimal range of the parameters was gradually narrowed down. Furthermore, we selected five networks, namely four networks with node attributes (Lazega, Facebook, Cornell and Texas) and one network without attributes (Karate), and analysed the sensitivity of link prediction performance to α and β in a smaller range. As shown in Fig 1, the performance on Lazega, Facebook, Cornell, Texas and Karate gradually becomes stable. Although different settings of α and β have a significant influence on the prediction results, our framework still performs better than the baseline methods. Without loss of generality, we set α = 4 and β = 32 in the subsequent experiments.
Using the optimized parameters, in this section we show the AUC and precision results of our proposed methods based on NMF with coupling multivariate information and of the other comparison methods on the 13 real networks in Tables 3-6. Tables 3 and 4 show the results calculated on the five networks with external auxiliary information (namely, node attributes), while Tables 5 and 6 show the eight networks with only internal information. To facilitate comparison, we add a Mode column to the tables and group the rows by combination mode and comparison method. In the four tables, the observed links of every dataset are partitioned into a training set (90%) and a probe set (10%). From these tables, we can see that the prediction results of the various combination modes under the SASNMF framework are significantly better than those of the other comparison methods. In addition, the methods using external auxiliary information are generally superior to the baseline methods that use only structure information.
To further test the overall prediction effect of the three proposed combination modes, Fig 2 gives the precision and AUC results on real networks for four baseline methods only: AA, CN, RA and Salton. Here, each baseline method and its two combinations, namely A+Sim and Sim+S, are compared with SASNMF.
Similarly, to compare the overall performance of the combined mode A+Sim with the baseline methods and the state-of-the-art methods on the 13 real networks, we consider four baseline methods (AA, CN, RA and Salton) and their combined modes. The AUC and precision results are shown in Figs 3 and 4.
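For reference, the four baseline indices can be written in a few lines. The sketch below uses their standard definitions for an undirected, unweighted graph (CN: number of common neighbours; AA: sum of 1/log k_z over common neighbours z; RA: sum of 1/k_z; Salton: CN divided by the square root of the degree product). The helper names and the toy graph are illustrative.

```python
import math


def neighbors(edges):
    """Build an adjacency map {node: set of neighbours} from undirected links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def cn(adj, x, y):
    """Common Neighbours: number of nodes adjacent to both x and y."""
    return len(adj[x] & adj[y])


def aa(adj, x, y):
    """Adamic-Adar: common neighbours weighted by 1 / log(degree).

    A common neighbour has degree >= 2, so the logarithm is always positive.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[x] & adj[y])


def ra(adj, x, y):
    """Resource Allocation: common neighbours weighted by 1 / degree."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])


def salton(adj, x, y):
    """Salton (cosine) index: CN normalized by sqrt(k_x * k_y)."""
    return len(adj[x] & adj[y]) / math.sqrt(len(adj[x]) * len(adj[y]))


# Toy graph: nodes 0 and 1 share the neighbours 2 (degree 2) and 3 (degree 3).
adj = neighbors([(0, 2), (1, 2), (0, 3), (1, 3), (3, 4)])
print(cn(adj, 0, 1), aa(adj, 0, 1), ra(adj, 0, 1), salton(adj, 0, 1))
```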
In addition, the choice of rank in the matrix factorization is important because it affects the prediction result, and the number of latent features k in the SASNMF framework differs for each dataset. To illustrate this, the results for different values of k on the Lazega-lawyer dataset are shown in Fig 6. In that figure, the training-set ratio ranges from 90% to 20%, and only one network dataset, Lazega-lawyer, is used.
As seen in Figs 2 and 3, the methods with modes A+S, A+Sim and Sim+S perform better than the corresponding benchmark methods. In particular, within our framework, the prediction effect of using node attributes as auxiliary information is competitive with that of the baseline methods.
To better test the extensibility and robustness, Fig 5 shows the results under different training-set ratios. We find that the performance of all methods declines markedly as the E train ratio decreases. However, the decline is gentler under the SASNMF method. Moreover, considering the results over the whole range of dataset partitions, its prediction effect is clearly superior to that of the other baseline methods. This finding indicates that methods relying only on structural information predict worse as the number of links in the training set decreases. Our framework can alleviate this data sparsity problem by coupling multivariate auxiliary information. In particular, on the Lazega-lawyer and Facebook datasets, the results obtained with SASNMF are clearly better than those of the other comparison methods. Although the precision on the Cornell, Texas and Washington datasets is inferior to that of AA and RA, our model is far better than these two methods under the corresponding AUC evaluation. Overall, the effect of our method is good under the AUC index.
Why, then, does our method not work as well on these three datasets? Through in-depth analysis, we believe that the main reason lies in the attribute information. The attribute values in these three datasets are simple binary indicators of whether particular words appear in the article, whereas the attribute values of the first two datasets are true social attributes. Therefore, the attributes of these three networks cannot be said to reflect the true similarity between nodes as well.
In addition, the number of latent features k in the SASNMF framework differs for each dataset, and determining k is an important and difficult problem in matrix factorization. Fig 6 shows the results under different k for the Lazega-lawyer dataset only. Because this is not our primary focus in this paper, we adopt an easy and effective method for the automatic determination of k, Colibri [62], which seeks a nonorthogonal basis by sampling the columns of the input matrix. However, to observe the influence of different k on the prediction effect, we provisionally take several values of k subject to the constraint k(m + n) ≪ mn. Because the adjacency matrix A is symmetric here (m = n), k is far less than n/2. Fig 6 shows that the choice of k has a clear influence on the prediction results.
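The bound on k follows from counting stored entries: the two factors of an m × n matrix hold k(m + n) values in total, so the factorization is only a compression when k(m + n) < mn, that is, k < mn / (m + n), which reduces to k < n/2 when m = n. A small sketch of this arithmetic, with a hypothetical helper name and an arbitrary example size:

```python
def max_rank(m, n):
    """Largest integer k satisfying the strict inequality k * (m + n) < m * n."""
    return (m * n - 1) // (m + n)


# For a symmetric 70 x 70 adjacency matrix (size chosen only for illustration),
# the bound is n / 2 = 35, so the largest admissible integer rank is 34.
n = 70
k_max = max_rank(n, n)
print(k_max, k_max * (n + n) < n * n, (k_max + 1) * (n + n) < n * n)
```

This gives only an upper limit; as the text notes, the values of k actually used are far smaller than it.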

Conclusion
In recent years, link prediction based on network topology has been one of the research hotspots in the field of data mining. However, in many instances, algorithms that use only network structure do not provide the precision needed for link prediction. At present, with the development of the mobile Internet, the increasingly rich descriptive information owned by the entities in a network is becoming an asset to be used. Inspired by this, and building on the advantages of NMF such as interpretability, nonnegativity and information fusion, this paper proposes a unified framework for link prediction. In this framework, the adjacency matrix A, which represents the macroscopic information of the network topology, and the auxiliary information matrix S, which represents the microscopic information of the network, are mapped to the same low-rank latent feature space to couple the multivariate information. The link prediction task is then carried out by merging them into a prediction matrix from which the missing relationships of the network can be inferred. At the same time, to further analyse the usability of the network auxiliary information, we not only use the external attributes of the nodes but also explore the latent features of the nodes, which are extracted as internal auxiliary information by traditional structural similarity indices from local and global perspectives. On the basis of this multivariate information, we further propose three different combinations. We used these three combination forms as instances of the proposed framework, and the experiments show the feasibility, effectiveness and competitiveness of the framework.
Moreover, extensive experiments on five networks with sociological node attributes and eight networks without node attributes show that, across the different combination patterns we propose, the prediction performance under this unified framework is on the whole competitive with seven baseline methods and three state-of-the-art methods. This finding demonstrates that the proposed framework has advantages in combining structure and attribute information for link prediction. Furthermore, because the framework is based on NMF, it is easy to extend to directed and weighted networks by letting the matrix V be directed and weighted.
Our proposed framework still has some limitations that call for further study. One is how to set the parameters α and β adaptively for different networks. Furthermore, we will extend our methods to more general situations, such as edge attributes, combined attributes of edges and nodes, and dynamic network link prediction. Designing efficient methods to solve these issues will be interesting.
Supporting information
S1 File. This is the data source for Figs 4 and 5. (XLSX)