A multi-attribute method for ranking influential nodes in complex networks

  • Adib Sheikhahmadi,

    Roles Data curation, Formal analysis, Methodology, Software, Visualization, Writing – original draft

    Affiliation Department of Computer Engineering, Sanandaj Branch, Islamic Azad University, Sanandaj, Iran

  • Farshid Veisi,

    Roles Resources, Software, Visualization, Writing – review & editing

    Affiliation Department of Computer Engineering, Sanandaj Branch, Islamic Azad University, Sanandaj, Iran

  • Amir Sheikhahmadi,

    Roles Conceptualization, Methodology, Resources, Supervision, Validation, Writing – original draft, Writing – review & editing

    asheikhahmadi@iausdj.ac.ir

    Affiliation Department of Computer Engineering, Sanandaj Branch, Islamic Azad University, Sanandaj, Iran

  • Shahnaz Mohammadimajd

    Roles Data curation, Formal analysis, Resources, Validation

    Affiliation Department of Mathematics, Sanandaj Branch, Islamic Azad University, Sanandaj, Iran

Abstract

Calculating the importance of influential nodes and ranking them by their diffusion power is an open problem and an active research area in complex networks. Because networks contain many nodes and many relationships between them, it is essential to identify attributes that can compute and rank the diffusion power of nodes with high accuracy. Most existing methods use only one structural attribute to capture the influence of individuals, which is not entirely accurate in most networks: network structures differ, so a method that works well on one network can become ineffective on another. A possible solution is to use more than one attribute, so that several aspects of each node are examined. This study therefore presents a multi-attribute decision-making approach for determining the diffusion power of nodes and ranking them by their ability to spread information. The approach uses several local and semi-local attributes with linear time complexity, each of which captures a different aspect of the network nodes. Evaluations on real network datasets demonstrate that the proposed method performs satisfactorily in allocating distinct ranks to nodes; moreover, as the infection rate increases, the accuracy of the proposed method increases.

1. Introduction

Many people use social networks to communicate with friends, exchange opinions, and share information. The appeal of these networks has encouraged companies, political figures, and others to use them for broadcasting innovations, advertising, and promoting their products [1]. Given people’s tendency to place more trust in friends and acquaintances, many companies prefer to spread their messages through individuals in a network [2]. Finding individuals who can maximize diffusion has always been of great concern to these companies [3]. Such people are referred to as influential nodes. Finding influential nodes and using them to initiate the advertising process is a remarkably effective way of increasing the number of people who become aware of the advertised content [4]. Therefore, evaluating and ranking the diffusion power of nodes in order to propagate messages in online social networks has become a critical research topic in various sciences [5]. This problem comprises two sub-problems: (1) assessing the diffusion power of network nodes and ranking users based on it, and (2) selecting an optimal subset of users to maximize the diffusion process [6]. The present study focuses on the first sub-problem. Thus far, no comprehensive and widely accepted definition of influential nodes has been presented [7]. Some studies treat nodes with high diffusion power as influential, while others treat influential nodes as opinion leaders, that is, people who can make others accept something by accepting it themselves [8]. Like many other studies, this study uses the first definition. Therefore, influential nodes are individuals who can propagate an advertisement message in the network with high diffusion power.

Many methods have been proposed to evaluate the diffusion power of network users. They primarily use structural network information because they lack access to other network information [9]. These methods consider nodes with a better position in the network to be more influential [10]. However, their main problem is selecting the proper attribute for determining the diffusion power of nodes, considering the relatively high number of nodes and connections between them [11]. Many of these methods consider a node from only one aspect and calculate its influence based on a single attribute [12]. Such methods are well suited only to some networks [13] and lose their effectiveness when the network changes.

These attributes can be local, semi-local, or global [14]. A local attribute calculates the diffusion power of a node from its immediate neighbors. In contrast, a global attribute measures the impact of a node using the information of all nodes. The third class, semi-local attributes, has been introduced as a compromise between these two groups; it takes into account information from multiple levels of a node’s neighborhood to calculate diffusion power. For large-scale complex networks, methods based on global attributes are unsuitable due to their high time complexity [15]. Local and semi-local methods are considerably faster, but a single local or semi-local attribute cannot provide sufficient accuracy across various types of networks.

Ranking influential nodes can be considered a Multi-Attribute Decision Making (MADM) problem in which the different attributes of each node serve as decision criteria. Thus, the primary hypothesis is that considering multiple local and semi-local attributes and treating them as a MADM problem can improve performance in comparison to methods that consider only one attribute. The present study proposes a method for determining and ranking the diffusion power of nodes that utilizes several different attributes. To compare and rank nodes according to these attributes, the proposed method uses the Elimination and Choice Translating Reality (ÉLECTRE) method, which belongs to a family of MADM techniques and is also known as approximate dominance. It was first introduced by Benayoun in 1966 and then developed by Roy and Van Delft. This method evaluates all options through pairwise outranking comparisons and eliminates the uninfluential ones. All of its steps are based on a concordant set and a discordant set, which gives the method its alternative name, concordance analysis. Because of the time complexity of the ÉLECTRE method, the present study employs the simplified ÉLECTRE method, which improves computational efficiency and reduces time complexity while delivering the same performance as the ÉLECTRE method. The innovations introduced in this paper are as follows:

  1. Identifying and extracting structural attributes from the network.
  2. Ranking nodes based on different aspects of the network structure using several attributes.
  3. Comparing and ranking network nodes using the simplified ÉLECTRE Multi-Attribute Decision Making method.

The rest of this paper is organized as follows. Section 2 reviews related works, and section 3 introduces the proposed method and its components. Section 4 evaluates the proposed method, and section 5 presents a summary of the work.

2. Related works

Many methods have been proposed to measure the diffusion power of the nodes in a network. Most of them use the network structure and the strategic location of nodes to determine diffusion power: the better the position of a node, the greater its diffusion power. Some of these methods are described below.

The High Degree method uses the degree of each node to calculate its centrality and assumes that nodes with a higher number of connections or friends are more influential [16]. Degree centrality uses only the local information of nodes. Closeness Centrality, a global method for identifying influential nodes in complex networks, calculates the average distance between each node and all the other nodes in the graph. The smaller the distance between a given node and the others, the more influential it is. This method has high computational complexity and is highly time-consuming in large dynamic networks. Efforts have been made to reduce the computational complexity of closeness centrality by using the local structure of nodes. In [17], a new ranking method called Bridge Rank is proposed that calculates the local centrality of each node. The method in [18] first identifies all communities in the network and, ignoring the relationships between communities, selects a node in each community as the local critical node according to the applied centrality metric. Next, taking into account the edges between communities, a node is selected as the gateway, and the network nodes are ranked based on the sum of their shortest distances from the obtained critical nodes.

The K-Shell method assumes that nodes in the center of the network have a higher diffusion power [10]. It therefore allocates a number to each node based on its closeness to the center and then uses these numbers to rank nodes and determine their diffusion power. In other words, nodes with higher numbers are stronger in this method. K-Shell assigns the same rank to all nodes in the same shell, and nodes in a higher shell are assumed to have higher diffusion power. The Mixed Degree Decomposition (MDD) method was proposed to improve K-Shell. This method, which is based on K-Shell, takes into account the number of remaining edges kr and removed edges ks of each node [19]. The Coreness method has also been proposed to improve K-Shell; it assumes that nodes with more connections to neighbors located in the network center are much more powerful [11].

The K-Shell IF method is based on K-Shell; however, it separates nodes with the same ks by considering the iteration in which each node is removed during the K-Shell decomposition. Then, it determines the diffusion power of each node by using the neighborhood concept up to one step [20]. The Extended Weighted Degree Centrality method determines the diffusion influence of a node based on the degree of the node and of its neighbors [16]. In H-Index Centrality [21], the diffusion power of a node is calculated using a function of its neighbors’ degrees: the H-index of a node is the largest value y such that at least y of its neighbors have a degree greater than or equal to y. The Extended H-index method [22] presents a metric that uses the neighbors’ information to determine the centrality of nodes through an expansion of the H-index concept. Sheikhahmadi et al. [5] proposed the Mixed Core, Degree, and Entropy (MCDE) method. In this multi-attribute method, the diffusion power of neighbors is measured based on a combination of features including core number, degree, and level of dispersion. Entropy-based Ranking Measure (ERM) is a semi-local method based on the hypothesis that nodes with high diffusion power have neighbors with high degrees; additionally, the neighbors of these nodes possess a degree of monotonicity. ERM calculates the degree entropy of the one- and two-step neighbors of a node and then calculates the centrality of each node based on these two criteria [22].

Because the K-Shell attribute provides little information about the topological positions of nodes in the graph, an index called Hierarchical K-Shell (HKS) [23] has been proposed. This method determines a node’s topological position by extracting structural information ignored by K-Shell, then uses that information to estimate the diffusion power of each node and rank the nodes.

Namtirtha et al. [24] proposed the K-Shell degree neighborhood method, which assigns weights to graph edges using the degree and the K-Shell index of the nodes at the ends of each edge. Then, to measure the influence power of each node, they calculated the sum of the weights of all edges connected to that node. Maji [25] took a similar approach to [24]; however, instead of adjusting parameters, the method uses a measure based on the network’s average degree and K-Shell, together with a combination of the K-Shell index and degree of nodes, to weight the graph edges.

The gravity formula states that the force which two objects exert on each other is directly related to their masses and inversely related to their distance. Based on this fact, Ma et al. [26] examined a node’s effect on spreading activity. To propose a gravity measurement formula, they used the K-Shell value of a node as its mass and the shortest path distance between each pair of nodes as the distance.

Li et al. [27] proposed a gravity centrality (GC) model based on the gravity formula, which takes the degree of a node as its mass and the shortest path distance between each pair of nodes as the distance. In gravity centrality, nodes interact only on the basis of their degrees and distances, which implies that all nodes attract equally. In the real world, however, each node may have a different absorption capacity. Liu et al. [28] improved this model by considering the weight of each node in the network and introduced a new centrality measure called WGC that is more relevant to real-world networks.

Yang et al. [29] also took the location of nodes into consideration: a node in the center of the network is more likely to attract other nodes than a node on the periphery. Therefore, they proposed an improved gravity model based on the K-shell algorithm to identify influential nodes in networks. The differences in location between nodes, modeled by differences in K-shell values, are used as attraction coefficients, which adjust the attractiveness of central nodes in the networks. The proposed approach combines local and global information.

MADM methods can be used to evaluate the diffusion power of network nodes based on a variety of dimensions. Du et al. [30] used the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) method to identify influential nodes in complex networks. They chose the nodes that simultaneously have the least distance from the optimal solution and the greatest distance from the worst solution. Liu et al. [31] utilized a combination of relative entropy and TOPSIS to evaluate the diffusion power of nodes and applied their method to several real-world complex networks. Yang et al. [32] employed gray correlation analysis to determine the weights of evaluation indices and presented a dynamic weighted TOPSIS algorithm for finding nodes with high diffusion power in complex networks. Yang et al. [33] presented an integrated measurement method for identifying influential nodes in a complex network by combining the entropy weighting method with the Vlse Kriterijumska Optimizacija Kompromisno Resenje (VIKOR) method, whose Serbian name means multi-criteria optimization and compromise solution.

3. The proposed method

Fig 1 depicts the general procedure of the proposed method. The proposed method extracts important structural attributes that identify nodes from the input social network. As extracting and using all the attributes to compare and rank the influential nodes is time-consuming, a subset of more accurate features is selected. In the next step, the ÉLECTRE method is used for comparing and ranking node scores. The following section will examine each part of the proposed method in detail.

Fig 1. General procedure of identifying and ranking influential nodes.

https://doi.org/10.1371/journal.pone.0278129.g001

3.1. Input network

The input network is a two-column file in which the first column contains the source node’s number and the second column contains the destination node’s number. Table 1 shows part of the input used in the method; for example, its first row indicates a link between nodes 1 and 2.
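As an illustration of this input format, the following minimal Python sketch reads a two-column edge list into an adjacency map. The function name `read_edge_list`, the treatment of links as undirected, and the four sample rows are assumptions made for the example; they are not taken from Table 1.

```python
from collections import defaultdict


def read_edge_list(lines):
    """Build an undirected adjacency map from 'source destination' rows."""
    adj = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("#"):
            continue  # skip blank rows and comment rows
        u, v = int(parts[0]), int(parts[1])
        if u == v:
            continue  # ignore self-loops
        adj[u].add(v)  # assumption: every link is stored in both directions
        adj[v].add(u)
    return dict(adj)


# Hypothetical sample rows; a real file would be read with open(path).
sample_rows = ["1 2", "1 3", "2 3", "3 4"]
graph = read_edge_list(sample_rows)
```

A set-based adjacency map silently drops duplicate rows, which is usually convenient for edge lists exported from real datasets.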

3.2. Extracting the structural network attributes

There are several methods for calculating the diffusion power of nodes based on the network structure and the position of each node. Many of these are single-attribute methods; in other words, they calculate the diffusion power of nodes in the network using only one attribute. As pointed out earlier, such methods are effective only in some networks and lose their effectiveness when the network changes. In this section, several methods with sufficient accuracy and acceptable execution time have been selected: degree [34], K-Shell (ks) [10], Coreness [11], MDD [19], K-Shell IF [20], H-index [35], HKS [23], ERM [9], and Gravity [27]. It should be noted that, because many methods are available, this section considers only local or semi-local methods whose reported calculation time is acceptable.
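To make the simpler attributes concrete, the sketch below computes degree, K-Shell, and H-index on a small hand-made graph. It is a plain reference implementation written for readability under the usual textbook definitions, not the implementation used in this study; the function names and the toy graph are illustrative assumptions, and the remaining attributes (Coreness, MDD, K-Shell IF, HKS, ERM, and Gravity) are omitted.

```python
def degree(adj):
    """Degree of every node in an adjacency map {node: set(neighbors)}."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def k_shell(adj):
    """K-Shell index: repeatedly peel off nodes whose remaining degree is <= k."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    shell = {}
    k = 0
    while remaining:
        peel = [v for v, nbrs in remaining.items() if len(nbrs) <= k]
        if not peel:
            k += 1  # nothing left to remove at this level; move to the next shell
            continue
        for v in peel:
            for u in remaining[v]:
                remaining[u].discard(v)  # removing v lowers its neighbors' degrees
            shell[v] = k
            del remaining[v]
    return shell


def h_index(adj):
    """Largest h such that at least h neighbors have degree >= h."""
    deg = degree(adj)
    result = {}
    for v, nbrs in adj.items():
        ranked = sorted((deg[u] for u in nbrs), reverse=True)
        h = 0
        for i, d in enumerate(ranked, start=1):
            if d < i:
                break
            h = i
        result[v] = h
    return result


# Toy graph (an assumption for illustration): a triangle 1-2-3 with a pendant node 4.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
features = {v: (degree(toy)[v], k_shell(toy)[v], h_index(toy)[v]) for v in toy}
```

On this toy graph the three triangle nodes share the same K-Shell value, which is the kind of tie that motivates combining several attributes.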

3.3. Selecting the effective attribute subset in node diffusion evaluation

In this step, a number of effective features are selected from the extracted features, based on their diversity, for use in the next step. For illustration, the Zachary karate club network is shown in Fig 2.

In the following, the structural features discussed in section 3.2 are calculated for this graph, and a method for selecting the most effective subset is described. The values of the calculated characteristics for each node are shown in Table 2. Apart from the values obtained for each attribute, the diffusion power of each node is also calculated and displayed in the last column of Table 2. To evaluate the spreading power of a node, either the network must be monitored in real time or diffusion models must be employed. Since in most cases a network cannot be monitored except by its owners, researchers tend to use epidemic models to measure the diffusion power of nodes. Throughout this section, the susceptible-infected-recovered (SIR) diffusion model is used. This model estimates the diffusion power of each node by repeating the spreading process many times from that node, so that the averaged result is likely to be in keeping with reality.

Table 2. Obtained values for other structural characteristics.

https://doi.org/10.1371/journal.pone.0278129.t002
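The SIR procedure described above can be sketched as follows. The sketch assumes a discrete-time process in which each infected node tries to infect each susceptible neighbor with probability β and then recovers after one step; the text does not state the recovery rule, so that choice, the function names, the toy graph, and the default of 1000 runs are all assumptions of this example.

```python
import random


def sir_spread(adj, seed_node, beta, rng):
    """One SIR run from a single seed; returns the number of recovered nodes."""
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for u in infected:
            for v in adj[u]:
                susceptible = v not in infected and v not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(v)
        recovered |= infected  # assumption: recovery after exactly one step
        infected = newly_infected
    return len(recovered)


def diffusion_power(adj, beta, runs=1000, rng_seed=0):
    """Average outbreak size per seed node over `runs` independent simulations."""
    rng = random.Random(rng_seed)
    return {
        v: sum(sir_spread(adj, v, beta, rng) for _ in range(runs)) / runs
        for v in adj
    }


# Toy path graph 1-2-3 (illustrative only).
toy = {1: {2}, 2: {1, 3}, 3: {2}}
power = diffusion_power(toy, beta=0.3)
ranking = sorted(power, key=power.get, reverse=True)
```

Fixing the random seed makes a simulated ranking reproducible, which matters when it is later used as the reference list for correlation tests.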

In the next step, the effective subset of indices is selected according to the correlation between the list ranked by each feature and the list ranked by the SIR diffusion model. A higher correlation between these two lists indicates a more accurate attribute for determining node diffusion power. Here, Kendall’s tau correlation coefficient is applied to measure how strongly two ranking lists are correlated. Suppose (x1, y1), (x2, y2), …, (xn, yn) is a set of pairs of ranks in two separate ranking lists, X and Y. A pair (xi, yi) and (xj, yj) is considered concordant if (xi > xj and yi > yj) or (xi < xj and yi < yj), and discordant if (xi > xj and yi < yj) or (xi < xj and yi > yj). The Kendall tau value [36, 37] of the two ranking lists X and Y is then calculated as

$$\tau(X, Y) = \frac{n_c - n_d}{0.5\,n(n-1)}$$

where nc and nd are the numbers of concordant and discordant pairs in the two ranking lists, respectively, and n is the size of the ranking vector.
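The pair counting behind Kendall’s tau can be sketched directly from the definition of concordant and discordant pairs given above. The function name and the sample lists are assumptions for illustration; tied pairs are counted as neither concordant nor discordant, and the quadratic loop is meant for clarity rather than speed.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau from concordant/discordant pair counts (ties count as neither)."""
    n = len(x)
    n_c = n_d = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            n_c += 1  # both lists order the pair the same way
        elif sign < 0:
            n_d += 1  # the two lists disagree on the pair
    return (n_c - n_d) / (0.5 * n * (n - 1))


# Hypothetical scores: one attribute's values against simulated diffusion power.
attribute_scores = [4, 3, 2, 1]
sir_scores = [3.9, 3.1, 1.2, 2.0]
tau = kendall_tau(attribute_scores, sir_scores)
```

A value close to 1 means the attribute orders node pairs almost exactly as the SIR reference does, while a value close to -1 means it orders them in reverse.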

The degree of correlation between the attributes extracted from Table 2 is presented in Fig 3.

Fig 3. The degree of correlation between the list ranked by each feature and the list ranked by the SIR diffusion model.

https://doi.org/10.1371/journal.pone.0278129.g003

Values in Fig 3 demonstrate that HKS, k-shell IF, Coreness, Gravity, and ERM rank nodes more accurately than the other features. Therefore, they are taken as the effective feature subset. The reason for this selection is the high correlation between the lists ranked by these measures and the real spreading obtained from the SIR model. As additional support for this selection, Fig 4 illustrates the degree of correlation between each measure and the SIR model calculated for some of the datasets in Table 3.

Fig 4. Degree of correlation between each attribute and the SIR diffusion model.

https://doi.org/10.1371/journal.pone.0278129.g004

Fig 4 demonstrates that HKS, k-shell IF, Coreness, Gravity, and ERM structural measures produce more accurate node rankings than others.

3.4. Calculating the node diffusion power using the ÉLECTRE method

As described previously, five of the nine structural features were selected as the effective feature set: HKS, k-shell IF, Coreness, Gravity, and ERM. The simplified ÉLECTRE method is used to rank network nodes based on these attributes. The ÉLECTRE method, or approximate dominance, is a multi-criteria decision-making method.

The most significant advantage of the ÉLECTRE method over other decision support techniques is that it can be used to examine options for ordinal and more or less descriptive data. This method demonstrates the degree of dominance of one option over the others and is capable of utilizing incomplete data.

This method is implemented through the following steps:

Step One—Creating the Decision Matrix.

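The number of rows of the decision matrix is the number of nodes in the graph, and the number of columns is the number of indices extracted from the network. Therefore, for m nodes and n indices, the decision matrix is created according to Eq 1.

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix} \tag{1}$$

where xij is the value of the j-th index for the i-th node.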

Step Two–Normalizing the Decision Matrix.

Due to the differences in dimensions between various centrality indices, the values for different measures will be normalized in this step. Normalization is done according to Eq 2.

(2)

Step Three—Determining the criteria Weight Matrix.

This step determines the weight (importance coefficient) vector of the criteria. Different methods, such as AHP and Shannon entropy, can be used to determine the attribute weights. In this study, Shannon’s entropy method is employed.

Step Four—Determining the Normalized Weighted Decision Matrix.

The weighted decision matrix is obtained by multiplying the normalized (scale-free) decision matrix by the criteria weights.
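Steps two to four can be sketched as below. Two points are assumptions rather than details confirmed by the text: the normalization uses the Euclidean (vector) norm of each column, which is a common choice in ÉLECTRE, and the Shannon entropy weights follow the usual formulation in which a column that varies more across nodes receives a larger weight. All names and the sample matrix are illustrative.

```python
import math


def vector_normalize(X):
    """Divide each column by its Euclidean norm (assumed normalization)."""
    norms = [math.sqrt(sum(x * x for x in col)) or 1.0 for col in zip(*X)]
    return [[x / norms[j] for j, x in enumerate(row)] for row in X]


def entropy_weights(X):
    """Shannon-entropy weights for non-negative columns; requires >= 2 rows."""
    m = len(X)
    diversities = []
    for col in zip(*X):
        total = sum(col)
        probs = [x / total for x in col if x > 0]
        entropy = -sum(p * math.log(p) for p in probs) / math.log(m)
        diversities.append(max(0.0, 1.0 - entropy))  # clamp rounding noise
    s = sum(diversities)
    if s == 0:  # every column is uniform: fall back to equal weights
        return [1.0 / len(diversities)] * len(diversities)
    return [d / s for d in diversities]


def weighted_matrix(X):
    """Return the weighted normalized matrix V and the weight vector w."""
    R = vector_normalize(X)
    w = entropy_weights(X)
    V = [[w[j] * r for j, r in enumerate(row)] for row in R]
    return V, w


# Hypothetical decision matrix: 3 nodes (rows) x 2 attributes (columns).
X = [[1.0, 10.0], [1.0, 20.0], [1.0, 30.0]]
V, w = weighted_matrix(X)
```

In this example the first attribute is identical for all nodes, so it carries no discriminating information and receives a weight close to zero.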

Step Five—Forming a set of concordant and discordant criteria.

The attribute sets are divided into concordant and discordant subsets for each pair of nodes, k and e. The concordant set (Ske) is a set of attributes that prefer node k to node e with the discordant set (Dke) as its complementary set. The concordant set for positive and negative measures, respectively, is given by Eq 3.

(3)

The discordant set for positive and negative attributes is defined by Eq 4.

(4)

Step Six—Creating the Concordant Matrix.

The concordant matrix is a square matrix whose dimension is the number of options, that is, graph nodes. Each element of this matrix is the concordant attribute between two nodes, whose value is the sum of the weights of the criteria in the concordant set. In other words, calculating the concordant attribute Cke requires comparing nodes k and e and adding the weights of the attributes on which k is preferred to e. In mathematical terms, the concordant attribute is calculated using Eq 5.

$$C_{ke} = \sum_{j \in S_{ke}} w_j \tag{5}$$

where wj is the weight of the j-th attribute determined in Step Three.

The concordant attribute indicates the superiority of node k over node e, and its value ranges from zero to one.

Step Seven—Determining the Discordant Matrix.

The discordant matrix is a square matrix whose dimension is the number of nodes in the graph. Each element in this matrix is referred to as the discordant index between the two nodes. The value of this index can be calculated using Eq 6.

(6)
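Steps five to seven can be illustrated with the sketch below, which follows the standard ÉLECTRE formulation rather than the simplified variant used in this study. It assumes that every attribute is positive (larger is better), that ties are placed in the concordant set, and that the discordant index is the largest weighted difference over the discordant set divided by the largest weighted difference over all attributes. The matrix `V`, the weights `w`, and all names are hypothetical.

```python
def concordance_discordance(V, w):
    """Concordant and discordant matrices for a weighted normalized matrix V."""
    m, n = len(V), len(w)
    C = [[0.0] * m for _ in range(m)]
    D = [[0.0] * m for _ in range(m)]
    for k in range(m):
        for e in range(m):
            if k == e:
                continue
            diffs = [V[k][j] - V[e][j] for j in range(n)]
            # Concordant set: attributes on which node k is at least as good as e.
            C[k][e] = sum(w[j] for j in range(n) if diffs[j] >= 0)
            # Discordant set: attributes on which node k is strictly worse than e.
            worst = max((-d for d in diffs if d < 0), default=0.0)
            largest = max(abs(d) for d in diffs)
            D[k][e] = worst / largest if largest > 0 else 0.0
    return C, D


# Hypothetical weighted matrix for three nodes and two attributes.
V = [[0.3, 0.2], [0.1, 0.4], [0.1, 0.1]]
w = [0.6, 0.4]
C, D = concordance_discordance(V, w)
```

The double loop over node pairs is quadratic in the number of nodes, which illustrates why a cheaper variant is attractive for large networks.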

Step Eight—Creating the Concordant Dominance Matrix.

Step six showed how to calculate the concordant attribute (Cke). This step determines a threshold value for the concordant attribute, known as the concordant threshold and denoted by $\bar{C}$. This concordant threshold is obtained by averaging all concordant attributes (the concordant matrix elements). In mathematical terms, the concordant threshold is calculated according to Eq 7.

(7)

The concordant dominance matrix (F) is created based on the value of the concordant threshold $\bar{C}$. If Cke is larger than $\bar{C}$, the superiority of node k over node e is acceptable; otherwise, node k has no superiority over node e. Therefore, the concordant dominance matrix elements are determined according to Eq 8.

(8)

Step Nine—Creating The Discordant Dominance Matrix.

The discordant dominance matrix (G) is created similarly to the concordant dominance matrix. It starts by calculating the discordant threshold ($\bar{d}$) as the average of all discordant attributes (the discordant matrix elements). In mathematical terms, the discordant threshold value is calculated using Eq 9.

(9)

Lower values of the discordant attribute dke are better because the discordant attribute measures the evidence against the superiority of node k over node e. If dke is larger than the discordant threshold $\bar{d}$, then the discordant value is too high to be ignored. Therefore, the elements of the discordant dominance matrix G are given by Eq 10.

(10)

Each member of matrix G determines the dominance relationship between nodes.

Step Ten—Creating the Final Dominance Matrix.

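The final dominance matrix (H) is obtained according to Eq 11 by multiplying each element of the concordant dominance matrix (F) by the corresponding element of the discordant dominance matrix (G).

$$h_{ke} = f_{ke} \cdot g_{ke} \tag{11}$$

where fke and gke are the elements of F and G, respectively.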

Step Eleven—Selecting the Best Option.

The final dominance matrix (H) expresses the partial preferences of nodes. For instance, if hke is one, the superiority of node k over node e is acceptable in both the concordant and discordant states (its superiority is larger than the concordant threshold, and its inferiority, or lack of concordance, is less than the discordant threshold). However, node k may itself still be dominated by other nodes. The nodes can therefore be ranked by how often each one dominates others and how often it is defeated. The sum of row k of the H matrix is the number of nodes that node k dominates, the sum of column k is the number of defeats of node k, and the rank value assigned to each node is derived from the difference between these two values. A positive number indicates that the node dominates more nodes than it is defeated by, while a negative number means that its defeats are more numerous.
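Steps eight to eleven can be sketched as follows, starting from concordant and discordant matrices that are already available. The sketch assumes that both thresholds are averages over the off-diagonal elements, that a value exactly equal to a threshold passes the test, and that the final score of a node is its row sum minus its column sum in H. The input matrices are hypothetical and the names are illustrative.

```python
def electre_net_scores(C, D):
    """Net dominance score per node from concordant (C) and discordant (D) matrices."""
    m = len(C)
    pairs = [(k, e) for k in range(m) for e in range(m) if k != e]
    c_bar = sum(C[k][e] for k, e in pairs) / len(pairs)  # concordant threshold
    d_bar = sum(D[k][e] for k, e in pairs) / len(pairs)  # discordant threshold

    H = [[0] * m for _ in range(m)]
    for k, e in pairs:
        f_ke = 1 if C[k][e] >= c_bar else 0  # concordant dominance element
        g_ke = 1 if D[k][e] <= d_bar else 0  # discordant dominance element
        H[k][e] = f_ke * g_ke                # final dominance element

    wins = [sum(H[k]) for k in range(m)]                          # row sums
    defeats = [sum(H[k][e] for k in range(m)) for e in range(m)]  # column sums
    return [wins[i] - defeats[i] for i in range(m)]


# Hypothetical matrices for three nodes (diagonal entries are unused).
C = [[0.0, 0.75, 1.0], [0.25, 0.0, 1.0], [0.0, 0.5, 0.0]]
D = [[0.0, 0.5, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
scores = electre_net_scores(C, D)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Sorting by the net score gives the final ordering; nodes with equal scores remain tied, which is what the monotonicity criterion in section 4.1 measures.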

4. Evaluation

To evaluate the proposed method, it and the other compared methods were implemented in Python 3.8 and run on a system with a Core i7 2.3 GHz processor and 16 GB of memory. For this evaluation, 12 real-world datasets are used, whose characteristics are listed in Table 3. The features for each dataset presented in Table 3 are, from left to right, as follows: the network name, the number of nodes, the number of edges, the highest network node degree, the average degree, and assortativity [26].

4.1. Evaluation criteria

The proposed method has been compared with other methods based on criteria used in other papers. These criteria are as follows:

  • Comparing the Node Diffusion Power Obtained Using Different Methods with Their Real Diffusion Power: This study uses the SIR diffusion model [38, 39] to calculate the real node diffusion power. This model is chosen because of its widespread application in papers published in recent years [40]. It simulates the message diffusion process in the real world and determines the real diffusion power of each node through many iterations per node. Then, to evaluate the veracity and accuracy of the proposed algorithms, the ranking list produced by each algorithm is compared with the ranking list calculated with the help of the diffusion model. A high correlation between these two lists indicates that the algorithm determines and ranks node diffusion power accurately. This study uses Kendall’s Tau [41] correlation coefficient to analyze the proposed algorithm’s accuracy and its correlation with the real ranking list. Given that the top-ranking nodes are more important than the low-ranking ones in these lists, a portion of the tests is reserved for examining the veracity of the higher ranks in the list. For this purpose, the similarity between the top c elements of the list R ranked by each method and the top c elements of the real ranking list σ is calculated using the Jaccard similarity coefficient [42]. This coefficient for the first c elements of lists X and Y is calculated using Eq 12.

$$J_c(X, Y) = \frac{|X(c) \cap Y(c)|}{|X(c) \cup Y(c)|} \tag{12}$$

where X(c) and Y(c) are the sets of the top c elements of lists X and Y, respectively.

  • Allocating Distinct Ranks to Nodes with Different Diffusion Effects: according to this criterion, a method is better if it assigns fewer nodes to each rank. To assess the resolution of the ranking, the monotonicity parameter (M) is employed, which is defined according to Eq 13.
(13)

where N is the number of distinct ranks in list R and nr is the number of nodes with the same rank r in the list. The value of M is zero if all nodes have the same rank and one if every node has a distinct rank. In addition, to examine the running-time performance of the algorithms, each algorithm is executed 100 times on different networks, and the average execution time of the proposed method is compared with that of the other methods.
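The two list-based criteria above can be sketched as follows. The Jaccard function compares the top c elements of two ranked lists. The monotonicity function uses a commonly cited form of the index, M = [1 − Σr nr(nr − 1) / (n(n − 1))]², in which n is taken to be the total number of ranked nodes; that reading of the normalizing term is an assumption of this example, chosen because it reproduces the stated boundary values of zero and one. Function names and sample data are illustrative.

```python
from collections import Counter


def jaccard_top_c(ranked_x, ranked_y, c):
    """Jaccard similarity between the top-c elements of two ranked lists."""
    top_x, top_y = set(ranked_x[:c]), set(ranked_y[:c])
    return len(top_x & top_y) / len(top_x | top_y)


def monotonicity(scores):
    """Ranking resolution: 1 if all scores are distinct, 0 if all are equal."""
    n = len(scores)  # assumption: n is the total number of ranked nodes
    ties = Counter(scores)
    tied_pairs = sum(n_r * (n_r - 1) for n_r in ties.values())
    return (1 - tied_pairs / (n * (n - 1))) ** 2


# Hypothetical ranked node lists and per-node scores.
method_ranking = [7, 3, 9, 1, 4]
sir_ranking = [3, 7, 1, 9, 4]
overlap = jaccard_top_c(method_ranking, sir_ranking, c=3)
resolution = monotonicity([0.9, 0.9, 0.5, 0.2])
```

Evaluating `jaccard_top_c` for a range of c values gives a curve of top-rank agreement against list depth.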

4.2. Test results

This section presents the results of the tests conducted on the proposed method in comparison with other methods. The methods are compared first by ranking accuracy and then by the resolution of node ranking.

4.2.1. Method accuracy in ranking nodes.

To determine the accuracy of the methods, the ranking list produced by each method is compared with the ranking of influential nodes obtained from the SIR model. The SIR model determines the real diffusion power of all nodes through many iterations, and the ranking list σ is obtained on that basis. Given the stochastic nature of the process, and in order to bring the results closer to reality, the SIR model is repeated $10^3$ times for each node vi in the graph, and the average number of recovered nodes is taken as the diffusion power of node vi.

The Kendall tau correlation coefficient has been employed to determine the degree of correlation between the ranking list obtained from each method and the ranking list σ [43]. Table 4 reports the Kendall tau correlation coefficient between the list ranked by each method and the list ranked by the SIR model. Each row in this table gives the values for one network. Notably, higher values indicate a greater similarity between the obtained ranking and reality.

Table 4. The correlation coefficient between the ranked lists using each method and the ranked list using the SIR model.

https://doi.org/10.1371/journal.pone.0278129.t004

The results in Table 4 show that the proposed method has a higher ranking accuracy than the others on most datasets; the exceptions are Netscience and Elegans, where its performance is still close to that of the top method. Different networks have diverse structural attributes, and a single attribute performs well only in some networks; using diverse structural attributes in the proposed method therefore remarkably increases accuracy across networks. In other words, unlike for other methods, changing the network structure has no significant effect on the accuracy obtained by the proposed method.

The infection rate is an influential parameter in the SIR model; therefore, the effect of the infection rate β on the proposed method’s accuracy is analyzed next, and the results are presented in Fig 5. Because of the large number of datasets used, variations of this parameter are analyzed only on the Dolphins, Netscience, and PowerGrid datasets.

Fig 5. Parameter change effects on the proposed method’s accuracy.

A. Dolphins network; B. Netscience network; C. Power Grid network.

https://doi.org/10.1371/journal.pone.0278129.g005

Increasing β raises the infection rate of nodes, so the spreading process reaches nodes farther away. The proposed method has a higher correlation than the others because it combines multiple attributes through the ÉLECTRE method to determine node diffusion power; it therefore retains a higher correlation as β increases and as the network changes. In the next test, the validity of the top c ranks of the ranking lists obtained from the different methods is examined using the Jaccard similarity coefficient. The results of this test on the Netscience, Elegans, and PowerGrid networks are illustrated in Fig 6. In this test, the similarity between the top c ranks of the ranking list σ and those of the lists produced by the various methods is examined for varying c. As shown in Fig 6, the proposed method has higher validity and accuracy in the top ranks than other similar methods.
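
The top-c comparison can be sketched as follows. The Jaccard similarity coefficient of two sets is the size of their intersection divided by the size of their union; the function name `top_c_jaccard` and the list-based ranking inputs are assumptions made for this example.

```python
def top_c_jaccard(ranking_a, ranking_b, c):
    """Jaccard similarity between the top-c nodes of two ranking lists."""
    if c < 1:
        raise ValueError("c must be at least 1")
    top_a = set(ranking_a[:c])
    top_b = set(ranking_b[:c])
    return len(top_a & top_b) / len(top_a | top_b)


if __name__ == "__main__":
    # Hypothetical rankings: the SIR reference list and the list from one method.
    sigma = [1, 2, 3, 4, 5]
    method = [2, 1, 5, 4, 3]
    for c in range(1, len(sigma) + 1):
        print(c, top_c_jaccard(sigma, method, c))
```

Because only set membership matters, the coefficient ignores the order of nodes within the top c, which is why it complements the rank correlation rather than replacing it.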

Fig 6. Accuracy of the proposed method in assigning the top c ranks compared to different methods.

a. NetScience; b. PowerGrid; c. Elegans.

https://doi.org/10.1371/journal.pone.0278129.g006

The goal of most methods for measuring the diffusion power of nodes is to select influential nodes from the top of the list for further applications such as viral marketing, controlling outbreaks, and spreading innovations. The proposed method increases the accuracy of ranking nodes, especially the top nodes of the list, first by selecting high-quality attributes and then by combining them optimally.

4.2.2. Method separability value in ranking nodes.

Distinct rank allocation is another criterion for comparing methods that evaluate node diffusion power; in other words, a ranking method is preferable if it assigns fewer nodes to each rank. A method that allocates every rank to a single node is therefore ideal under this criterion. The tests use the monotonicity parameter (M) [43] to analyze how well each method distinguishes and separates nodes in its ranking.
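
A short sketch of this measure is given below. It assumes the definition of monotonicity commonly used in the node-ranking literature, M(R) = (1 − Σ_r n_r(n_r − 1) / (n(n − 1)))^2, where n is the number of nodes and n_r is the number of nodes sharing rank r; this section does not restate the formula, so the definition, the function name `monotonicity`, and the score-dictionary input should all be read as assumptions.

```python
from collections import Counter


def monotonicity(scores):
    """Monotonicity M of a ranking given as a node -> score dictionary.

    M is 1.0 when every node has a distinct score and 0.0 when all nodes tie.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two nodes are required")
    # Nodes with equal scores share a rank; count how many nodes fall in each rank.
    rank_sizes = Counter(scores.values())
    tied = sum(size * (size - 1) for size in rank_sizes.values())
    return (1 - tied / (n * (n - 1))) ** 2


if __name__ == "__main__":
    # Hypothetical scores in which nodes "a" and "b" share a rank.
    print(monotonicity({"a": 1, "b": 1, "c": 2, "d": 3}))
```

Under this definition, a single tied pair among four nodes already lowers M noticeably, which reflects how strongly the measure penalizes shared ranks in small networks.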

The monotonicity (M) of each ranking method executed on various datasets is shown in Table 5.

Table 5. Monotonicity of methods in assigning distinct ranks to nodes.

https://doi.org/10.1371/journal.pone.0278129.t005

The results in Table 5 show that the proposed method performs well on most datasets. Its quality in allocating distinct ranks increases because it distinguishes nodes based on their local position and neighboring structure; lack of attention to these features reduces the accuracy of other methods, which place such nodes in the same rank. On several datasets, the separability values of the proposed method were similar to, or only slightly different from, those of the EW and MCDE methods.

The next test determines whether the proposed method is time-efficient. Fig 7 illustrates the execution time of the different methods on different networks, averaged over 100 runs. Based on the results of this experiment, the proposed method maintains acceptable time efficiency as network size changes, despite combining several indices. The main reason for this is the selection of local and semi-local indicators with linear time complexity.

Fig 7. Average execution time of the proposed method in comparison with other methods in different networks.

https://doi.org/10.1371/journal.pone.0278129.g007

5. Conclusion

This paper presented a method based on the simplified ÉLECTRE method to compare and rank nodes using several indices. Both calculation time and accuracy were considered when selecting the structural indices used to compare nodes. The selected indices therefore have linear calculation time and can be extracted from large-scale networks with adequate speed. Because some indices are highly correlated with each other and some have lower accuracy, a subset of the extracted indices was selected for the proposed method, and Shannon’s entropy was used to determine the weight of each index. Results obtained under various parameters indicated that the proposed method assigns distinct rankings to nodes, such that two nodes rarely receive the same rank. It was also observed that the proposed method ranks nodes more accurately as the infection rate increases. In addition, the method performed well in ranking highly influential nodes. Given the power-law distribution of node degrees in complex networks, the computation speed of the proposed method can be increased considerably by removing low-degree nodes, which generally have low diffusion power. This paper uses only structural features extracted from unweighted and undirected networks to present a multi-index method. To apply it to weighted and directed networks, features related to centrality indices in these networks can be extracted and used.

References

  1. Wang F., et al. Influential node identification by aggregating local structure information. Physica A: Statistical Mechanics and its Applications, 2022. 593: p. 126885.
  2. Zareie A., Sheikhahmadi A., and Sakellariou R. A composite centrality measure for improved identification of influential users. arXiv preprint arXiv:2111.04529, 2021.
  3. Zareie A., Sheikhahmadi A., and Jalili M. Identification of influential users in social network using gray wolf optimization algorithm. Expert Systems with Applications, 2020. 142: p. 112971.
  4. Mochalova A. and Nanopoulos A. A targeted approach to viral marketing. Electronic Commerce Research and Applications, 2014. 13(4): p. 283–294.
  5. Sheikhahmadi A. and Nematbakhsh M.A. Identification of multi-spreader users in social networks for viral marketing. Journal of Information Science, 2017. 43(3): p. 412–423.
  6. Sheikhahmadi A. and Zareie A. Identifying influential spreaders using multi-objective artificial bee colony optimization. Applied Soft Computing, 2020. 94: p. 106436.
  7. Guo L., et al. Identifying multiple influential spreaders in term of the distance-based coloring. Physics Letters A, 2016. 380(7–8): p. 837–842.
  8. Chen Y.-C. A novel algorithm for mining opinion leaders in social networks. World Wide Web, 2019. 22(3): p. 1279–1295.
  9. Zareie A., Sheikhahmadi A., and Fatemi A. Influential nodes ranking in complex networks: An entropy-based approach. Chaos, Solitons & Fractals, 2017. 104: p. 485–494.
  10. Kitsak M., et al. Identification of influential spreaders in complex networks. Nature Physics, 2010. 6(11): p. 888–893.
  11. Bae J. and Kim S. Identifying and ranking influential spreaders in complex networks by neighborhood coreness. Physica A: Statistical Mechanics and its Applications, 2014. 395: p. 549–559.
  12. Sheikhahmadi A., Nematbakhsh M.A., and Shokrollahi A. Improving detection of influential nodes in complex networks. Physica A: Statistical Mechanics and its Applications, 2015. 436: p. 833–845.
  13. Hu J., et al. A modified weighted TOPSIS to identify influential nodes in complex networks. Physica A: Statistical Mechanics and its Applications, 2016. 444: p. 73–85.
  14. Wang X., et al. Effective identification of multiple influential spreaders by DegreePunishment. Physica A: Statistical Mechanics and its Applications, 2016. 461: p. 238–247.
  15. Li Q., et al. Identifying influential spreaders by weighted LeaderRank. Physica A: Statistical Mechanics and its Applications, 2014. 404: p. 47–55.
  16. Liu Y., et al. Identifying influential spreaders by weight degree centrality in complex networks. Chaos, Solitons & Fractals, 2016. 86: p. 1–7.
  17. Salavati C., Abdollahpouri A., and Manbari Z. BridgeRank: A novel fast centrality measure based on local structure of the network. Physica A: Statistical Mechanics and its Applications, 2018. 496: p. 635–653.
  18. Salavati C., Abdollahpouri A., and Manbari Z. Ranking nodes in complex networks based on local structure and improving closeness centrality. Neurocomputing, 2019. 336: p. 36–45.
  19. Zeng A. and Zhang C.-J. Ranking spreaders by decomposing complex networks. Physics Letters A, 2013. 377(14): p. 1031–1035.
  20. Wang Z., et al. Fast ranking influential nodes in complex networks using a k-shell iteration factor. Physica A: Statistical Mechanics and its Applications, 2016. 461: p. 171–181.
  21. Lü L., et al. The H-index of a network node and its relation to degree and coreness. Nature Communications, 2016. 7(1): p. 1–7.
  22. Zareie A. and Sheikhahmadi A. EHC: Extended H-index centrality measure for identification of users’ spreading influence in complex networks. Physica A: Statistical Mechanics and its Applications, 2019. 514: p. 141–155.
  23. Zareie A. and Sheikhahmadi A. A hierarchical approach for influential node ranking in complex social networks. Expert Systems with Applications, 2018. 93: p. 200–211.
  24. Namtirtha A., Dutta A., and Dutta B. Weighted kshell degree neighborhood: A new method for identifying the influential spreaders from a variety of complex network connectivity structures. Expert Systems with Applications, 2020. 139: p. 112859.
  25. Maji G. Influential spreaders identification in complex networks with potential edge weight based k-shell degree neighborhood method. Journal of Computational Science, 2020. 39: p. 101055.
  26. Ma L.-l., et al. Identifying influential spreaders in complex networks based on gravity formula. Physica A: Statistical Mechanics and its Applications, 2016. 451: p. 205–212.
  27. Li Z., et al. Identifying influential spreaders by gravity model. Scientific Reports, 2019. 9(1): p. 1–7.
  28. Liu F., Wang Z., and Deng Y. GMM: A generalized mechanics model for identifying the importance of nodes in complex networks. Knowledge-Based Systems, 2020. 193: p. 105464.
  29. Yang X. and Xiao F. An improved gravity model to identify influential nodes in complex networks based on k-shell method. Knowledge-Based Systems, 2021. 227: p. 107198.
  30. Du Y., et al. A new method of identifying influential nodes in complex networks based on TOPSIS. Physica A: Statistical Mechanics and its Applications, 2014. 399: p. 57–69.
  31. Liu Z., et al. The node importance in actual complex networks based on a multi-attribute ranking method. Knowledge-Based Systems, 2015. 84: p. 56–66.
  32. Yang P., Liu X., and Xu G. A dynamic weighted TOPSIS method for identifying influential nodes in complex networks. Modern Physics Letters B, 2018. 32(19): p. 1850216.
  33. Yang Y., et al. A novel method to evaluate node importance in complex networks. Physica A: Statistical Mechanics and its Applications, 2019. 526: p. 121118.
  34. Easley D. and Kleinberg J. Networks, Crowds, and Markets. 2010.
  35. Billah S.M. Identifying Emerging Researchers using Social Network Analysis. 2013: University of Arkansas.
  36. Kendall M. The treatment of ties in ranking problems. Biometrika, 1945. 33(3): p. 239–251.
  37. Knight W. A computer method for calculating Kendall’s tau with ungrouped data. Journal of the American Statistical Association, 1966. 61(314): p. 436–439.
  38. Huang C.-Y., et al. A computer virus spreading model based on resource limitations and interaction costs. Journal of Systems and Software, 2013. 86(3): p. 801–808.
  39. Pastor-Satorras R. and Vespignani A. Epidemic dynamics and endemic states in complex networks. Physical Review E, 2001. 63(6): p. 066117.
  40. Zhang H., et al. Recent advances in information diffusion and influence maximization in complex social networks. Opportunistic Mobile Social Networks, 2014. 37(1.1): p. 37.
  41. Newson R. Parameters behind “nonparametric” statistics: Kendall’s tau, Somers’ D and median differences. The Stata Journal, 2002. 2(1): p. 45–64.
  42. Hébert-Dufresne L., et al. Global efficiency of local immunization on complex networks. Scientific Reports, 2013. 3(1): p. 1–8.
  43. Boccaletti S., et al. Complex networks: Structure and dynamics. Physics Reports, 2006. 424(4–5): p. 175–308.