A degree-based block model and a local expansion optimization algorithm for anti-community detection in networks

Jiajing Zhu; Yongguo Liu; Changhong Yang; Wen Yang; Zhi Chen; Yun Zhang; Shangming Yang; Xindong Wu

doi:10.1371/journal.pone.0195226

Abstract

Anti-community detection in networks can discover negative relations among objects. However, few studies address the detection of anti-community structure; existing methods do not consider node degree, and most of them incur high computational cost. Block models are promising methods for exploring modular regularities, but their results depend heavily on the observed structure. In this paper, we first propose a Degree-based Block Model (DBM) for anti-community structure. DBM takes node degree into consideration and yields a new objective function Q(C) for evaluation. We then propose a Local Expansion Optimization Algorithm (LEOA), which preferentially considers high-degree nodes, for anti-community detection. LEOA consists of three stages: structural center detection, local anti-community expansion and group membership adjustment. Based on the formulation of DBM, we develop a synthetic benchmark, DBM-Net, for evaluating how well algorithms detect known anti-community structures. Experiments on DBM-Net with up to 100000 nodes and on 17 real-world networks demonstrate the effectiveness and efficiency of LEOA for anti-community detection in networks.

Citation: Zhu J, Liu Y, Yang C, Yang W, Chen Z, Zhang Y, et al. (2018) A degree-based block model and a local expansion optimization algorithm for anti-community detection in networks. PLoS ONE 13(4): e0195226. https://doi.org/10.1371/journal.pone.0195226

Editor: Sebastián Gonçalves, Universidade Federal do Rio Grande do Sul, BRAZIL

Received: September 17, 2017; Accepted: February 24, 2018; Published: April 18, 2018

Copyright: © 2018 Zhu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data are available from Harvard Dataverse (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/O1R1QG) with DOI 10.7910/DVN/O1R1QG. All other relevant data are within the paper and its Supporting Information files.

Funding: This research was supported in part by the National Science and Technology Major Project of the Ministry of Science and Technology of China under grant 2018ZX10715003-002 (WY), the National Key Research and Development Program of China under grant 2017YFC1703900 (SY), the Sichuan Science and Technology Program under grant 2018PTDJ0084 (YL), and the US National Science Foundation (NSF) under grant 1652107 (XW). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Recent research on complex networks has significantly advanced our understanding of complex systems [1–3]. Nodes in a network represent objects, while edges represent the relationships between them. One of the most important characteristics of complex networks is community structure, i.e. assortative structure [4–6], where nodes share most of their connections inside the groups they belong to. Detecting community structure can reveal the organizational and functional characteristics of the underlying systems [7–11]. In this paper, we focus on another important structure of complex networks, called anti-community structure, i.e. disassortative structure [12], where nodes have few or no connections with each other inside their group but share most of their connections with the rest of the network, as shown in Fig 1. Many real-world networks exhibit anti-community structure [13], such as sexually transmitted disease networks, book selling networks, and divorce networks. Detecting anti-community structure in networks can help reveal interesting relations, such as non-cooperative, competitive, and even hostile relations among individuals, corporations, or countries. For example, the Karate network describes the friendship relations between 34 members of a karate club at an American university in the 1970s, which split into two communities due to a disagreement between the administrator and the instructor [14]. Detecting anti-community structure in Karate divides the members into several groups with few or no friendship relations inside. In each group, negative relations among the members can be explored, such as the disagreement between the administrator and the instructor.

Fig 1. An example of anti-community structure.

https://doi.org/10.1371/journal.pone.0195226.g001

Several anti-community detection methods have been developed in the past few years. These methods attempt to explore anti-community structure in networks from different perspectives. Traditional methods divide a network into two groups to find the largest bipartite structure, a problem that is similar but not equivalent to searching for the maximum cut in networks [15–17]. Spectral methods detect anti-community structure by using the negative eigenvalues and eigenvectors of the modularity matrix [12, 18]. Label propagation algorithms spread the labels of nodes to their non-neighbors to explore multipartite structure in networks [13]. A multipartite structure consists of several groups without internal edges, which is a special case of anti-community structure. Recently, several block models have been proposed for exploring structural regularities in networks [19–26]. These models regard the network structure as observed quantities and the group membership of nodes as hidden quantities. The structural regularities can be inferred from the group membership, and the group membership of nodes can be inferred by fitting the models to the observed structure using maximum likelihood methods such as the expectation-maximization (EM) algorithm [27].

However, the above studies suffer from some limitations. First, there is no universally accepted definition of anti-community and no widely accepted objective function for evaluation. Second, the existing works [12–13, 15–17, 18–27] do not consider the impact of node degree, leading to poor performance, especially on real-world networks. Third, the efficiency of these methods is comparatively low due to the cost of computing the eigenvalues and eigenvectors of the modularity matrix in spectral methods and the repeated iterations of the EM algorithm in block models. In addition, the results provided by block models are highly dependent on the observed structure of a network. For example, block models cannot identify the disassortative structure in Karate, because the observed structure in Karate is assortative and these methods are incapable of exploring a structure that is inconsistent with the observed one. Meanwhile, the EM algorithm of block models must be run several times with different initial parameter values to avoid convergence to local optima and to find the quantities that best fit the observed structure, which also leads to high computational cost on large networks.

In this paper, we first introduce a definition of anti-community. We then propose a Degree-based Block Model (DBM) for anti-community structure, which takes node degree into consideration and yields an objective function Q(C) for evaluating anti-community structure. Because nodes with high degree have a greater impact on Q(C) than those with low degree, we propose a Local Expansion Optimization Algorithm (LEOA), which preferentially considers high-degree nodes, for anti-community detection. In LEOA, we first detect structural centers by node influence. Then, LEOA expands each structural center into an anti-community by a local search method. Finally, we adjust the group membership of nodes by maximizing Q(C) so as to detect a better anti-community structure. Inspired by the formulation of DBM, a new synthetic benchmark, DBM-Net, is developed for testing how well algorithms detect known anti-community structure. Experimental results on DBM-Net with up to 100000 nodes and on 17 real-world networks demonstrate the effectiveness and efficiency of LEOA for exploring anti-community structure in networks.

The remainder of this paper is organized as follows. We present the related works about anti-community detection in Section 2. Section 3 introduces the definition of anti-community, the formulation of DBM model and the details of LEOA algorithm. The experimental results are described in Section 4. Section 5 gives the conclusions.

Related works

Some approaches have been proposed for anti-community detection in networks. When a network consists of two anti-communities, the problem is to find the largest bipartite subgraph in a given network. The detection of bipartite or approximately bipartite structure has attracted attention in the recent literature [15–17]. Searching for the max-cut is an approximate method for solving this problem. Trevisan [15] proposed an approximation algorithm for max-cut based on the smallest eigenvalue, with an approximation ratio of 0.531. Alon and Sudakov [16] obtained two results on the relation between the smallest eigenvalue of the adjacency matrix of a graph and its bipartite subgraphs. The first result is that the smallest eigenvalue μ of the adjacency matrix of any non-bipartite graph with n nodes, diameter L and maximum degree d_max satisfies μ≥−d_max+1/((L+1)n). The second result characterizes the approximation guarantee of the max-cut algorithm [28] for graphs G = (V,E) in which the size of the max-cut is αm, where m = |E| and α∈[0.845,1]. Newman [12] used the least negative eigenvalue of the modularity matrix for bipartite structure detection in networks. By applying the proposed algorithm to the co-occurrence network of nouns and adjectives in the novel David Copperfield, the author found that the obtained partition is approximately bipartite, where one group is composed almost entirely of adjectives and the other of nouns. In addition to the algorithms for bipartite networks, a label propagation algorithm, LPAD, was proposed by Chen et al. [13] for detecting partitions with more than two anti-communities. LPAD defines the compatibility relationship and the update rules of labels among nodes, which avoids oscillation in label propagation. The experimental results show that LPAD can detect bipartite and simple multipartite structure in networks, but its results are affected by the order of label propagation.

Block models are promising methods for exploring modular regularities in networks [19–26]. However, most of these models focus on the detection of community structure, and only two studies can discover disassortative structure [23, 26]. Newman and Leicht [23] proposed a mixture model for exploring broad types of structure in networks. This model assumes that the nodes in the same group have similar connection preferences. Because this model only considers the relationship between groups and nodes, it may generate results that mix several types of structure, such as assortative, disassortative, hierarchical and core-periphery structure. Shen et al. [26] modified this model and proposed the general stochastic block model (GSBM) to detect intrinsic structural regularities of networks. By utilizing a block matrix to indicate the relationship among groups, GSBM can output the types of the identified structural regularities.

In this paper, we propose a Local Expansion Optimization Algorithm (LEOA) for anti-community detection in networks that preferentially considers the nodes with high degree, which improves its effectiveness on synthetic and real-world networks. By first detecting structural centers, then expanding structural centers into anti-communities, and finally adjusting the group membership of nodes, LEOA achieves good performance and overcomes the shortcomings of the existing algorithms, such as poor performance on real-world networks, high computational cost, and strong dependence on the observed structure.

Methods

Anti-community

Generally, an anti-community can be defined as a group of nodes with most of their connections outside and few or no connections inside. Inspired by the definition of community proposed by Radicchi et al. [29], we provide a quantitative description for anti-community in this subsection.

Consider an undirected and unweighted graph G = (V,E), where V is the set of n nodes and E = {(v_i,v_j)|v_i,v_j∈V} is the set of m edges. G can be represented by an adjacency matrix A such that a_ij = 1 if there is an edge between node v_i and node v_j, and a_ij = 0 otherwise. Consider a group c_r⊆V to which node v_i belongs. The degree of node v_i can be written as $d_i = \sum_{s=1}^{K} m_i(s)$ (1) where K is the number of groups and m_i(s) is the number of edges connecting node v_i to the nodes in group c_s, $m_i(s) = \sum_{v_j \in c_s} a_{ij}$ (2) Thus, group c_r is an anti-community if it satisfies the following constraint $\lambda \sum_{v_i \in c_r} m_i(r) \le \sum_{v_i \in c_r} m_i(s),\ s \ne r$ (3) where $\sum_{v_i \in c_r} m_i(r)$ is twice the number of edges inside group c_r, and $\sum_{v_i \in c_r} m_i(s)$ is the number of edges connecting the nodes in group c_r and the nodes in group c_s (s≠r). Eq (3) is regulated by the factor λ (λ ≥ 1). Given the value of $\sum_{v_i \in c_r} m_i(s)$, the larger the factor λ, the fewer the edges inside group c_r, and the better the anti-community c_r. And given the value of λ, the higher the value of $\sum_{v_i \in c_r} m_i(s)$, the better the anti-community c_r.
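
The quantitative description above can be checked mechanically on a small graph. The following is a minimal sketch, assuming the constraint takes the form "λ times (twice the internal edge count of c_r) is at most the number of edges between c_r and each other group c_s"; the function names `group_edge_counts` and `is_anti_community` and the toy graph are illustrative and do not come from the paper.

```python
from collections import defaultdict


def group_edge_counts(edges, membership):
    """Count edges inside groups and between pairs of groups.

    internal[r] is twice the number of edges inside group r;
    between[(r, s)] is the number of edges joining groups r and s.
    """
    internal = defaultdict(int)
    between = defaultdict(int)
    for u, v in edges:
        r, s = membership[u], membership[v]
        if r == s:
            internal[r] += 2  # counted once from each endpoint
        else:
            between[(r, s)] += 1
            between[(s, r)] += 1
    return internal, between


def is_anti_community(r, edges, membership, lam=1.0):
    """Assumed test: lam * internal[r] <= between[(r, s)] for every s != r."""
    internal, between = group_edge_counts(edges, membership)
    others = set(membership.values()) - {r}
    return all(lam * internal[r] <= between[(r, s)] for s in others)


# Toy graph: complete bipartite K_{2,2} plus one edge inside group "a".
edges = [(0, 2), (0, 3), (1, 2), (1, 3), (0, 1)]
membership = {0: "a", 1: "a", 2: "b", 3: "b"}
print(is_anti_community("a", edges, membership, lam=1.0))
print(is_anti_community("a", edges, membership, lam=3.0))
```

In this toy graph, group "a" has one internal edge and four edges to group "b", so it passes the assumed test for λ = 1 but fails it for λ = 3, which illustrates how a larger λ tightens the requirement.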

Degree-based block model

In DBM, given K anti-communities, a K×K matrix Ω is adopted, whose element ω_rs denotes the probability of edges connecting group c_r and group c_s, r,s = 1,2,…,K. Specifically, ω_rr is the probability of edges inside group c_r. The probability of an edge connecting node v_i and node v_j is d_id_j/(2m)² if edges are placed at random. Thus, the probability of an edge connecting node v_i and node v_j with v_i∈c_r, v_j∈c_s is (4) Since the edge between node v_i and node v_j is generated independently according to a Poisson distribution [22] with mean P_ij, the probability of generating graph G with edges inside and among anti-communities can be written as follows (5) (6) where a_ij∈{0,1} and a_ij! = 1. After some manipulation, Eqs (5) and (6) can be rewritten as (7) (8) where m_rr is twice the number of edges inside group c_r, m_rs is the number of edges between group c_r and group c_s, D_r is the group degree of group c_r, is the number of edges connecting node v_i to the nodes not belonging to c_r. These variables are calculated as follows (9) (10) (11) (12) Thus, the probability of generating graph G parameterized by Ω and g can be obtained by multiplying Eqs (7) and (8) (13) Eq (13) is to be maximized with respect to the matrix Ω and the group membership g. In practice, it is more convenient to maximize the logarithm of the likelihood than the likelihood itself. Neglecting constants and the terms independent of Ω and g, we obtain the logarithm of Eq (13) as follows (14) Here, we first maximize this expression with respect to the matrix Ω. Using maximum-likelihood estimation, we take the partial derivatives with respect to the elements of Ω and obtain the estimates of ω_rr and ω_rs (15) By first substituting Eq (15) into Eq (14) and then neglecting the constant 2m, we obtain the maximization of Eq (14) with respect to group membership g (16) Given the network partition C, we normalize lnP(G|g) by dividing it by a constant, twice the number of edges 2m, to constrain the value of lnP(G|g) within relatively tight bounds. The normalized objective function can be written as follows (17) Eq (17) can be considered as a new objective function for evaluating anti-community structure.

In Figs 1 and 2, the two anti-community structures have the same total number of edges but different numbers of edges inside and among anti-communities. The number of internal edges for each anti-community and the values of Q(C) for Figs 1 and 2 are shown in Table 1. We observe that the partition in Fig 1 has fewer internal edges and a higher value of Q(C), which indicates that the higher the value of Q(C), the fewer the internal edges, and the better the anti-community structure. In addition, we find that nodes with different degrees have different impacts on Q(C). Here, we remove nodes v₁, v₂, v₃ and v₄ from Fig 1 in turn and calculate the values of Q(C) for the remaining networks, as shown in Fig 3. It can be seen that the higher the degree of the removed node, the lower the value of Q(C) in the remaining network, which indicates that high-degree nodes contribute more to Q(C) than low-degree ones. In the proposed algorithm LEOA, we therefore preferentially consider high-degree nodes so as to be effective for anti-community detection in networks.

Fig 2. An example of anti-community structure.

https://doi.org/10.1371/journal.pone.0195226.g002

Fig 3. Four anti-community structures.

The degrees of the four removed nodes v₁, v₂, v₃, v₄ and the values of Q(C) for the remaining networks are shown in (a), (b), (c), (d), respectively. (a) d₁ = 8, Q(C) = 4.324. (b) d₂ = 7, Q(C) = 4.357. (c) d₃ = 6, Q(C) = 4.448. (d) d₄ = 5, Q(C) = 4.480.

https://doi.org/10.1371/journal.pone.0195226.g003

Table 1. The number of internal edges and the values of Q(C) for Figs 1 and 2.

https://doi.org/10.1371/journal.pone.0195226.t001

Local expansion optimization algorithm

In this paper, we decompose an anti-community into two parts: a central node and several periphery nodes. As shown in Fig 4, nodes v₁, v₅ and v₉ are the central nodes of the red, yellow and green anti-communities, respectively; they have no connection to their periphery nodes and are highly connected with each other. We call these central nodes structural centers. Detecting structural centers plays an important role in anti-community detection. Once structural centers are detected, the number of anti-communities can be determined.

Fig 4. Structural centers and periphery nodes.

The nodes in blue boxes are structural centers and the nodes in orange boxes are periphery nodes.

https://doi.org/10.1371/journal.pone.0195226.g004

In this subsection, we propose a Local Expansion Optimization Algorithm (LEOA) for detecting anti-community structure in networks. In LEOA, we first detect structural centers by the node influence, which is controlled by a cutoff distance l_c. And then, we employ a local search method to detect periphery nodes to expand structural centers into anti-communities. Finally, we adjust the group membership of nodes by maximizing Q(C) so as to detect a better anti-community structure. The main steps of the proposed algorithm LEOA are given in Algorithm 1.

Algorithm 1. Local Expansion Optimization Algorithm (LEOA).

Input: (G,A,l_c) /* A is the adjacency matrix of graph G = (V,E), and l_c is a cutoff distance. */

Output: C = {c₁,c₂,…,c_K} /* C is the final anti-community structure. */

1: (S,K) = Structural Center Detection(G,A,l_c). /* S is the set of structural centers and K is the number of structural centers. */

2: C* = Local Anti-community Expansion(A,l_c,S,K).

3: C = Group Membership Adjustment(C*).

4: return C.

Structural Center Detection (SCD).

Definition 1. (Node Influence) Consider a graph G = (V,E). The influence η_i of node v_i is the set of nodes within distance l_c of node v_i, which is defined as follows (18) where δ(x) = 1 if x≥0, and δ(x) = 0 otherwise. l_c is a cutoff distance, and l_ij denotes the distance between node v_i and node v_j. If l_ij≤l_c, node v_j is influenced by node v_i. |η_i| is the number of nodes influenced by node v_i. The higher the value of l_c, the more nodes are influenced by node v_i, and the higher the value of |η_i|. When l_c = 1, only the adjacent nodes of node v_i are influenced by node v_i and |η_i| = d_i. When l_c = L, where L is the diameter of the network, |η_i| = n.
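
Definition 1 can be illustrated with a short breadth-first search. This is a sketch under stated assumptions: the graph is given as an adjacency-list dictionary, the function name `influence` is illustrative, and the node itself is excluded from its own influence set so that |η_i| = d_i when l_c = 1 (the source does not spell out whether v_i counts itself).

```python
from collections import deque


def influence(adj, source, l_c):
    """Return the set of nodes (excluding source) within distance l_c of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == l_c:
            continue  # do not expand beyond the cutoff distance
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist) - {source}


# Path graph 0-1-2-3 as an adjacency list.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
for cutoff in (1, 2, 3):
    print(cutoff, sorted(influence(adj, 0, cutoff)))
```

Because the search stops expanding at depth l_c, the cost per node depends on the size of its l_c-neighborhood rather than on the whole graph.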

In SCD, structural centers are a set of nodes that influence each other, i.e., the distance between any two structural centers is no more than l_c. When l_c = 1, structural centers are highly connected with each other and constitute a complete subgraph. Here, we propose an iterative method for structural center detection. Given the set of structural centers S, we define a set of candidate structural centers CSC to record the nodes that are influenced by S, CSC = {v_j|l_j,S≤l_c}, where l_j,S is the largest distance between node v_j and the nodes in S. In SCD, the node v_j with the maximal influence in CSC is repeatedly added into S until CSC = ∅. The main steps of structural center detection are provided in Algorithm 2. At the beginning, S = ∅, CSC = ∅ and K = 0. K is the number of structural centers. First, we calculate the influence of nodes by the breadth-first search method. Then, the node v_i with the maximal influence in the network is selected as the first structural center and added to S, and we set CSC = η_i. Next, the node v_j with the maximal influence in CSC is chosen as the second structural center and added into S, and we remove node v_j from CSC. Since some nodes in CSC may not be influenced by node v_j, the nodes satisfying {v_k|v_k∈CSC,l_jk>l_c} are deleted from CSC so that all nodes remaining in CSC are influenced by S. We repeat this operation until CSC = ∅ and all structural centers are detected.

Algorithm 2. Structural Center Detection (SCD).

Input: (G,A,l_c) /* A is the adjacency matrix of graph G = (V,E), and l_c is a cutoff distance. */

Output: (S,K) /* S is the set of structural centers and K is the number of structural centers. */

1: S = ∅, CSC = ∅, K = 0. /* CSC is the set of candidate structural centers. */

2: Calculate the influence of nodes by the breadth-first search method.

3: Select the node v_i with the maximal influence; S = {v_i}, K = K+1, and CSC = η_i.

4: while CSC ≠ ∅ do

5: Select the node v_j with the maximal influence in CSC; CSC = CSC−{v_j}.

6: S = S+{v_j}, K = K+1.

7: for each node v_k∈CSC do

8: if (l_jk>l_c) then

9: CSC = CSC−{v_k}.

10: end if

11: end for

12: end while

13: return (S,K).

Here, we take Fig 4 with cutoff distance l_c = 1 as an example to present the procedure of structural center detection, as shown in Table 2. Initially, S = ∅ and CSC = ∅. First, we calculate the influence of nodes and find that nodes v₁, v₅ and v₉ have the maximal influence in Fig 4. Then, we randomly select node v₁ as the first structural center and add it to S. The nodes that are influenced by node v₁ are regarded as candidate structural centers and added to CSC. In CSC, nodes v₅ and v₉ have the maximal influence, and we randomly select node v₅ as the second structural center. Thus, we add node v₅ to S and remove it from CSC. Nodes v₆, v₇ and v₈ are not influenced by node v₅ because their distances to node v₅ are greater than l_c. Therefore, we delete them from CSC so that all nodes remaining in CSC are influenced by S. Next, node v₉ has the maximal influence in CSC, so we select node v₉ as the third structural center and remove it from CSC. Because the distances between node v₉ and nodes v₁₀, v₁₁ and v₁₂ are greater than l_c, we delete nodes v₁₀, v₁₁ and v₁₂ from CSC. Finally, CSC = ∅ and nodes v₁, v₅ and v₉ are detected as structural centers in the network.
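
The selection loop described above can be sketched in Python as follows. This is an illustrative reading of the procedure rather than the authors' implementation: the names `within` and `detect_structural_centers` are hypothetical, ties are broken deterministically by the smallest node label instead of at random, and the test graph is a complete tripartite graph chosen for this sketch (it is not the network of Fig 4).

```python
from collections import deque


def within(adj, source, l_c):
    """Nodes other than source whose BFS distance from it is at most l_c."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == l_c:
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist) - {source}


def detect_structural_centers(adj, l_c=1):
    """Greedily pick mutually influencing nodes, highest influence first."""
    reach = {v: within(adj, v, l_c) for v in adj}
    # max() keeps the first maximal element, so sorting gives a stable tie-break
    first = max(sorted(adj), key=lambda v: len(reach[v]))
    centers = [first]
    csc = set(reach[first])  # candidate structural centers
    while csc:
        nxt = max(sorted(csc), key=lambda v: len(reach[v]))
        centers.append(nxt)
        # keep only candidates that are still influenced by every chosen center
        csc = {v for v in csc if v != nxt and v in reach[nxt]}
    return centers


# Complete tripartite graph with groups {0,1}, {2,3}, {4,5}.
groups = [[0, 1], [2, 3], [4, 5]]
adj = {v: [w for g in groups if v not in g for w in g] for g0 in groups for v in g0}
print(detect_structural_centers(adj, l_c=1))
```

With l_c = 1 the returned nodes are pairwise adjacent, so one center is obtained per group of the tripartite graph, and the number of centers gives K.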

Table 2. The procedure of structural centers detection in Fig 4.

https://doi.org/10.1371/journal.pone.0195226.t002

Local Anti-community Expansion (LAE).

In SCD, K structural centers have been detected for K anti-communities. In this subsection, we aim to expand the structural centers into anti-communities by a local search method. Here, we define a local anti-community measure, i.e. disassortative density, for local anti-community expansion.

Definition 2. (Disassortative Density) For group c_r with n_r nodes and m_r edges inside, the disassortative density is defined as follows (19) If l_c = 1, Given the value of , the higher the value of B_r, the fewer the edges inside group c_r, and the more disassortative the group c_r.

In LAE, we preferentially consider the nodes with high degree. For each unassigned node v_j, we first calculate the increment of disassortative density when node v_j is added into group c_r, r = 1,2,…,K. Then we add node v_j into the group c_r with . If different groups have the same maximal increment of disassortative density, we break the tie by favoring the influence of the group . The increment of disassortative density can be calculated by Eq (20), and the main steps of LAE are given in Algorithm 3. (20) where m_j(r) is the number of edges connecting node v_j and the nodes in group c_r.

Algorithm 3. Local Anti-community Expansion (LAE).

Input: (A,l_c,S,K)

Output: C* = {c₁,c₂,…,c_K} /*C* is the anti-community structure after local anti-community expansion. */

1: C* = ∅ and r = 1.

2: for each node v_i∈S do /* Assign K structural centers into K anti-communities. */

3: c_r = {v_i}.

4: C* = C*∪{c_r}.

5: r = r+1.

6: end for

7: Sort the unassigned nodes in a descending order by the node degree, denoted as V.

8: for each node v_j∈V do

9: Calculate r = 1,2,…,K.

10:

11: c_r = c_r+{v_j}.

12: end for

13: return C*.

Group Membership Adjustment (GMA).

As mentioned above, the higher the objective function Q(C), the better the anti-community structure. In GMA, we aim to adjust the group membership of nodes by maximizing Q(C) so as to explore a better anti-community structure.

For node v_i, we calculate the increment of Q(C) when node v_i is removed from the group c_r it belongs to and added into a new group c_s. The increment value can be calculated as follows (21) where and are twice the number of edges inside group and group respectively, and are group degree of group and group respectively, is the number of edges between group and group is the number of edges between group and group c_k, is the number of edges between group and group c_k. These variables can be computed as follows (22) where m_i(r) is the number of edges connecting node v_i and the nodes in group c_r, m_i(s) is the number of edges connecting node v_i and the nodes in group c_s, and m_i(k) is the number of edges connecting node v_i and the nodes in group c_k.

For the convenience of calculating in the subsequent group membership adjustment, we need to update the values of and (k = 1,2,…K, k ≠ r,s and a_ij = 1) when node v_i is moved from group c_r to group c_s. The first seven variables can be updated by Eq (22). and are updated as follows (23)

Because the nodes with high degree have a greater impact on Q(C) than those with low degree, the nodes with high degree are preferentially considered here. For each node v_i, we calculate (s = 1,2,…K, and s ≠ r) and then move node v_i to group c_s with and . This operation is repeated until no increment of can be found. The main steps of GMA are provided in Algorithm 4.

Algorithm 4. Group Membership Adjustment (GMA).

Input: C*

Output: C = {c₁,c₂,…,c_K}/* C is the final anti-community structure. */

1: Initialize m_rr, m_rs and m_i(r), r,s = 1,2,…,K,r ≠ s, and i = 1,2,…,n.

2: Sort nodes in a descending order by the node degree, denoted as V, and C = C*.

3: repeat

4: Δ = 0. /* Δ is used for calculating the sum of for each iteration. */

5: for each node v_i∈V do

6: Calculate s = 1,2,…,K, and s ≠ r./* c_r is the anti-community which node v_i belongs to. */

7:

8: if then /* Move node v_i from group c_r to group c_s.*/

9: c_r = c_r−{v_i}, c_s = c_s+{v_i}.

10: Update the variables by Eqs (22) and (23).

11:

12: end if

13: end for

14: until Δ = 0.

15: return C.
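
The control flow of Algorithm 4 can be sketched as a greedy local-move loop. The sketch below makes two simplifications that should be read as assumptions: the objective is passed in as a function and recomputed from scratch for every trial move (the paper instead updates Q(C) incrementally via Eqs (21) to (23)), and the demonstration uses a stand-in objective, the negated number of internal edges, which is not Q(C). The names `adjust_membership` and `minus_internal_edges` are illustrative.

```python
def adjust_membership(adj, membership, groups, objective):
    """Move nodes between groups, highest degree first, while the objective improves."""
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    improved = True
    while improved:
        improved = False
        for v in order:
            current = membership[v]
            base = objective(adj, membership)
            best_gain, best_group = 0.0, current
            for s in groups:
                if s == current:
                    continue
                membership[v] = s  # trial move
                gain = objective(adj, membership) - base
                if gain > best_gain:
                    best_gain, best_group = gain, s
            membership[v] = best_group  # keep the best move, or restore
            if best_group != current:
                improved = True
    return membership


def minus_internal_edges(adj, membership):
    """Stand-in objective (NOT Q(C)): fewer internal edges gives a higher score."""
    return -sum(1 for u in adj for w in adj[u]
                if u < w and membership[u] == membership[w])


# Path graph 0-1-2-3 with a poor initial partition.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
start = {0: "a", 1: "a", 2: "b", 3: "b"}
result = adjust_membership(adj, dict(start), ["a", "b"], minus_internal_edges)
print(result, minus_internal_edges(adj, result))
```

Only strictly improving moves are accepted, so the objective increases monotonically and the loop terminates; like the original procedure, it may stop at a local optimum.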

Complexity analysis

In this subsection, we analyze the computational complexity of the proposed algorithm LEOA. Given graph G = (V,E) with n nodes and m edges, the complexity of calculating the influence of node v_i is where is the average degree of nodes. Thus, it needs to detect structural centers. In LAE, it needs O(nlogn) to sort the unassigned nodes in descending order of node degree. For each unassigned node v_i, the complexity of assigning node v_i to the group with the maximal increment of its disassortative density is O(d_i+K), where d_i is the degree of node v_i. So the complexity of local anti-community expansion is O(nlogn+m+nK). In GMA, the complexity of calculating is O(d_i+K) and the complexity of updating variables by Eqs (22) and (23) is O(d_i). Thus, it requires O(mK+nK²) to adjust the group membership of nodes. The total complexity of LEOA is In our experiments, we find that LEOA achieves the best performance when l_c = 1, so the time complexity of LEOA is O(nlogn+nK²+mK).

Experiments

In this section, we evaluate the performance of LEOA on the synthetic benchmark DBM-Net and 17 real-world networks [30–32]. The experiments on DBM-Net test the ability of LEOA to detect known anti-communities, while the experiments on real-world networks assess its performance in real applications. Here, we compare LEOA with its variant LEOA* and five state-of-the-art anti-community detection algorithms: Spectral [18], Di-Spectral [12], E-Model [26], M-Model [23] and LPAD [13]. LEOA* does not take the node degree into consideration and randomizes the node order for LAE and GMA. Spectral and Di-Spectral utilize negative eigenvalues and eigenvectors of the modularity matrix for anti-community detection. E-Model and M-Model are two block models for structural regularity detection optimized by the EM algorithm. LPAD is a recently proposed anti-community detection algorithm based on label propagation. Because EM often converges to local optima, we run the EM algorithm 20 times with different initial values for E-Model and M-Model and output the best result for each network. All algorithms are independently run 20 times for each experimental network. All compared algorithms are implemented in C# and run on a PC with an Intel(R) Core i5-4460 3.20 GHz CPU and 4 GB of memory.

As DBM-Net and real-world disassortative networks have known anti-community structures, we adopt the Normalized Mutual Information (NMI) [33] to estimate the similarity between the true partition and the detected one. Assuming that the true partition of a network with n nodes is C₁ and the detected one is C₂, NMI(C₁,C₂) can be computed as $\mathrm{NMI}(C_1,C_2) = \frac{-2\sum_{i=1}^{k_1}\sum_{j=1}^{k_2} f_{ij}\log\left(\frac{f_{ij}\,n}{f_{i\cdot}f_{\cdot j}}\right)}{\sum_{i=1}^{k_1} f_{i\cdot}\log\left(\frac{f_{i\cdot}}{n}\right)+\sum_{j=1}^{k_2} f_{\cdot j}\log\left(\frac{f_{\cdot j}}{n}\right)}$ (24) where F is a confusion matrix whose element f_ij records the number of nodes shared by the ith group of C₁ and the jth group of C₂, f_i· (f_·j) is the sum of the elements of the ith row (jth column) of F, and k₁ (k₂) represents the number of groups in partition C₁ (C₂). The value of NMI lies in [0,1], and a larger value indicates that the detected structure agrees more closely with the true one.
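
For readers who want to reproduce the evaluation, NMI can be computed directly from two label vectors. The sketch below assumes the widely used confusion-matrix form of NMI, which matches the quantities described above (f_ij, row sums, column sums and n); the function name `nmi` is illustrative, and returning 1.0 when both partitions consist of a single group is a convention chosen here, not something stated in the paper.

```python
from collections import Counter
from math import log


def nmi(true_labels, found_labels):
    """Normalized mutual information between two partitions of the same nodes."""
    n = len(true_labels)
    f = Counter(zip(true_labels, found_labels))  # confusion matrix entries f_ij
    row = Counter(true_labels)                   # row sums f_i.
    col = Counter(found_labels)                  # column sums f_.j
    numerator = -2.0 * sum(c * log(c * n / (row[i] * col[j]))
                           for (i, j), c in f.items())
    denominator = (sum(c * log(c / n) for c in row.values())
                   + sum(c * log(c / n) for c in col.values()))
    if denominator == 0:
        return 1.0  # both partitions are a single group (convention of this sketch)
    return numerator / denominator


print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))  # same partition, relabelled
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))  # statistically independent partitions
```

Only the non-zero entries of the confusion matrix are visited, so zero cells never reach the logarithm.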

Datasets

Synthetic benchmark DBM-Net.

To our knowledge, there is no benchmark designed for anti-community detection. Inspired by the formulation of DBM, we develop a new benchmark called DBM-Net for comparing algorithms on the detection of known anti-community structures.

Most real-world complex networks are scale-free networks [34], where node degree follows a power law distribution. Thus, we let the node degree in DBM-Net follow a power law distribution with exponent β and coefficient α, which means that the probability of randomly selecting a node with degree d_i is P(d_i) = α(d_i)^−β. Given the value of exponent β, the maximal degree d_max and the minimal degree d_min, the coefficient α can be calculated as follows $\alpha = 1 \Big/ \sum_{d=d_{min}}^{d_{max}} d^{-\beta}$ (25) So the number of nodes with degree d_i is n(d_i) = ⌊n×P(d_i)⌋, d_i∈[d_min,d_max], and the number of edges m can be calculated as follows $m = \frac{1}{2}\sum_{d=d_{min}}^{d_{max}} d\,n(d)$ (26)
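
The degree sequence of the benchmark can be tabulated in a few lines. This sketch assumes that α is fixed by requiring the probabilities P(d) to sum to one over the integer degrees from d_min to d_max, and that m is half the sum of all degrees; the function name `degree_counts` and the parameter values in the example are illustrative.

```python
from math import floor


def degree_counts(n, beta, d_min, d_max):
    """Return (alpha, counts, m) for a truncated power-law degree distribution."""
    degrees = range(d_min, d_max + 1)
    alpha = 1.0 / sum(d ** -beta for d in degrees)  # normalisation constant
    counts = {d: floor(n * alpha * d ** -beta) for d in degrees}  # n(d)
    m = sum(d * c for d, c in counts.items()) / 2  # half the degree sum
    return alpha, counts, m


alpha, counts, m = degree_counts(n=1000, beta=2, d_min=1, d_max=3)
print(alpha, counts, m)
```

Because of the floor operation, the counts can add up to slightly fewer than n nodes, and the degree sum can be odd, in which case m is not an integer; a full generator would need to handle both cases.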

Given the number of groups K, the number of edges inside and among groups m_rr, m_rs (r,s = 1,2,…,K, and r ≠ s) are constrained by Eq (27). (27) For simplicity, we set that the values of m_rr are the same for r = 1,2,…,K, and the values of m_rs are also the same for r,s = 1,2,…,K, r ≠ s. Thus, we obtain (m_rr)_min = 0 and (m_rr)_max = ⌊2m/(K+λK²−λK)⌋. Given the degree of each node, the number of nodes n_r in group c_r satisfies the following constraints (28) where Here, we take the assumption that the group degree follows a uniform distribution, i.e., the group degree for group c_r is D_r = ⌊2m/K⌋, r = 1,2,…,K. The main steps of establishing synthetic benchmark DBM-Net are described in Algorithm 5.
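
The stated upper bound on m_rr can be checked with a short derivation. It rests on two assumptions that are labelled here because the constraint itself is not reproduced in this text: (i) the group counts account for every edge endpoint, and (ii) the anti-community condition requires m_rs ≥ λ m_rr for every s ≠ r.

```latex
% Assumptions (illustrative, not quoted from the paper):
%   (i)  \sum_{r} m_{rr} + \sum_{r \neq s} m_{rs} = 2m,
%        with m_{rr} twice the number of edges inside c_r
%   (ii) m_{rs} \ge \lambda\, m_{rr} for all s \neq r
\begin{align*}
2m &= K\, m_{rr} + K(K-1)\, m_{rs}
      && \text{all } m_{rr} \text{ equal, all } m_{rs} \text{ equal} \\
   &\ge K\, m_{rr} + \lambda K(K-1)\, m_{rr}
      && \text{by (ii)} \\
   &= \left(K + \lambda K^{2} - \lambda K\right) m_{rr}, \\
\Rightarrow\quad
m_{rr} &\le \frac{2m}{K + \lambda K^{2} - \lambda K}.
\end{align*}
```

Taking the integer part of the right-hand side gives (m_rr)_max = ⌊2m/(K+λK²−λK)⌋, and the lower bound (m_rr)_min = 0 corresponds to groups with no internal edges.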

Algorithm 5. DBM-Net Establishment.

Input: (n,K,m_rr,β,d_min,d_max,λ)

Output: (C = {c₁,c₂,…,c_K},A) /* C is the anti-community structure, A is the adjacency matrix. */

1: Calculate the coefficient α according to Eq (25).

2: Calculate the values of n(d_i) and randomly assign n(d_i) nodes with degree d_i, d_i∈[d_min,d_max].

3: Calculate the number of edges m according to Eq (26).

4: Randomly assign n_r nodes into group c_r with the group degree D_r = ⌊2m/K⌋, r = 1,2,…,K.

5: Calculate the number of edges m_rs between group c_r and group c_s, r,s = 1,2,…,K, r ≠ s.

6: Calculate the estimation values of ω_rr and ω_rs according to Eq ().

7: for r = 1 to K do

8: for each pair of nodes v_i,v_j∈c_r do

9: Calculate the probability P_ij of an edge connecting node v_i and node v_j.

10: Generate a random number P∈[0,1].

11: if (P≤P_ij) then

12: a_ij = 1. /* There is an edge connecting node v_i and node v_j. */

13: else

14: a_ij = 0. /* There is no edge connecting node v_i and node v_j.*/

15: end if

16: end for

17: end for

18: for r, s = 1 to K do /* r ≠ s */

19: for each pair of nodes v_i∈c_r, v_j∈c_s do

20: Calculate the probability P_ij of an edge connecting node v_i and node v_j.

21: Generate a random number P∈[0,1].

22: if (P≤P_ij) then

23: a_ij = 1.

24: else

25: a_ij = 0.

26: end if

27: end for

28: end for

29: return (C = {c₁,c₂,…,c_K},A).
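The edge-sampling loops of Algorithm 5 can be sketched in Python as below. This is not the authors' implementation: the expression for P_ij is not given in this section, so the sketch assumes a degree-corrected form P_ij = min(1, d_i·d_j·ω_rs) with ω_rs = m_rs/(D_r·D_s), which makes the expected edge counts match m_rr and m_rs. The function name `generate_edges` and its arguments are hypothetical, and a strict comparison is used so that P_ij = 0 never produces an edge.

```python
import random

def generate_edges(groups, degree, m_rr, m_rs, seed=None):
    """Sample an undirected edge set for a planted anti-community partition.

    groups: list of lists of node ids (the planted partition C)
    degree: dict node -> target degree d_i (all group degrees must be positive)
    m_rr:   twice the target number of edges inside each group
    m_rs:   target number of edges between each pair of distinct groups
    """
    rng = random.Random(seed)
    D = [sum(degree[v] for v in g) for g in groups]  # group degrees D_r
    edges = set()
    for r, g_r in enumerate(groups):
        for s in range(r, len(groups)):
            g_s = groups[s]
            # Assumed estimate: omega_rs = m_rs / (D_r * D_s)
            omega = (m_rr if r == s else m_rs) / (D[r] * D[s])
            for a, v_i in enumerate(g_r):
                # Inside a group, visit each unordered pair only once
                others = g_r[a + 1:] if r == s else g_s
                for v_j in others:
                    p_ij = min(1.0, degree[v_i] * degree[v_j] * omega)
                    if rng.random() < p_ij:
                        edges.add((v_i, v_j))
    return edges
```

With m_rr = 0 no internal edge can be generated, so the sampled network is multipartite, in line with the degenerate case discussed for DBM-Net.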

Real-world networks

In this paper, we adopt 17 real-world networks [30–32] to evaluate the performance of LEOA. They are divided into two categories, disassortative networks and assortative networks, as shown in Tables 3 and 4, respectively. The experiments on disassortative networks aim at validating the effectiveness of LEOA in exploring known partitions in real applications. Because the observed structure in an assortative network is a community structure, the experiments on assortative networks test whether LEOA is capable of detecting anti-community structure when the detected structure is inconsistent with the observed one. Here, we adopt NMI and Q(C) for evaluation in disassortative and assortative networks, respectively.

Table 3. Disassortative network.

https://doi.org/10.1371/journal.pone.0195226.t003

Table 4. Assortative network.

https://doi.org/10.1371/journal.pone.0195226.t004

In disassortative networks, (1) Southern women describes the participation of 18 women in 14 social events in the 1930s. (2) Divorce in US illustrates the relationship of 9 main causes of the divorce cases in 50 states of the USA. (3) Cities and services provides the distribution of offices for 46 global advanced producer service firms over 55 cities. (4) Nouns and adjectives describes a co-occurrence network of nouns and adjectives in the novel David Copperfield. (5) Interlocks in Scotland characterizes the relationship between 108 Scottish firms and 136 multiple directors during 1904–1905. (6) Unicode languages illustrates the usage of 254 languages over 614 territories around the world. Because Interlocks in Scotland contains 15 isolated nodes and Unicode languages consists of 5 connected components, both networks are disconnected and their diameters L are ∞.

In assortative networks, (1) Karate is a friendship network between 34 members of a karate club at a US university in the 1970s, which is divided into two communities due to the disagreement between the administrator and the instructor. (2) Dolphin is a social network of frequent associations among 62 dolphins living in Doubtful Sound, New Zealand, and it is divided into two communities according to their age. (3) US politics books describes a network of US politics books frequently co-purchased by the same buyers on Amazon. The books fall into three types: liberal, neutral, and conservative. (4) Football is a network of American football games among 115 Division IA teams during the regular season in Fall 2000. The teams are divided into 12 conferences and the games are more frequent among teams in the same conference than among teams in different conferences. (5) Elegans describes the relationship between 453 metabolic molecules in a metabolic process. (6) Air traffic control is a network of travel routes among 1226 airports and service centers. (7) Political blogs describes a hyperlink network among 1490 weblogs on US politics. (8) Netscience is a collaboration network of scientists working on network theory and experiment. (9) Human protein illustrates interactions among 4941 human proteins. (10) Power represents the topology of the Western States Power Grid of the USA. (11) DBLP cite is a network describing the citations among 12591 publications.

Performance evaluation

The cutoff distance l_c strongly affects the number of anti-communities K, the computational cost and the effectiveness of LEOA. As mentioned in the complexity analysis of LEOA, the higher the value of l_c, the higher the computational cost of LEOA. As DBM-Net and real-world disassortative networks have known anti-community structures, we analyze the impact of the cutoff distance l_c on NMI and on the number of anti-communities K in these networks. Here, four datasets are selected for performance evaluation: DBM-Net (n = 500, K = 2, m_rr = 0, β = 2, d_min = 10, d_max = 50) with L = 5, Southern women, Cities and services, and Unicode languages.

Fig 5 shows the results of NMI and K for different values of l_c. It can be observed that the increase of l_c leads to the decrease of NMI and the increase of K. The reason is that as l_c increases, |η_i| also increases, i = 1,2,…,n, so that more nodes influence each other and SCD explores more structural centers, which results in the decrease of NMI. When l_c = 1, LEOA outputs two anti-communities in these four networks and the values of NMI are higher than those when l_c > 1. Thus, we set l_c = 1 in this paper. When l_c = L, all nodes influence each other and each node forms an anti-community, which leads to the lowest NMI. In addition, we find that the number of nodes that influence each other increases greatly for DBM-Net and Unicode languages when 3≤l_c≤4. This may explain why K increases greatly in these two networks when 3≤l_c≤4.
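The growth of |η_i| with l_c can be illustrated with a truncated breadth-first search. The sketch below assumes that η_i denotes the set of nodes other than v_i that lie within shortest-path distance l_c of v_i; the exact definition is given earlier in the paper, so this reading and the function name `neighborhood_sizes` are assumptions made for illustration.

```python
from collections import deque

def neighborhood_sizes(adj, l_c):
    """Return {node: number of other nodes within distance l_c} via truncated BFS.

    adj: dict node -> iterable of neighbouring nodes (undirected graph)
    """
    sizes = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == l_c:
                continue  # do not expand beyond the cutoff distance
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        sizes[source] = len(dist) - 1  # exclude the source itself
    return sizes
```

On a connected network, once l_c reaches the diameter L every node reaches all others, which corresponds to the case in which all nodes influence each other.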

Fig 5. The results of NMI and the number of anti-communities K for different values of l_c.

(a) DBM-Net. (b) Southern women. (c) Cities and services. (d) Unicode languages.

https://doi.org/10.1371/journal.pone.0195226.g005

Performance comparison on DBM-Net

In this subsection, comparison algorithms are applied to DBM-Net to evaluate their performance in detecting known anti-community structure. We first evaluate the comparison algorithms on DBM-Net as m_rr, twice the number of internal edges, increases. When m_rr = (m_rr)_min, there is no edge inside any group and DBM-Net degenerates into a multipartite network. When (m_rr)_min<m_rr≤(m_rr)_max, λm_rr is less than or equal to m_rs (s = 1,2,…,K, and r ≠ s) and DBM-Net is a network with anti-community structure according to Eq (3). When m_rr>(m_rr)_max, DBM-Net no longer has the characteristics of anti-community structure. For comparison, we set n = 500, K = 2, β = 2, d_min = 10, d_max = 50, λ = 2, and m_rr varies from (m_rr)_min to (m_rr)_max with an increment of (m_rr)_max/10. For each value of m_rr, 20 networks are generated and the results of comparison algorithms are shown in Fig 6. It can be observed that the increase of m_rr leads to the decrease of NMI because internal edges weaken the anti-community structure and increase the difficulty of anti-community detection. Spectral outputs higher values of NMI than LEOA except when m_rr = (m_rr)_min. The reason is that when m_rr = (m_rr)_min, the number of structural centers detected by SCD is equal to the number of groups in the true partition, which helps LAE and GMA to find the true partition. When m_rr>(m_rr)_min, there are some edges inside each group in the true partition and the number of structural centers detected by SCD may exceed the number of groups in the true partition. As a result, some groups in the true partition may be split into several small groups and the values of NMI decrease. We observe that the higher the value of m_rr, the more structural centers SCD detects, and the lower the value of NMI.
Because the number of anti-communities explored by Di-Spectral is much larger than that in the true partition, its values of NMI are lower than those output by Spectral and LEOA. Although the EM algorithm is repeatedly carried out with different initial values for E-Model and M-Model, these two algorithms still fall easily into local optima and their results rely on the threshold of the EM algorithm. In addition, we find that the values of NMI output by LPAD are lower than those output by other algorithms in most cases. On one hand, LPAD selects compatible nodes for label updating, but the order in which compatible nodes are selected strongly affects its accuracy. On the other hand, no internal edge is allowed in the results output by LPAD, so the higher the value of m_rr, the more groups LPAD detects, and the lower the value of NMI. The values of NMI provided by LEOA* are lower than those provided by LEOA, which indicates that the consideration of node degree in LAE and GMA can improve the effectiveness of LEOA for anti-community detection in DBM-Net.

Fig 6. The results of NMI of comparison algorithms on DBM-Net for different values of m_rr.

https://doi.org/10.1371/journal.pone.0195226.g006

To further verify the effectiveness of LEOA in detecting known anti-community structures, we apply the comparison algorithms to DBM-Net as the number of groups K increases. When K = 1, DBM-Net consists of only one anti-community, and when K = n, each node forms an anti-community. For comparative experiments, we set n = 500, m_rr = 0, β = 2, d_min = 10, d_max = 50, and K varies from 2 to 10. The NMI results of comparison algorithms are shown in Fig 7. It can be seen that as K increases, it becomes more and more difficult for the algorithms to detect the true partition. The reason is that as K increases, each node has a higher probability of being assigned to a wrong group, especially in the early stage of the algorithms. When K≥7, all algorithms fail to find the true partition (NMI≈0). When 2≤K≤4, the NMI results of LEOA fall more slowly than those of other algorithms, but when 4<K≤6, they fall faster than those of other algorithms. The reason is that when 2≤K≤4, the number of structural centers detected by SCD is equal to the number of groups in the true partition, leading to high values of NMI (NMI≥0.8) and a slow descent of NMI. When 4<K≤6, LEOA cannot detect the structural centers for some groups in the true partition, because none of the nodes in these groups is highly connected with the structural centers in other groups, which leads to wrong assignments of the nodes and a fast descent of NMI.

Fig 7. The results of NMI of comparison algorithms on DBM-Net for different values of K.

https://doi.org/10.1371/journal.pone.0195226.g007

As mentioned above, the factor λ in Eq (3) controls the number of edges inside and among anti-communities. Here, we evaluate the performance of comparison algorithms on DBM-Net as the factor λ increases. For comparison, we set n = 500, K = 2, β = 2, d_min = 10, d_max = 50, λm_rr = m_rs (s = 1,2,…,K, and r ≠ s), and λ varies from 1 to 10. The results of NMI of comparison algorithms are shown in Fig 8. It can be observed that the increase of λ leads to the increase of NMI. Given the number of edges m, the higher the value of λ, the fewer edges there are inside groups and the more edges there are among groups, which makes it easier for the algorithms to detect the true partition and leads to high values of NMI.
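The effect of λ on the share of internal edges can be made explicit with a short calculation. It assumes, as in the benchmark construction, that all m_rr are equal, all m_rs are equal, λm_rr = m_rs, and the entries m_rr and m_rs together sum to 2m:

```latex
K\,m_{rr} + K(K-1)\,\lambda m_{rr} = 2m
\;\Longrightarrow\;
\frac{K\,m_{rr}}{2m} = \frac{1}{1 + \lambda (K-1)}.
```

For K = 2 this internal share is 1/(1+λ): it falls from 1/2 at λ = 1 to 1/11 at λ = 10, so the anti-community structure becomes progressively more pronounced.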

Fig 8. The results of NMI of comparison algorithms on DBM-Net for different values of λ.

https://doi.org/10.1371/journal.pone.0195226.g008

Performance comparison on real-world networks

Table 5 shows the results of comparison algorithms on 6 disassortative networks. It can be observed that all algorithms output the true partitions for the first three networks. In the remaining networks, LEOA provides the highest values of NMI. The NMI results of all algorithms on Nouns and adjectives are less than 0.4. The reason is that there are some edges among noun nodes and some edges among adjective nodes, which leads to an incomplete bipartite network and makes it more difficult for the algorithms to explore the true partition. As LAE and GMA may generate some edges inside groups, which suits Nouns and adjectives, LEOA provides a higher NMI than the others. We observe that the values of NMI of all algorithms on Interlocks in Scotland are less than 0.5. The main reason is that Interlocks in Scotland contains 15 isolated nodes, which affect the calculation of eigenvalues and eigenvectors of the modularity matrix for Spectral and Di-Spectral and the calculation of the maximum likelihood optimized by the EM algorithm for E-Model and M-Model. Because the isolated nodes are compatible with any other node, LPAD cannot accurately determine the labels of these nodes. In addition, LEOA always assigns the isolated nodes to the group with the maximal group size so as to output a higher Q(C). These reasons result in wrong assignments of isolated nodes and even affect the assignments of other nodes, leading to the low values of NMI. In addition, we find that no algorithm can detect the true partition in Unicode languages. The reason is that Unicode languages consists of 5 connected components with bipartite structure, so that 16 different partitions can be obtained by combining the connected components into a final bipartite structure, and the bipartite structures detected by the comparison algorithms are different from the true one.
It can be observed that the NMI results provided by LEOA are higher than those provided by LEOA* in the last three networks, which demonstrates that the node degree factor in LEOA can enhance its accuracy. From these results, we can see that LEOA achieves good performance for anti-community detection in the experimental disassortative networks.
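The count of 16 candidate partitions for Unicode languages follows from a simple argument: each of the 5 bipartite components can be oriented in 2 ways, and swapping the two final groups yields the same partition, giving 2⁵/2 = 16. The Python sketch below enumerates these combinations for toy components; the function name `count_bipartitions` and the toy node names are illustrative only.

```python
from itertools import product

def count_bipartitions(components):
    """Count distinct two-group partitions built from per-component bipartitions.

    components: list of (side_a, side_b) pairs of disjoint node sets.
    """
    seen = set()
    for flips in product([False, True], repeat=len(components)):
        left, right = set(), set()
        for (side_a, side_b), flip in zip(components, flips):
            left |= side_b if flip else side_a
            right |= side_a if flip else side_b
        # A partition is unordered, so {left, right} equals {right, left}
        seen.add(frozenset([frozenset(left), frozenset(right)]))
    return len(seen)
```

Only one of these partitions coincides with the recorded ground truth, which is why an algorithm can recover a perfectly valid bipartite structure and still obtain NMI below 1.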

Table 5. Experimental results of comparison algorithms on disassortative networks.

https://doi.org/10.1371/journal.pone.0195226.t005

Table 6 shows the results of the comparison algorithms on 11 assortative networks. Because the observed structure in an assortative network is a community structure and the results output by E-Model and M-Model are highly dependent on the observed structure of a network, they cannot output an anti-community structure on an assortative network and their results are not considered here. It can be seen that the values of Q(C) provided by LEOA are higher than those provided by other algorithms, which indicates that LEOA is superior to other algorithms on the experimental assortative networks.

Table 6. Experimental results of comparison algorithms on assortative networks.

https://doi.org/10.1371/journal.pone.0195226.t006

To further compare the algorithms, we take the assortative network Karate as an example; the results are shown in Fig 9. In Karate, the disagreement between the administrator (node v₁) and the instructor (node v₃₄) leads to the division of the network into two groups. We observe that the partitions output by Spectral, Di-Spectral, LPAD, LEOA* and LEOA are anti-community structures, while the partitions output by E-Model and M-Model are community structures. These results indicate that LEOA is capable of exploring anti-community structure in assortative networks. Some groups detected by Spectral, Di-Spectral and LPAD consist of only two or three nodes, so few negative relations can be explored in these groups. In addition, we find that only LEOA assigns node v₁ and node v₃₄ to the same anti-community and reveals the negative relation between the administrator and the instructor. The reason is that node v₃₄ has the highest degree (d₃₄ = 17) in Karate. In SCD, node v₃₄, node v₃₂ and node v₃₃ are regarded as structural centers. Node v₁ is then considered first in LAE because it has the highest degree (d₁ = 16) among the remaining nodes. We find that node v₁ yields the highest increment of disassortative density when it is added to the group of node v₃₄ or to the group of node v₃₃. Because |η₃₄|>|η₃₃|, node v₁ is added to the group of v₃₄. In GMA, the group memberships of node v₁ and node v₃₄ are not changed. These results demonstrate that the consideration of node degree in LEOA can help explore the negative relations among objects.

Fig 9. The results of comparison algorithms for Karate.

(a) Spectral. (b) Di-Spectral. (c) E-Model. (d) M-Model. (e) LPAD. (f) LEOA*. (g) LEOA.

https://doi.org/10.1371/journal.pone.0195226.g009

Efficiency analysis

In this subsection, we compare the running time of the comparison algorithms on DBM-Net to evaluate the efficiency of LEOA. First, we apply them to DBM-Net with K = 2, m_rr = 0, β = 2, d_min = 10, d_max = 50, and n∈[500,5000] as shown in Fig 10(A). It can be observed that the running time of E-Model gets close to that of LPAD as n increases, and when n≥1500, E-Model is more efficient than LPAD. The reason is that LPAD needs O(n) time to determine whether the label of each node is changed in each iteration, so it requires more computational cost than E-Model. In order to validate the performance of comparison algorithms in larger networks, we apply them to DBM-Net with n∈[10000,100000] as shown in Fig 10(B). We find that Spectral and Di-Spectral cannot output results within 24 hours when n≥30000, because as the number of nodes n and the number of edges m increase, the running time for calculating the eigenvalues and eigenvectors of the modularity matrix increases greatly. LEOA* requires less running time than LEOA, because sorting the nodes in descending order of node degree takes O(nlogn), while randomizing the node order for LEOA* takes O(n). From the curves, we can conclude that LEOA is more efficient than the five state-of-the-art algorithms on DBM-Net.
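The difference between the two node orderings can be sketched as follows. This is only an illustration of the ordering step, not the authors' code: the function names are hypothetical, the degrees of v34 and v1 are those quoted for Karate, and the third node is a made-up placeholder.

```python
import random

def leoa_order(degree):
    """Nodes in descending order of degree: O(n log n) comparison sort."""
    return sorted(degree, key=degree.get, reverse=True)

def leoa_star_order(degree, seed=None):
    """Nodes in random order: O(n) shuffle, ignoring the degrees."""
    nodes = list(degree)
    random.Random(seed).shuffle(nodes)
    return nodes

# v34 and v1 use the Karate degrees quoted in the text; vx is hypothetical
degree = {"v1": 16, "v34": 17, "vx": 5}
ordered = leoa_order(degree)
shuffled = leoa_star_order(degree, seed=0)
```

The sort is the only extra cost of the degree-aware ordering, which is consistent with the small running-time gap between LEOA and LEOA* reported above.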

Fig 10. The running time of comparison algorithms on DBM-Net.

https://doi.org/10.1371/journal.pone.0195226.g010

Conclusions

In this paper, we propose a Degree-based Block Model (DBM) for anti-community structure. In DBM, we take the node degree into consideration and obtain an objective function Q(C) for evaluation. A local expansion optimization algorithm, LEOA, is designed, in which the nodes with high degree are preferentially considered. Based on the formulation of DBM, a synthetic benchmark DBM-Net is developed for evaluating the algorithms in detecting known anti-community structures. The proposed algorithm LEOA is applied to DBM-Net with up to 100000 nodes and to 17 real-world networks, and is compared with its variant LEOA* and five state-of-the-art anti-community detection algorithms. The experimental results demonstrate the effectiveness and efficiency of LEOA for detecting anti-communities in networks and exploring negative relations among objects.

There are still some problems to be solved in our future work. First, we find that the edges inside groups strongly affect the number of structural centers detected by SCD, which leads to low performance when LEOA is applied to networks with edges inside groups. In our future work, we plan to employ prior information by merging some nodes into small groups that are not divided in later operations. This strategy is expected to further improve the effectiveness and efficiency of the algorithm. Second, we find that the number of structural centers detected by SCD is less than the number of anti-communities K in the true partition when K is large. In the future, we will divide a group into two subgroups when the number of edges inside the group exceeds a certain threshold. Third, the preferential consideration of nodes with high degree can improve the effectiveness of LEOA. However, the node order sorted by node degree may not output the best result for every network. In the future, we aim to analyze the node order and select the best node sequence for each network so as to output a better anti-community structure. Finally, DBM-Net is designed based on the assumptions that the group degree and the number of internal edges are the same for each group and that each group pair shares the same number of external edges. More complicated benchmarks with heterogeneous distributions of group degree and edge numbers should be considered in the future.

Acknowledgments

This research was supported in part by the National Science and Technology Major Project of the Ministry of Science and Technology of China under grant 2018ZX10715003-002, the National Key Research and Development Program of China under grant 2017YFC1703900, the Sichuan Science and Technology Program under grant 2018PTDJ0084, and the US National Science Foundation (NSF) under grant 1652107.

References

1. Newman MEJ. The structure and function of complex networks. SIAM Rev. 2003; 45(2): 167–256.
2. Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU. Complex networks: structure and dynamics. Phys Rep. 2006; 424(4–5): 175–308.
3. Iyer S, Killingback T, Sundaram B, Wang Z. Attack robustness and centrality of complex networks. PLoS One. 2013; 8(4): e59613. pmid:23565156
4. Fortunato S. Community detection in graphs. Phys Rep. 2010; 486(3): 75–174.
5. Sankowskaa A, Dariusz S. The small world phenomenon and assortative mixing in Polish corporate board and director networks. Physica A. 2016; 443: 309–315.
6. Wu P, Pan L. Multi-objective community detection based on memetic algorithm. PLoS One. 2015; 10(5): e0126845. pmid:25932646
7. Newman MEJ. The structure of scientific collaboration networks. Proc Natl Acad Sci U S A. 2001; 98(2): 404–409. pmid:11149952
8. Miyauchi A, Kawase Y. Z-score-based modularity for community detection in networks. PLoS One. 2016; 11(1): e0147805. pmid:26808270
9. He J, Li C, Ye B, Zhong W. Efficient and accurate greedy search methods for mining functional modules in protein interaction networks. BMC Bioinformatics. 2012; 13 Suppl 10: S19. pmid:22759424
10. Cunha BR, González-Avella JC, Gonçalves S. Fast fragmentation of networks using module-based attacks. PLoS One. 2015; 10(11): e0142824. pmid:26569610
11. Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech-Theory Exp. 2008; P10008.
12. Newman MEJ. Finding community structure in networks using the eigenvectors of matrices. Phys Rev E. 2006; 74(3): 036104. pmid:17025705
13. Chen L, Yu Q, Chen B. Anti-modularity and anti-community detecting in complex networks. Inf Sci. 2014; 275: 293–313.
14. Zachary WW. An information flow model for conflict and fission in small groups. J Anthropol Res. 1977; 33(4): 452–473.
15. Trevisan L. Max cut and the smallest eigenvalue. SIAM J Sci Comput. 2012; 41(6): 1769–1786.
16. Alon N, Sudakov B. Bipartite subgraph and the smallest eigenvalue. Comb Probab Comput. 2000; 9(1): 1–12.
17. Holme P, Liljeros F, Edling C, Kim B. Network bipartivity. Phys Rev E. 2003; 68(5): 056107. pmid:14682846
18. Wang F. Detecting anti-communities of networks based on spectral method. M.Sc Thesis. Huazhong University of Science and Technology. 2008. Available from: http://cdmd.cnki.com.cn/Article/CDMD-10487-2009227871.htm
19. Ball B, Karrer B, Newman MEJ. An efficient and principled method for detecting communities in networks. Phys Rev E. 2011; 84: 036103. pmid:22060452
20. He D, Liu D, Jin D, Zhang W. A stochastic model for detecting heterogeneous link communities in complex networks. Proceedings of 29th AAAI Conference on Artificial Intelligence. 2015, Jan 25–30; Austin, Texas, USA, pp. 130–136.
21. Latouche P, Birmele E, Ambroise C. Overlapping stochastic block models with application to the French political blogosphere. Ann Appl Stat. 2011; 5(1): 309–336.
22. Karrer B, Newman MEJ. Stochastic blockmodels and community structure in networks. Phys Rev E. 2011; 83(1): 016107. pmid:21405744
23. Newman MEJ, Leicht EA. Mixture models and exploratory analysis in networks. Proc Natl Acad Sci U S A. 2007; 104(23): 9564–9569. pmid:17525150
24. Newman MEJ. Communities, modules and large-scale structure in networks. Nat Phys. 2012; 8(1): 25–31.
25. Ren W, Yan G, Liao X, Xiao L. Simple probabilistic algorithm for detecting community structure. Phys Rev E. 2009; 79(3): 036111. pmid:19392022
26. Shen H, Cheng X, Guo J. Exploring the structural regularities in networks. Phys Rev E. 2011; 84(5): 056111. pmid:22181477
27. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Series B. 1977; 39(1): 1–38.
28. Goemans MX, Williamson DP. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J Assoc Comput Mach. 1995; 42(6): 1115–1145.
29. Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D. Defining and identifying communities in networks. Proc Natl Acad Sci U S A. 2004; 101(9): 2658–2663. pmid:14981240
30. Newman MEJ. Network data from Newman’s homepage. Available from: http://-personal.umich.edu/~mejn/netdata/, Date of access: 13/04/2017.
31. Batagelj V, Mrvar A. Pajek datasets. Available from: http://vlado.fmf.uni-lj.si/pub/networks/data/, Date of access: 13/04/2017.
32. The Koblenz Network Collection. Available from: http://konect.uni-koblenz.de/, Date of access: 13/04/2017.
33. Danon L, Diaz-Guilera A, Duch J, Arenas A. Comparing community structure identification. J Stat Mech-Theory Exp. 2005; P09008.
34. Albert R, Barabasi AL. Statistical mechanics of complex networks. Rev Mod Phys. 2002; 74(1): 47–97.

[ref1] 1. Newman MEJ. The structure and function of complex networks. SIAM Rev. 2003; 45(2): 167–256.
View Article
Google Scholar

[2] View Article

[3] Google Scholar

[ref2] 2. Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU. Complex networks: structure and dynamics. Phys Rep. 2006; 424 (4–5): 175–308.
View Article
Google Scholar

[5] View Article

[6] Google Scholar

[ref3] 3. Iyer S, Killingback T, Sundaram B, Wang Z. Attack robustness and centrality of complex networks. PLoS One. 2013; 8(4): e59613. pmid:23565156
View Article
PubMed/NCBI
Google Scholar

[8] View Article

[9] PubMed/NCBI

[10] Google Scholar

[ref4] 4. Fortunato S. Community detection in graphs. Phys Rep. 2010; 486(3): 75–174.
View Article
Google Scholar

[12] View Article

[13] Google Scholar

[ref5] 5. Sankowskaa A, Dariusz S. The small world phenomenon and assortative mixing in Polish corporate board and director networks. Physica A. 2016; 443: 309–315.
View Article
Google Scholar

[15] View Article

[16] Google Scholar

[ref6] 6. Wu P, Pan L. Multi-objective community detection based on memetic algorithm. PLoS One. 2015; 10(5): e0126845. pmid:25932646
View Article
PubMed/NCBI
Google Scholar

[18] View Article

[19] PubMed/NCBI

[20] Google Scholar

[ref7] 7. Newman MEJ. The structure of scientific collaboration networks. Proc Natl Acad Sci U S A. 2001; 98(2): 404–409. pmid:11149952
View Article
PubMed/NCBI
Google Scholar

[22] View Article

[23] PubMed/NCBI

[24] Google Scholar

[ref8] 8. Miyauchi A, Kawase Y. Z-score-based modularity for community detection in networks. PLoS One. 2016; 11(1): e0147805. pmid:26808270
View Article
PubMed/NCBI
Google Scholar

[26] View Article

[27] PubMed/NCBI

[28] Google Scholar

[ref9] 9. He J, Li C, Ye B, Zhong W. Efficient and accurate greedy search methods for mining functional modules in protein interaction networks. BMC Bioinformatics. 2012; 13 Suppl 10: S19. pmid:22759424
View Article
PubMed/NCBI
Google Scholar

[30] View Article

[31] PubMed/NCBI

[32] Google Scholar

[ref10] 10. Cunha BR, González-Avella JC, Gonçalves S. Fast fragmentation of networks using module-based attacks. PLoS One. 2015; 10(11): e0142824. pmid:26569610
View Article
PubMed/NCBI
Google Scholar

[34] View Article

[35] PubMed/NCBI

[36] Google Scholar

[ref11] 11. Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech-Theory Exp. 2008; P10008.
View Article
Google Scholar

[38] View Article

[39] Google Scholar

[ref12] 12. Newman MEJ. Finding community structure in networks using the eigenvectors of matrices. Phys Rev E. 2006; 74(3): 036104. pmid:17025705
View Article
PubMed/NCBI
Google Scholar

[41] View Article

[42] PubMed/NCBI

[43] Google Scholar

[ref13] 13. Chen L, Yu Q, Chen B. Anti-modularity and anti-community detecting in complex networks. Inf Sci. 2014; 275: 293–313.
View Article
Google Scholar

[45] View Article

[46] Google Scholar

[ref14] 14. Zachary WW. An information flow model for conflict and fission in small groups. J Anthropol Res. 1977; 33(4): 452–473.
View Article
Google Scholar

[48] View Article

[49] Google Scholar

[ref15] 15. Trevisan L. Max cut and the smallest eigenvalue. SIAM J Sci Comput. 2012; 41(6): 1769–1786.
View Article
Google Scholar

[51] View Article

[52] Google Scholar

[ref16] 16. Alon N, Sudakov B. Bipartite subgraph and the smallest eigenvalue. Comb Probab Comput. 2000; 9(1): 1–12.
View Article
Google Scholar

[54] View Article

[55] Google Scholar

[ref17] 17. Holme P, Liljeros F, Edling C, Kim B. Network bipartivity. Phys Rev E. 2003; 68(5): 056107. pmid:14682846
View Article
PubMed/NCBI
Google Scholar

[57] View Article

[58] PubMed/NCBI

[59] Google Scholar

[ref18] 18. Wang F. Detecting anti-communities of networks based on spectral method. M.Sc Thesis. Huazhong University of Science and Technology. 2008. Available from: http://cdmd.cnki.com.cn/Article/CDMD-10487-2009227871.htm

[ref19] 19. Ball B, Karrer B, Newman MEJ. An efficient and principled method for detecting communities in networks. Phys Rev E. 2011; 84: 036103. pmid:22060452
View Article
PubMed/NCBI
Google Scholar

[62] View Article

[63] PubMed/NCBI

[64] Google Scholar

[ref20] 20. He D, Liu D, Jin D, Zhang W. A stochastic model for detecting heterogeneous link communities in complex networks. Proceedings of 29th AAAI Conference on Artificial Intelligence. 2015, Jan 25–30; Austin, Texas, USA, pp. 130–136.

[ref21] 21. Latouche P, Birmele E, Ambroise C. Overlapping stochastic block models with application to the French political blogosphere. Ann Appl Stat. 2011; 5(1): 309–336.

[ref22] 22. Karrer B, Newman MEJ. Stochastic blockmodels and community structure in networks. Phys Rev E. 2011; 83(1): 016107. pmid:21405744

[ref23] 23. Newman MEJ, Leicht EA. Mixture models and exploratory analysis in networks. Proc Natl Acad Sci U S A. 2007; 104(23): 9564–9569. pmid:17525150

[ref24] 24. Newman MEJ. Communities, modules and large-scale structure in networks. Nat Phys. 2012; 8(1): 25–31.

[ref25] 25. Ren W, Yan G, Liao X, Xiao L. Simple probabilistic algorithm for detecting community structure. Phys Rev E. 2009; 79(3): 036111. pmid:19392022

[ref26] 26. Shen H, Cheng X, Guo J. Exploring the structural regularities in networks. Phys Rev E. 2011; 84(5): 056111. pmid:22181477

[ref27] 27. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Series B. 1977; 39(1): 1–38.

[ref28] 28. Goemans MX, Williamson DP. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J Assoc Comput Mach. 1995; 42(6): 1115–1145.

[ref29] 29. Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D. Defining and identifying communities in networks. Proc Natl Acad Sci U S A. 2004; 101(9): 2658–2663. pmid:14981240

[ref30] 30. Newman MEJ. Network data from Newman’s homepage. Available from: http://www-personal.umich.edu/~mejn/netdata/, Date of access: 13/04/2017.

[ref31] 31. Batagelj V, Mrvar A. Pajek datasets. Available from: http://vlado.fmf.uni-lj.si/pub/networks/data/, Date of access: 13/04/2017.

[ref32] 32. The Koblenz Network Collection. Available from: http://konect.uni-koblenz.de/, Date of access: 13/04/2017.

[ref33] 33. Danon L, Diaz-Guilera A, Duch J, Arenas A. Comparing community structure identification. J Stat Mech Theory Exp. 2005; P09008.

[ref34] 34. Albert R, Barabasi AL. Statistical mechanics of complex networks. Rev Mod Phys. 2002; 74(1): 47–97.
