Detecting multiple communities using quantum annealing on the D-Wave system

A very important problem in combinatorial optimization is the partitioning of a network into communities of densely connected nodes; where the connectivity between nodes inside a particular community is large compared to the connectivity between nodes belonging to different ones. This problem is known as community detection, and has become very important in various fields of science including chemistry, biology and social sciences. The problem of community detection is a twofold problem that consists of determining the number of communities and, at the same time, finding those communities. This drastically increases the solution space for heuristics to work on, compared to traditional graph partitioning problems. In many of the scientific domains in which graphs are used, there is the need to have the ability to partition a graph into communities with the “highest quality” possible since the presence of even small isolated communities can become crucial to explain a particular phenomenon. We have explored community detection using the power of quantum annealers, and in particular the D-Wave 2X and 2000Q machines. It turns out that the problem of detecting at most two communities naturally fits into the architecture of a quantum annealer with almost no need of reformulation. This paper addresses a systematic study of detecting two or more communities in a network using a quantum annealer.


Introduction
The use of networks spans across many scientific domains. To showcase how broad this spectrum is, we highlight the following examples: molecules are chemical networks with atoms connected to each other [1][2][3]; living cells follow a communication pattern described by a network [4]; and human interactions create different social networks [5]. The study of networks has resurged in recent times due to the availability and capability of producing and storing data from a large number of applications. Advances in crystallography, for example, allows chemists to get atomistic representations of complex proteins; social media speeds up human communication; etc. [6].
In all of the previous mentioned scientific domains in which graphs are used, there is the imminent need to have the ability to partition a graph into communities with the "highest  [12]. This implies an extensive search space for feasible solutions as compared to the regular balanced GP problem. It also implies the need to combine heuristics and recursive optimization methods to search for the optimal community structure. Several algorithms have been proposed based on greedy techniques [13,14], simulated annealing [15,16], genetic algorithm [17], spectral optimization [18] and extreme optimization methods [19,20]. Recent work in [21] proposes multi-community detection using quantum hardware for signed graphs. The notion of "more connected" within a community is arbitrary and the existence of a unique definition is still under debate, however, the metric proposed by Girvan and Newman based on the modularity has been well accepted in the field [10,22,23]. Other metrics and methods include k-means [24], spectral partitioning [25], hierarchical grouping [26], modularity density [22], min-max cut [27], and normalized cut [25] among others. Moreover, Neukart, et al., implemented a quantum-assisted clustering method on the D-Wave system [28]. The modularity metric is used to quantify the quality of a community structure by comparing the connectivity of edges within communities with the connectivity of an equivalent network where edges would be placed randomly.
Maximizing the modularity with respect to the community partitions is a problem that has been largely studied and different methods have been proposed. Despite all these efforts, modularity maximization still remains a field that needs further exploration. Two main problems have been identified. The first problem is the existence of a resolution limit, in which a very small community could be missed if the resulting community structure also contains large communities [29]. The second problem is the existence of several minima with the same modularity which prevents or makes it difficult to obtain the global minima [30]. These two problems are related to what we have previously loosely defined as highly resolved community structure.
A method for further refining community structures has been recently introduced [31]. The authors proposed an exact algorithm for bi-partitioning a network using split and merge movements over communities of an existing partition. Using this approach the authors were able to further improve the modularity values and, as a consequence, the quality of the community structure. We will use some of these highly refined results as a point of comparison with the ones obtained from the method we are introducing.
Quantum computers of the annealer type, such as the D-Wave 2X and 2000Q, minimize the Ising objective function as follows making use of the quantum entanglement effect [32]. The objective function can be written as follows: where s i 2 {−1, +1} are magnetic spin variables where the result is encoded; h i and J ij are local magnetic fields and coupling strengths that encode the problem Hamiltonian. At the hardware level, the D-Wave quantum computer is composed of qubits with sparse connectivity as a fixed sparse graph, known as a Chimera graph. During the annealing process each qubit can be in a "superposition" state (both a "-1" and a "+1" simultaneously). The superposition lasts until an outside event causes it to collapse into either a "-1" or a "+1" state. The result of the annealing process is a low-energy ground state s, consisting of an Ising spin for each qubit value 2 {−1, +1}. This makes quantum computers useful to tackle NP-hard complex problems including optimization, machine learning and sampling problems. Maximization problems can also be solved by the D-Wave by using the negative of Eq 1 as the objective function. The formulation where variables take values of either 0 or 1 is called the quadratic unconstrained binary optimization or QUBO formulation and it is an alternative representation that can easily be translated to or from the Ising model. An Ising model can become a QUBO through the transformation, s = 2x − 1. Current D-Wave platforms have physical constraints such as limited precision, sparse connectivity, and number of available qubits. Embedding is required to map a problem onto the hardware Chimera graph prior to annealing. Purely quantum approaches are limited by the number of graph nodes/variables that can be represented on the hardware, 46 for the D-Wave 2X and 64 for the D-Wave 2000Q. Due to the fact of limited variables, larger problems require hybrid quantum-classical approaches.
In this paper we describe the method for performing community detection based on the modularity metric using the D-Wave quantum annealer. We carefully derive the formulation of the problem as a QUBO. Results are compared with existing benchmarks using "state of the art" tools.

Formulation
Let G = (V, E) be a weighted graph with nodes i in V and edges ij in E such that the corresponding adjacency matrix A is defined as follows: with w ij being the weight of edge ij. We can then construct a modularity matrix B as the difference between A and a matrix constructed as an outer product of the vector degree g. Here, the node degree g i is defined as g i = ∑ j A ij . Newman's expression for the modularity matrix B can be written as follows: where, 2m = ∑ i g i . Or equivalently: For partitioning the graph into at most two communities, if s i 2 {−1, 1}, is a binary variable indicating which community node i belongs to, then the modularity Q for any given partition is given by Community detection requires maximizing the modularity Q. The rows and columns of the matrix B sum to zero, thus, we have In addition, the term 2 can be viewed as a product of binary variables x i 2 {0, 1}. We can therefore write Eq (5) in matrix form as where s is a column vector with entries s i . A transformation from a QUBO to Ising formulation can be given by Eq (6) essentially shows that both QUBO and Ising formulations are equivalent for modularity maximization. Thus, the maximum modularity for at most two communities is given by which are clearly unconstrained quadratic optimization problems, suitable to be solved by quantum annealers. Now, if we are interested in partitioning the graph into at most k communities, the modularity for any given partition is given by where 1 � c i � k is the community node i belongs to and the function δ(c i , c j ) = 1, if c i = c j or 0 otherwise; with i 6 ¼ j for 1 � i and j � |V|. Eq 8 poses the problem that the function δ is not necessarily a quadratic binary variable function. In our previous work, we demonstrated dividing a graph into two communities [8]. Additionally, we introduced the concept of a logical super-node which allows the partition of graphs into more than two parts in an "all at once" k-concurrent fashion (see Fig 3). Each logical super-node represents a graph node. A super-edge represents a connection between graph nodes. The logical super-node utilizes a one-hot encoding to represent the selection of 1 out of k communities [33]. In this work, we will use the concept of the logical super-node and a proper formulation leading to a QUBO problem to partition the graph into more than two communities in a k-concurrent fashion.
In this section, we will generalize the modularity matrix formulation for an arbitrary number of communities. For this, let G = (N, E) be a graph for which the final result of the community split is a set composed of communities C i with the following property: For a given community, C, the value computes the modularity of community C within the QUBO formulation.
Suppose that now we want to partition G into at most k communities. Let x i,j be the decision variables such that ( Strictly speaking, the set {x i,1 , . . . x i,k } is a logical super-node as depicted in Fig 3. Since each node must be in exactly one community, the following constraint needs to be fulfilled. Let . . ; be the vector state, then: with x i,j 2 {0, 1}, for i = 1, . . ., n and j = 1, . . ., k. γ i are the corresponding relaxation coefficients.
The derivation that follows has already been shown in reference [8]. We will give a summary for completeness here. We start by considering the following equality: Let now Z i be the N � N zero matrix (with N ¼ k � n) whose jth diagonal element is 1 if and only if j � i (mod n). For example, in Z 1 every 1 st , (n + 1) th , (2n + 1) th , . . ., ((k − 1)n + 1) th diagonal element is 1 and has zero everywhere else. Then where X is the matrix (x 1 , . . ., x j , . . ., x k ) formed by the x j column vectors. In consequence, Let D γ be a diagonal matrix such that . . . ; g n Þ and B Γ be a block matrix with k × k blocks, where each block is equal to D γ , then with x i,j 2 {0, 1}, being the elements of matrix X with i = 1, . . ., n, j = 1, . . ., k. B is a block diagonal matrix with the modularity matrix B on the diagonal, and β just a tunable parameter to control the weight of the term.

Results and discussion
We used the python NetworkX tools [34] for pre-and post-processing of the example graphs. The size of these graphs in most cases is larger than the number of nodes or variables that can be embedded on the D-Wave 2X and 2000Q. Therefore, we have used the hybrid quantumclassical tool, qbsolv, developed by D-Wave [35]. The qbsolv software takes the full problem in QUBO format as input and makes multiple calls to the D-Wave to solve subQUBOs for global minimization, followed by tabu search for local minimization. It can be called directly through the D-Wave Ocean application programming interface (API) or from the command line. Resulting bitstrings of zeros and ones are translated based on the optimization problem's representation. The quality of a community structure is determined by evaluating the modularity metric. This varies from 0 to 1, with a larger value being preferable. We have used the Zachary (karate club) graph to compare the results of the modularity metric obtained with other methods. The aforementioned is a social network of friendships between 34 members of a karate club at a US university in the 1970s. With only 34 nodes and 78 edges, this graph is considered an archetypal social network that has been extensively used to benchmark graph algorithms [36]. From Table 1 we observe that modularity is similar across all methods, however the quantum  [14]. A representation of the community structure obtained by this method can be seen in Fig 4. In Fig 5 we can see how the numbers of communities and the modularity reach a maximum when the number of nodes (or variables) per logical super-node increases. Modularity and number of communities reach values of 0.42 and 4 respectively. Instead of increasing the number of nodes per super-node iteratively, one could directly use a large number such as k = n and get the same result. Currently, the D-Wave machines are limited in the problem size as number of variables that can be embedded and run on the Chimera graph architecture. The 2X is limited to 46 variables, while the 2000Q can run problems up to 64. The sparse connectivity of the chimera graph requires that some variables will be represented by chains of qubits [38]. This can quickly use up the available qubits on the 2X (up to 1152) or the 2000Q (up to 2048). The penalty constant γ that is used to constrain each graph node to be in only one community can vary depending on the graph. Tuning is required for each new network problem.
We have computed the community structure for the following set of benchmark social networks: a coappearance network of characters in the novel Les Miserables (LesMiserables) [39]; an undirected social network of frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand (Dolphins) [40]; a book co-purchasing network where nodes represent books about US politics and edges represent frequent co-purchasing of books (PoliticalBooks) [23]; a collaboration network of Jazz musicians (Jazz) [41]; and a metabolic network of the nematode C. elegans (Elegans) [20]. Results are shown in Table 2.
Results for the Zachary and Dolphins benchmarks match the highly refined values reported in [31] which, to our knowledge are the "best" in the literature ever obtained. The quantum annealer performs slightly better for the Dolphins network, when compared with non-refined results in the literature [42]. Our results on LesMiserables and PoliticalBooks benchmarks are similar in comparison to the refined values of 0.56001 and 0.52724 respectively [42]. On the other hand, results for the Jazz and Elegans benchmarks are similar in comparison, but lower than the values reported in the literature [20].
As a mode of comparison, we have also run the same calculation using a Classical annealer (CA) to solve the QUBO problem as implemented in the D-Wave's ocean API [43] (See Table 2). To our surprise, the CA performs as well as the QA for the case of the Zachary graph (smallest case in the set). This means that there must be a benefit of formulating the modularity method as a QUBO problem that goes beyond the Quantum nature of the D-Wave annealer. When the size of the graph increases, the performance of the CA gets compromised. For the largest graph this QUBO approach solved with the CA shows a notably poorer performance. The parameters for the classical annealing were chosen as the default with the exception of num_reads which was set to 1000 to increase the quality of the results. We have also explored different inverse temperature programs by changing the variable beta_range and saw no improvement on the results. One of the benefits of using quantum annealing is that there is no need to proceed recursively. The total default time for annealing is 20 μs for the case of the D-Wave. brown However, one can make a reasonable analogy between the  Table 2. The results of k-concurrent community detection on benchmark graphs. N, E, N com , and Mod. are the number of nodes, number of edges, number of communities and modularity respectively. We show results from both the D-Wave quantum annealer (QA) and a Classical annealer (CA). Detecting multiple communities using quantum annealing annealing time in QA and the number of iterations in CA. Problems having many local minima (that could come from large sized system) would require longer annealing times for QA [44]. For the case of the quantum annealer we used the default number of reads of 50 and no spin reversal transform was applied. The annealing time was 20 μs which is the default value for D-Wave 2000Q. For every case, the value of γ was set such that the penalty constant of Eq 2 is not violated and the relative importance of the values of B are not undermined. The values of γ used for all the graphs are as follows: -5 Zachary and dolphins, -6 for LesMiserables, -10 for PoliticalBooks, -525 for Jazz, and -33 for Elegans. Note that these values will depend upon the case analyzed and we do not have a direct way to determine them based on matrix B. For the case of CA the number was manually changed to get the best possible results without violating the one hot encoding constraint (nodes cannot belong to multiple communities). For the case of CA the values of γ that we used are: -5 for Zachary and Dolphins, -7 for LesMiserables, -10 for PoliticalBooks, -20 for Jazz, and -20 for Elegans. The score performed by the quantum annealer reflects the size of the machine architecture. A small graph results in a clean embedding on the Chimera graph, with small chains to compensate for missing connections. As we said before, when graphs are larger we start having longer chains which can hinder the annealing performance [38]. For even larger graphs, hybrid quantum-classical approaches such as qbsolv are used for orchestrating the use of the D-Wave solver on subproblems. Lower performance using qbsolv may come from the mixing of classical and quantum processing. The quantum state initially created is no longer coming from the full Hamiltonian, but several pieces of it and a full pseudo solution is reconstructed. We strongly believe that having more qubits available and increased connectivity will improve the performance of this technique.

Graph
A particular property of the modularity matrix is that it can be thresholded with almost no effect on the quality of the community structure. This statement has a significant impact since it is saying that we can save qubits and reduce the size of the problem to be embedded. Here we have tested this hypothesis by using the Zachary graph and thresholding the modularity matrix weights before solving the problem for community detection. From Table 3 we can see that we can remove up to 40% of the edges (by thresholding) and still get the same community structure with the same modularity value.
Note that, with this technique, the modularity remains unchanged up to a large threshold value because the community structure does not change and we are still evaluating the modularity with the original (non-thresholded) matrix. In other words, the annealing is performed Detecting multiple communities using quantum annealing with the QUBO generated from the thresholded modularity matrix whereas the evaluation of the modularity, instead, is performed with the original modularity matrix. It is difficult to disentangle the combined effect of qbsolv and the quantum annealing. However, if we look at the thresholding of the modularity matrix, we can see that the number of edges can be fairly reduced down to 300 without changing the result of the modularity. This means that, at least, 300×4×4 = 4800 chimera edges are required. The D-Wave 2000Q can embed a fully connected 64 node graph requiring at least 4096 chimera edges. This means that the 300 edges thresholded version of the Zachary graph is comparable to the maximum fully connected graph that can be embedded in the D-Wave 2000Q machine.

Conclusion
In this paper we have demonstrated the ability of a quantum computer and in particular a quantum annealer to solve community detection as an optimization problem. From the analysis of the results we observe that the quantum computer can render a highly optimized community structure. In the case of the Zachary graph, we reach the actual record value. For other other benchmark graphs, including the larger graphs such as Jazz and Elegans, the quality of the community structure is comparable to "state of the art" results.
One of the most notable observations is that by using this quantum annealing technique with the k-concurrent method, we obtain the community structure "all at once" within the annealing time. There is no need to implement an iterative process as is the case for heuristic methods running on classical computers. In principle, the quantum annealer is designed to explore the full search space during the annealing time. Limitations such as a reduced number of available qubits and sparse connectivity can be detrimental to the quality of the results. brown Improvements of the results presented in this work remain tied to the evolution of quantum annealers towards better scaling, and better exploration of the solution space.