Detection of an arbitrary number of communities in a block spin Ising model

Miguel Ballesteros; Ramsès H. Mena; Josè Luis Pèrez; Gabor Toth

doi:10.1371/journal.pone.0339060

Abstract

We study the problem of community detection in a general version of the block spin Ising model featuring M groups, a model inspired by the Curie-Weiss model of ferromagnetism in statistical mechanics. We solve the general problem of identifying any number of groups with any possible coupling constants. Up to now, the problem was only solved for the specific situation with two groups of identical size and identical interactions, see [1, 2]. Our results can be applied to the most realistic situations, in which there are many groups of different sizes and different interactions. In addition, we give an explicit algorithm that permits the reconstruction of the structure of the model from a sample of observations based on the comparison of empirical correlations of the spin variables, thus unveiling easy applications of the model to real-world voting data and communities in biology.

Citation: Ballesteros M, Mena RH, Pèrez JL, Toth G (2026) Detection of an arbitrary number of communities in a block spin Ising model. PLoS One 21(3): e0339060. https://doi.org/10.1371/journal.pone.0339060

Editor: Pablo Martin Rodriguez, Federal University of Pernambuco: Universidade Federal de Pernambuco, BRAZIL

Received: February 22, 2025; Accepted: December 1, 2025; Published: March 17, 2026

Copyright: © 2026 Ballesteros et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: There is no experimental data related to the article. The simulated data in the supplementary material, together with the Python scripts used to generate the data, are available at the public repository https://github.com/gabor-toth-ac/Group-Reconstruction-Example.

Funding: This study was financially supported by Sehciti (formerly Conahcyt) in the form of a grant (FORDECYT-PRONACES 429825/2020, recently renamed Project CF-2019/429825) received by MB. This study was also financially supported by Sehciti (formerly Conahcyt) in the form of a grant (PAPIIT-DGAPA-UNAM IN114925) received by MB. This study was also financially supported by Sehciti (formerly Conahcyt) in the form of a Beca postdoctoral grant (1203857) awarded to GT. No additional external funding was received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

1 Introduction and results

In the influential article [1], Berthet, Rigollet, and Srivastava introduced the block spin Ising model consisting of two groups, previously defined in the statistical mechanics literature in [3], to the field of community detection. The authors showed that, for two groups of identical size and identical coupling constants within each group, exact recovery of the community structure is possible with high probability using a sample of observations assumed to be generated by the model. The number of observations required depends on the regime (or phase) of the model. Of the three regimes of the model, the high and low temperature regimes were analysed in [1]. Subsequently, Löwe and Schubert supplied the missing critical regime in [2] and demonstrated that exact recovery is possible there, too. However, the problem of community detection for more than two groups and/or different coupling constants remained open. We solve the problem for a wide variety of situations as far as the number of groups in the population, the sizes of the groups, and interactions between voters within each group and between different groups are concerned, and show that reconstructing the group structure is possible given a certain number of observations. The procedure we present is completely constructive and easy to implement. Thus, our results in the present article allow for the application of community detection algorithms to real-world data. Possible applications are the identification of social classes, cultural groups, sympathisers of political parties, or other characteristics which may not be readily available in data. Using the community detection algorithm will potentially allow social scientists to better understand societal structures derived from common preferences or interdependencies. Biologists can employ the algorithms for the problem of distinguishing species based on the observation of characteristics of different specimens.
An important point that should be noted is that the techniques presented in this article can in principle also be applied to any other voting model with several communities, for which the pair correlations between votes can be deduced.

The literature on the detection of communities in stochastic models is extensive. Among the models analysed, graphical models (i.e. Markov random fields) are most prominent. See [4] for a description of how Markov random fields encode dependencies between random variables. Among graphical models, the most studied are the stochastic blockmodel and the block spin Ising model. The stochastic blockmodel (introduced in [5]) has been analysed thoroughly. See [6] for an up-to-date exposition and [7–10] for further important results about this model. In the stochastic blockmodel, we observe a single realisation of the random graph and assume that interactions between individuals are independent. This is very different from the block spin Ising model, which is the subject of the present article.

Beyond the mathematical interest inherent to these models and the problem of community detection, there are also important applications in the field of image recognition [11,12], biology and genetics [13–15], recommendation systems [16], and natural language processing [17], among others. These applications are especially important given the ubiquitous use of large amounts of data. Other applications include the use of these models applied to problems in social sciences, such as sociological behaviour [18], migration [3], economic decision making [19–21], and political science [22]. Community detection algorithms applied to these problems allow researchers to understand the socioeconomic and political structure of societies and dependencies between different groups and classes, as well as the structures underlying species diversity and interdependence.

Compared to the stochastic blockmodel, there has been considerably less work on the community detection problem in block spin Ising models, also referred to as multi-species mean-field models [23] or multi-group Curie-Weiss models [24–26]. In these models, there is a heterogeneous population of individuals subdivided into groups and each individual casts a binary vote in an election. We can also interpret the binary choice as a biological characteristic of the individual which can be observed and measured. The votes or measurements are random variables which are allowed to depend on each other (with different degrees of dependence within each group and across group boundaries). These dependencies are regulated by the coupling constants between votes. We will formally present the model in Sect 1.1. An observer has access to the votes of all individuals in a number of elections, referenda, or measurements. The population is assumed static, and the observer sees the patterns in the voting behaviour. These patterns manifest in the way certain subsets of voters frequently tend to vote alike or contrary to each other. The observer’s problem consists of reconstructing the group structure of the population from the observed votes or measurements alone. The method consists of calculating the empirical pair correlations between individual votes and grouping together those voters who present the highest correlations in the sample, i.e. those who tend to agree with the highest frequency are assumed to belong to the same community. See Sects 2.2 and 2.3 for a complete description of the algorithm involved in the reconstruction procedure.

Rather than taking a single realisation of a random graph as in the stochastic blockmodel, in the block spin Ising model we observe a sample of n realisations of the model, i.e. n voting configurations of the shape (x_1, …, x_N) ∈ {−1, 1}^N which are realisations of the random vector (X_1, …, X_N), assumed to be independent and distributed according to the probability measure of the block spin Ising model defined in Definition 1.5. Each realisation of (X_1, …, X_N) contains the votes or measurements of the entire population in a single election, referendum, or observation. The problem we study consists of reconstructing the structure of the model in terms of assigning each of the random variables X_i, i ∈ ℕ_N (we will write ℕ_m := {1, …, m} for any m ∈ ℕ), that represents the vote of one individual to one of the M groups. Having access to a sufficient number of observations permits us to recover the structure with high probability for the possible structures of practical importance the model can assume.

1.1 The block spin Ising model

The block spin Ising model is defined for {−1, 1}-valued random variables X_i indexed by i ∈ {1, …, N}, commonly referred to as spins. Since the application to voting and the observation of biological characteristics of a heterogeneous population is the main motivation of the study in this article, we will instead speak of votes and voting configurations. The voters are sorted into M groups. Each group l ∈ {1, …, M} is of size N_l, such that N_1 + ⋯ + N_M = N. The group identification function

ι: {1, …, N} → {1, …, M}    (1.1)

assigns each voter i to their respective group ι(i). Hence, the definition of the group sizes above implies |ι⁻¹({l})| = N_l for each group l, where for any countable set A the cardinality of A is denoted by |A|.

Definition 1.1. We define the group size parameters for each group l

Remark 1.2. We will assume these constants , , exist and are positive. This implies in particular, that we can assume each group consists of at least two different individuals for large enough N, a fact we will use in the proofs of our results. While the population goes to infinity, the number of groups remains constant at M.

For any voting configuration , the Hamiltonian is given by

(1.2)

This Hamiltonian allows for different interactions between pairs of votes, such that if one belongs to group l and the other to group m, they interact by a coupling constant . These coupling constants subsume the inverse temperature parameter β found in the single-group Curie-Weiss model (see e.g. [27, Chapters IV and V] for a thorough discussion of the classical Curie-Weiss model). In fact, in the special case of M = 1, the definition of reduces to the Hamiltonian of the Curie-Weiss model. We note that, depending on the signs of the coupling parameters J_l,m, the value of varies with the voting configuration. In the present context of voting, we interpret as a measure of conflict in society surrounding a particular issue which gives rise to votes . If all coupling parameters are positive, there are two configurations that have the lowest possible level of conflict: the unanimous configurations and . All other configurations receive higher values . The highest levels are achieved when the votes are evenly split in each group (or closest to it in case of odd group sizes). We define the coupling matrix J to have entries equal to the constants above:

(1.3)

Definition 1.3. Let A > 0 stand for the statement that A is a symmetric positive definite matrix, i.e. the Euclidean inner product ⟨x, Ax⟩ is positive for all x ≠ 0, and let A ≥ 0 mean A is positive semi-definite, i.e. ⟨x, Ax⟩ ≥ 0 holds for all x.

Remark 1.4. We will assume J > 0 throughout this article. This assumption can be interpreted as the existence of stronger coupling between voters belonging to the same group versus voter coupling between groups. The assumption also allows for negative coupling between different groups, which models an antagonistic relationship between groups.

Definition 1.5. Let J be a coupling matrix with J > 0 and let the Hamiltonian be as defined in (1.2). The block spin Ising model’s probability measure, which gives the probability of each of the 2^N voting configurations, is defined by

(1.4)

for all , where Z is a normalisation constant which depends on N and J.
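For very small populations, the measure of Definition 1.5 can be tabulated by brute force, which is a convenient sanity check. The following Python sketch assumes that the Hamiltonian has the normalisation −(1/2) Σ_{l,m} J_{l,m} S_l S_m / √(N_l N_m), where S_l is the sum of the votes in group l, and that the probability of a configuration is proportional to exp(−H(x)). Both are assumptions made for illustration, and the function names `hamiltonian` and `gibbs_distribution` are hypothetical.

```python
import itertools
import numpy as np

def hamiltonian(x, groups, J):
    """Assumed form: H(x) = -1/2 * sum_{l,m} J[l,m] * S_l * S_m / sqrt(N_l * N_m)."""
    x = np.asarray(x, dtype=float)
    groups = np.asarray(groups)
    J = np.asarray(J, dtype=float)
    M = J.shape[0]
    # Group sizes N_l and group vote sums S_l
    sizes = np.array([(groups == l).sum() for l in range(M)], dtype=float)
    sums = np.array([x[groups == l].sum() for l in range(M)])
    scaled = sums / np.sqrt(sizes)
    return -0.5 * scaled @ J @ scaled

def gibbs_distribution(groups, J):
    """Enumerate all 2^N configurations and return them with their probabilities."""
    N = len(groups)
    configs = np.array(list(itertools.product([-1, 1], repeat=N)))
    energies = np.array([hamiltonian(x, groups, J) for x in configs])
    weights = np.exp(-energies)           # unnormalised Gibbs weights
    return configs, weights / weights.sum()  # division by Z

# Hypothetical example: 5 voters, groups of sizes 2 and 3
groups = [0, 0, 1, 1, 1]
J = np.array([[0.5, 0.1],
              [0.1, 0.4]])
configs, probs = gibbs_distribution(groups, J)
```

With positive couplings, the two unanimous configurations receive the largest probability in this sketch, in line with the interpretation of the Hamiltonian as a measure of conflict.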

1.2 Results

The model has three distinct regimes, in each of which the model behaves in a distinct fashion. This is reflected in the limiting distribution of the vector of suitably normalised sums of votes (S_1, …, S_M), where S_l is the sum of the votes in group l. This limiting distribution is distinct in each of the three regimes. These results can be found in several articles, e.g. [23,26,28]. The three regimes are called (in adaptation of the corresponding terms for the classical single-group Curie-Weiss model) the high temperature, the critical, and the low temperature regime. The high temperature regime is characterised by the difference between the identity matrix and the coupling matrix J being a positive definite matrix, i.e. I − J > 0. The model is in the critical regime when I − J is positive semi-definite but not positive definite. Finally, the low temperature regime is equivalent to I − J not being positive semi-definite. High temperature corresponds to high disorder, meaning the voters tend to have a mind of their own with weak dependence of the votes. Low temperature corresponds to strong couplings between votes. Thus the regime depends on the social cohesion present in a society. Culturally homogeneous societies would likely fall into the low temperature regime, whereas confederations with loosely related heterogeneous groups would be expected to fall into the high temperature regime.
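In practice, the regime of a given coupling matrix can be read off from the eigenvalues of I − J. The sketch below assumes the usual convention that the critical regime means I − J is positive semi-definite but singular, and that the low temperature regime means I − J has a negative eigenvalue. The function name `classify_regime` and the numerical tolerance `tol` are hypothetical choices for illustration.

```python
import numpy as np

def classify_regime(J, tol=1e-10):
    """Classify a symmetric positive definite coupling matrix J by the spectrum of I - J."""
    J = np.asarray(J, dtype=float)
    if not np.allclose(J, J.T):
        raise ValueError("J must be symmetric")
    if np.linalg.eigvalsh(J).min() <= tol:
        raise ValueError("J must be positive definite")
    # Smallest eigenvalue of I - J decides the regime
    smallest = np.linalg.eigvalsh(np.eye(J.shape[0]) - J).min()
    if smallest > tol:
        return "high temperature"
    if smallest >= -tol:
        return "critical"
    return "low temperature"
```

Because the critical regime corresponds to an exactly vanishing eigenvalue, any numerical classification of it depends on the chosen tolerance.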

We will study the problem of detecting the M communities in the model in the high temperature and the low temperature regimes. In the latter case, we will make an assumption (see Definition 1.8) about the Hessian matrices at the minima of the function defined by

(1.5)

which plays a crucial role in the analysis of the block spin Ising model. It appears in the de Finetti representation of the probability measure (see [26, Theorem 32] for more details), which expresses the probabilities (1.4) as an integral. The function F plays the role of exponentially weighting the points of the integration domain. A smaller value of F yields a higher weight in the integral. Thus, the minima of F are of particular importance, and their location depends on the regime the model is in.
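To make the role of the minima concrete, the following Python sketch evaluates one candidate form of such a function and its Hessian. The formula F(y) = ½⟨y, J⁻¹y⟩ − Σ_l α_l ln cosh(y_l/√α_l), with α_l denoting the limiting proportion of group l, is an assumption made here for illustration (it is chosen so that the Hessian at the origin equals J⁻¹ − I, the matrix H used in Theorem 1.7); it should not be read as the article's definition (1.5). The names `F`, `hessian_F`, and `alpha` are hypothetical.

```python
import numpy as np

def F(y, J_inv, alpha):
    """Assumed form: quadratic term in J^{-1} minus weighted log-cosh terms."""
    y = np.asarray(y, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    return 0.5 * y @ J_inv @ y - np.sum(alpha * np.log(np.cosh(y / np.sqrt(alpha))))

def hessian_F(y, J_inv, alpha):
    """Analytic Hessian of the assumed F: J^{-1} - diag(sech^2(y_l / sqrt(alpha_l)))."""
    y = np.asarray(y, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    t = np.tanh(y / np.sqrt(alpha))
    return J_inv - np.diag(1.0 - t ** 2)

# Hypothetical high temperature example with two groups
J = np.array([[0.5, 0.1],
              [0.1, 0.4]])
alpha = np.array([0.3, 0.7])
J_inv = np.linalg.inv(J)
H0 = hessian_F(np.zeros(2), J_inv, alpha)  # Hessian at the origin
```

Under this assumed form and in the high temperature regime, the origin is the unique minimum, since ln cosh(u) ≤ u²/2 gives F(y) ≥ ½⟨y, (J⁻¹ − I)y⟩.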

Definition 1.6. We will use the symbol C for positive constants which are independent of N but may depend on the coupling matrix J and the group size parameters . We make no claim as to the precise value of these constants, and in fact they may change from one line of a calculation to the next.

Theorem 1.7. In the high temperature regime, i.e. for I–J > 0, set H: = J⁻¹−I. Then there is a positive constant such that for all

holds, where we use the notation

This theorem is proved in Sect 3.

As in the high temperature regime, in the low temperature regime, i.e. , we address the non-critical case. Our definition of the low temperature non-critical case is similar to the corresponding definition for high temperature.

Definition 1.8 (Low temperature non-critical case). In the case that , we say that the model is in the low temperature regime. Moreover, if the Hessian of F at every point where the minimum is attained is positive definite, we say that the model is non-critical.

Lemma 1.9. In the low temperature non-critical case, the number of points where the minimum of F is attained is finite. We denote them by

(1.6)

and the corresponding Hessian of F at the point z is denoted by H_k and is invertible for every .

Proof: Since F is dominated by the term for large , it follows that all points where the minimum is attained must be contained in a compact set B (we recall that J⁻¹ is positive definite). Suppose that z is a point where the minimum is attained. The Hessian of F is positive definite at z by assumption. This implies that there is a neighbourhood of z where F attains its minimum only at z. We obtain that the set of points where the minimum is reached consists of isolated points, and it is a closed set because F is continuous. This set cannot be infinite: otherwise, since B is compact, it would have an accumulation point, which would belong to the set because the set is closed, and this accumulation point would not be isolated. □

We define

(1.7)

Remark 1.10. Notice that the map

is bijective. We assume that there are no l and m with such that holds for every . This is equivalent to assuming that

The above implies that for every either or and , where the inequality derives from the non-collinearity of and and the equality condition in the Cauchy-Schwarz inequality. The vectors play a role in the group identification procedure in the low temperature regime, where they will allow us to separate the pair correlations belonging to different groups.

Theorem 1.11. Assume the model is in the low temperature regime and non-critical as per Definition 1.8, i.e. the Hessian H_k of F at z_k is positive definite for all k. Then the function F defined in (1.5) has a finite number of minima, . There is a positive constant such that for all

This theorem is proved in Sect 3.

The main results of this article, the proof that the community detection problem has a solution with high probability and the algorithms for the reconstruction of the group structure of the model are found in Sects 2.2 and 2.3, respectively. Theorems 1.7 and 1.11 allow us to calculate approximations for the correlations between votes for large N. We will use these approximations as a benchmark for the empirical correlations obtained from a sample of observations of voting configurations from the model:

(cf. Formulae (2.13) and (2.14)). From the asymptotic analysis of the block spin Ising model in Theorems 1.7 and 1.11, we know the large N value of , and by the law of large numbers the empirical correlations converge to these values as n goes to infinity. The empirical correlations also satisfy a large deviations principle (see Lemmas 2.7 and 2.21 and Proposition 2.2) which allows us to upper bound the probability of a significant deviation of the empirical correlations from these values by a function which exponentially decays to 0 with the number of observations n. Thus, with high probability, we obtain a sample of observations that are typical in that there are no large deviations of the empirical correlations from their expected values. Then we use an iterative algorithm to define an equivalence relation on the set of voters that corresponds to the underlying group structure represented by ι which is assumed to be unknown to the observer of the voting configurations . We next describe the algorithm informally for the high temperature regime (the low temperature regime algorithm is structurally similar). See the proofs of Theorems 1.12 and 1.13 for a rigorous description and Sect 1.3 for an example of the application of this algorithm.

We take a correlation

which according to Corollary 2.4, in the high temperature regime, corresponds to voters , who both belong to the same group, namely group with the largest value (cf. Theorem 1.7). We then identify the largest empirical correlations, which according to Lemma 2.21 belong to those voters i and j who indeed belong to said group, i.e. . Having identified the voters belonging to group l, we remove from our set of empirical correlations all those elements that correspond to voter pairs which include at least one of the indices just identified as belonging to group l. Then we pick the group with the next largest value , and repeat the last step. In each step, we identify at least one group of voters. Hence, the algorithm terminates after at most M steps. Thus, we provide an explicit algorithm of how to reconstruct the structure of the model.
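The informal description above can be sketched in Python as a greedy peeling procedure on a matrix of empirical correlations. This is a simplified illustration, not the rigorous procedure from the proofs: the function name `greedy_groups` is hypothetical, and the single tolerance `tol` stands in for the constants of Definition 2.5, which determine how close an empirical correlation must be to the current largest value to count as belonging to the same group.

```python
import numpy as np

def greedy_groups(corr, tol):
    """Peel off groups one at a time, starting from the largest empirical correlation."""
    corr = np.asarray(corr, dtype=float)
    remaining = list(range(corr.shape[0]))
    groups = []
    while len(remaining) > 1:
        # Correlations among voters not yet assigned; ignore the diagonal
        sub = corr[np.ix_(remaining, remaining)].copy()
        np.fill_diagonal(sub, -np.inf)
        a, b = np.unravel_index(np.argmax(sub), sub.shape)
        top = sub[a, b]
        # Seed voter plus every voter whose correlation with the seed is near the top value
        members = [remaining[k] for k in range(len(remaining))
                   if k == a or sub[a, k] >= top - tol]
        groups.append(sorted(members))
        # Remove the identified voters and all correlations involving them
        remaining = [v for v in remaining if v not in members]
    if remaining:
        groups.append(remaining)
    return groups

# Hypothetical idealised correlation matrix with three groups
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
within = {0: 0.30, 1: 0.20, 2: 0.10}
corr = np.full((9, 9), 0.02)
for i in range(9):
    for j in range(9):
        if labels[i] == labels[j]:
            corr[i, j] = within[labels[i]]
np.fill_diagonal(corr, 1.0)
found = greedy_groups(corr, tol=0.04)
```

Each pass of the loop assigns at least two voters, so the sketch terminates; on noisy data its success depends on the tolerance being smaller than the gaps between the correlation clusters.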

We use the phrase ‘Property f holds with high probability as n goes to infinity’ in the sense that for any constant there is a constant such that for all the probability that f holds is at least .

The main theorems of this article are

Theorem 1.12 (Group Identification and Reconstruction Procedure in the High Temperature Regime). Let the model be in the high temperature regime. There is a fixed natural number and a constant (cf. Definition 2.5) such that if and , a sample of n observations allows us to recover the group partition with high probability. The probability that we cannot recover the group partition is bounded above by the exponentially decaying function

and

Theorem 1.13 (Group Identification and Reconstruction Procedure in the Low Temperature Regime). Let the model be in the low temperature regime and non-critical. Let be the minima of the function F, and assume there are no l and m with such that holds for every . There is a fixed natural number and a constant (cf. Definition 2.19) such that if and , a sample of n observations allows us to recover the group partition with high probability. The probability that we cannot recover it is bounded above by the following exponentially decaying function

Remark 1.14. The upper bounds for the probabilities that the group partition cannot be recovered are tight in the sense that Sanov’s Theorem provides a corresponding lower bound as well, and said lower bound features an exponential term

respectively, for some in the high temperature regime and some in the low temperature regime. As such, in the general case of M groups, it is not possible to improve these bounds.

Remark 1.15. In the original articles [1,2] concerning the community detection problem in the block spin Ising model, the two groups considered were identical in all respects – group sizes and coupling constants. Therefore, community detection was only possible in terms of finding an equivalence relation on the set of voters such as those constructed in the proofs of Theorems 1.12 and 1.13. For the more general situation of several groups of arbitrary sizes and coupling constants, we can go beyond this solution and identify the equivalence classes detected with specific groups of the model. One such case is when all groups are of different sizes, i.e. for all . Another example is for all . In fact, there is a way to explicitly identify the groups in most cases. Also, the algorithm allows identifying the group structure in case there is imperfect information about the structure of the model in terms of group sizes and coupling constants. An example of the reconstruction algorithm using simulated data, which illustrates that the group sizes are not strictly necessary to reconstruct the group structure of the model, is included as Supplementary Material to this article.

In the next subsection, we present an example for the application of the community detection algorithm to the problem of extremist political movements. The rest of this article consists of Sect 2 where we formally state and prove the community detection results, Sect 3 in which we prove Theorems 1.7 and 1.11, and finally Sect 4 which includes some results we use in the proofs in order to make the article as self-contained as possible.

1.3 Detection of the members of extremist political parties

In this section, we describe how our community detection algorithms can be employed to detect members of political movements or parties. In many countries, extremist political parties are forbidden by the authorities. Membership is usually prohibited and heavily penalised. As such, admitting to membership risks prosecution up to lengthy prison sentences. Even if an extremist political party is not prohibited, or else it is not a formal party but more of a loose movement or association of like-minded people, it may still be the case that members do not openly identify as such. Censure from mainstream segments of the population leads to the phenomenon of self-censoring and avoidance of stating their membership openly. Thus, if we have the task of quantifying the membership numbers of extremist parties in a certain population of N people, asking for their political affiliation directly may not lead to accurate numbers, as extremist affiliations will likely be underrepresented among the responses. Instead, we take the indirect route of applying a questionnaire with n questions with binary responses. These can be yes/no questions but also questions asking about two possible solutions to current political or societal problems.

Suppose there are two political parties in a country. Party 1 is an extremist political movement shunned by sympathisers of party 2, which is a mainstream moderate party, and ‘party 3’, the remainder of the population which is not politically engaged. We endeavour to assign each respondent correctly to one of the three parties. Suppose we have a sample of n observations of voting configurations

obtained from the responses to our questionnaire for the entire population of voters. The assumption about the population structure implies there are three groups in our block spin Ising model (cf. Definition 1.5). The coupling matrix defined in (1.3) is assumed to be given by

which is positive definite. Moreover, it is easily checked that the matrix

is positive definite. Therefore, the model exhibits weak interactions between voters by virtue of being in the high temperature regime. We set H: = J⁻¹−I (cf. Theorem 1.7), calculate

and order the diagonal entries of H⁻¹:

We have

and set

Next, we calculate the empirical correlations between two votes for each pair , ,

Inspecting the empirical correlations, we notice that there is a cluster of around the value , contained in the open interval . We find that this cluster is composed of empirical correlations of the total correlations corresponding to the population as a whole. Since is the largest diagonal entry, the N₁ indices i to be found in the cluster of around the value correspond to the members of group 1, i.e. members of the extremist political party. We now exclude all indices i belonging to extremists from our set of empirical correlations and thus obtain a reduced set of of . We inspect this set of correlations and find that there is a cluster of around the value , in an open interval , which contains correlations. The N₂ indices to be found in these correlations belong to members of the mainstream political party. We exclude all N₂ of these indices from our correlations. The remaining correlations featuring N₃ indices belong to non-political members of the population. We note that the fortuitous clustering of the empirical correlations around the values is, in fact, a high probability event, provided the sample size n is large enough. (Cf. Theorems 1.12 and 1.13.) This concludes the reconstruction of this model’s group structure from the voting data in the sample.
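The matrix computations in this example are straightforward to reproduce numerically. The Python sketch below uses a hypothetical 3 × 3 coupling matrix (not the matrix of the example above), checks that both J and I − J are positive definite, forms H := J⁻¹ − I, and orders the groups by the diagonal entries of H⁻¹, which is the order in which the groups are peeled off. The function name `ordered_diagonal` is an illustrative assumption.

```python
import numpy as np

# Hypothetical coupling matrix with antagonism between groups 1 and 2
J = np.array([[0.60, -0.10, 0.05],
              [-0.10, 0.40, 0.05],
              [0.05, 0.05, 0.20]])

def ordered_diagonal(J):
    """Return H^{-1} for H = J^{-1} - I and the groups sorted by decreasing diagonal entry."""
    J = np.asarray(J, dtype=float)
    identity = np.eye(J.shape[0])
    # High temperature regime requires J > 0 and I - J > 0
    if np.linalg.eigvalsh(J).min() <= 0 or np.linalg.eigvalsh(identity - J).min() <= 0:
        raise ValueError("J and I - J must both be positive definite")
    H = np.linalg.inv(J) - identity
    H_inv = np.linalg.inv(H)
    order = np.argsort(-np.diag(H_inv))  # largest diagonal entry first
    return H_inv, order

H_inv, order = ordered_diagonal(J)
```

For this hypothetical matrix, the group with the strongest internal coupling has the largest diagonal entry of H⁻¹ and would therefore be identified first.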

2 Exact recovery of the group structure

In this section we prove the two main results of this article, Theorems 1.12 and 1.13. We will use Theorems 1.7 and 1.11, which are proved in Sect 3.

2.1 Large deviations theory

Here we will define the probability space on which the samples of any possible size generated by the model defined in Definition 1.5 live.

Let

be equipped with the σ-algebra of all subsets of . Let

Moreover, we define

provided with the σ-algebra of all subsets. We set

with the σ-algebra generated by the cylinder sets of the form , and .

We denote by

the elements of Ω. Here, for every t. We set the projections given by

and

The random vector X corresponds to observation . We define a product measure on using Kolmogorov’s extension theorem such that each X has a Curie-Weiss distribution according to Definition 1.5: for all and all ,

We define the random variables

We will next define a probability measure on Ω under which, for fixed i and j, are independent and identically distributed (i.i.d.) random variables, whereas, for fixed t, are in general neither independent nor identically distributed. We identify with the random variables whose joint distribution is given by in Definition 1.5. First we define the probability measure by

For every , we define the product measure on by

Using Kolmogorov’s extension theorem, we define the measure on such that

for all such cylinder sets described above.

We denote by the empirical measure associated to the process , i.e.

(2.1)

We denote by the set of probability measures on , equipped with the total variation topology. Let be the set of real-valued functions on and we identify it with using the map . We provide with the topology of We use the symbol for the identity and for every and every we define

Therefore, we identify every measure with a continuous real-valued function on We set

(2.2)

the average of over all . We set for every

(2.3)

which is a closed set that does not contain . Note that . We set

For every two measures, we denote by

the relative entropy of with respect to θ. Notice that this function is non-negative and convex as a function of , and θ is its only minimum [29, Definition 2.1.5 and the following remark].
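For measures on a finite set, both the empirical measure and the relative entropy are elementary to compute. The following Python sketch assumes the standard definition of relative entropy, the sum over outcomes of ν(x) ln(ν(x)/θ(x)), with the convention that terms with ν(x) = 0 vanish. The function names and the choice of the two-point set {−1, 1} as the outcome space are illustrative assumptions.

```python
import numpy as np

def empirical_measure(values, outcomes=(-1, 1)):
    """Relative frequency of each outcome in a finite sample."""
    values = np.asarray(values)
    return np.array([(values == o).mean() for o in outcomes])

def relative_entropy(nu, theta):
    """Relative entropy of nu with respect to theta on a finite set."""
    nu = np.asarray(nu, dtype=float)
    theta = np.asarray(theta, dtype=float)
    support = nu > 0
    # nu must be absolutely continuous with respect to theta, else the entropy is infinite
    if np.any(theta[support] == 0):
        return float("inf")
    return float(np.sum(nu[support] * np.log(nu[support] / theta[support])))

sample = [1, 1, -1, 1]
nu_hat = empirical_measure(sample)               # mass on -1 and on +1
distance = relative_entropy(nu_hat, [0.5, 0.5])  # zero only if nu_hat equals theta
```

The value is zero exactly when the two measures coincide, which is why deviations of the empirical measure from θ are penalised exponentially in the large deviations bounds.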

Lemma 2.1. Let and suppose that are such that

(2.4)

It follows that

(2.5)

Proof: Set and . Inequality (2.4) implies that

(2.6)

Moreover, we have that

(2.7)

As stated above, the minimum of as a function of is . It follows that f attains its minimum at r, and . A first order Taylor series around r implies that

(2.8)

where lies between r and t, and therefore . Displays (2.6)-(2.8) imply the desired result. □

We need a very precise upper bound for the tail probabilities under . Therefore, we will use results from the proof of Sanov’s Theorem to establish said upper bound instead of taking the statement from the theorem itself. The result given in the theorem is in terms of the limit superior of a sequence in the number of observations n, but we will use an upper bound to be found in (2.9) which allows us to establish precisely how large n has to be in order to have an upper bound on the probability of atypical observations.

Proposition 2.2. For every and every , we have the upper bound

Proof: The result follows by the proof of Sanov’s Theorem [29, Theorem 2.1.10]. We use Eq. (2.1.12) in [29] which in our case reads as the following equation (we denote by the probability induced by ):

(2.9)

where , see above Lemma 2.1.12 in [29]. The result follows by (2.9), the definition of and Lemma 2.1, which implies

for every □

2.2 High temperature group identification

Lemma 2.3. Let A be a positive definite matrix. Then

holds.

Proof: Let e_l, , stand for the l-th vector of the canonical basis of . Since A is positive definite, it has a square root A^1/2 which is itself positive definite. We have for all

The strict inequality above follows from the Cauchy-Schwarz inequality, since the invertibility of A^1/2 implies and are not linearly dependent, and thus the inequality is strict. As , we conclude that

□
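The displayed inequalities of this proof are not reproduced above. Under the assumption that the lemma concerns the off-diagonal entries $A_{l,m}$ with $l \neq m$, the Cauchy-Schwarz step described in the text can be written out as follows; the final comparison with the maximum is an added remark based on the inequality between the geometric mean and the maximum, not a quotation of the lemma.

```latex
\begin{align*}
|A_{l,m}|
  &= \left|\langle e_l, A e_m\rangle\right|
   = \left|\langle A^{1/2} e_l, A^{1/2} e_m\rangle\right| \\
  &< \left\|A^{1/2} e_l\right\| \left\|A^{1/2} e_m\right\|
   = \sqrt{\langle e_l, A e_l\rangle\,\langle e_m, A e_m\rangle}
   = \sqrt{A_{l,l}\,A_{m,m}} \\
  &\le \max\{A_{l,l},\, A_{m,m}\}.
\end{align*}
```

The inequality in the second line is strict because $e_l$ and $e_m$ are linearly independent and $A^{1/2}$ is invertible, so $A^{1/2} e_l$ and $A^{1/2} e_m$ are not collinear.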

Recall that H := J⁻¹ − I > 0 holds in the high temperature regime. The previous lemma implies

Corollary 2.4. We have for all with

Using this corollary, we define the constants and , which are used in Theorem 1.12.

Definition 2.5. Let

We define

(2.10)

which is a strictly positive constant by Corollary 2.4, and

(2.11)

Moreover, we choose a fixed natural number satisfying (cf. Theorem 1.7)

(2.12)

for every .

Remark 2.6. Notice that Proposition 2.2 implies

In the following, we use the standard notation in which a random sample (or an observation) is denoted by lowercase letters corresponding to the random variables represented by uppercase letters. More precisely, for a , we denote by

(2.13)

We fix for the rest of this section. Given ω, we obtain a sample and calculate the empirical correlations (the realisation of (2.2) given ω)

(2.14)

for all with .
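Given a sample stored as an array with one row per observation and one column per voter, the empirical correlations can be computed with a single matrix product. The sketch below assumes that the empirical correlation of voters i and j is the sample average of the products of their votes over the n observations, without centring; the function name `empirical_correlations` and the toy sample are hypothetical.

```python
import numpy as np

def empirical_correlations(sample):
    """Average of x_i * x_j over the n observations; sample has shape (n, N)."""
    x = np.asarray(sample, dtype=float)
    n = x.shape[0]
    return x.T @ x / n

# Toy sample: n = 4 observations of N = 3 voters
sample = [[ 1,  1, -1],
          [ 1, -1, -1],
          [-1, -1,  1],
          [ 1,  1,  1]]
corr = empirical_correlations(sample)
```

The resulting matrix is symmetric with ones on the diagonal, so only the entries with i ≠ j carry information about the group structure.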

Lemma 2.7. Assume that . Let . Then we have

(2.15)

Proof: Recall and . It follows that

Theorem 1.7 and (2.12) imply

so we obtain

□

Let be an enumeration of the set with

We define a partition of the set of all groups based on the values h_(u) above.

Definition 2.8. Let be a partition of induced by

Next we recursively define two set sequences which will be employed in the reconstruction algorithm of the group structure of the model.

Definition 2.9. Let . We define

and

Assume and have been defined for some . Set

and

Remark 2.10. It is a direct consequence of Definition 2.9 that

In what follows, we will state and prove a series of lemmas, which taken together will allow us to construct an equivalence relation on (see Proposition 2.17) based only on information contained in the sample of observations, and more specifically the empirical correlations . This equivalence relation corresponds to the partition into the sets , , and thus constitutes the solution to the community detection problem stated in Theorem 1.12.

Lemma 2.11. Let . Then we have

Proof: Recall that we fixed and hence is fixed as well. Suppose . By Definition 2.9,

holds. Lemma 2.7 supplies the upper bound

The two inequalities yield

(2.16)

Note that holds if and only if . By Definition 2.5,

and

The above and inequality (2.16) give

Thus, we have shown that implies . The converse is proved analogously. □

Lemma 2.12. Let . Suppose we have

for all and all . Then

holds for all .

Proof: Let . To obtain a contradiction, suppose . Then there is some such that or for some v < u. Without loss of generality, suppose . By assumption, this implies . However, this contradicts because of . Therefore, must hold. Since , Lemma 2.7 yields

and thus holds by definition of G_u:

□

Lemma 2.13. Let . Suppose we have

for all and all . Then

Proof: Assume . By Definition 2.9, there is no such that

(2.17)

for some v < u, as the latter would imply and thus in contradiction to our assumption.

To obtain a contradiction, assume that for some v < u. Since each group has at least two individuals (cf. Remark 1.2), there is an with . By assumption, this implies as v < u. This contradicts (2.17), and hence we conclude that . follows by the same arguments. □

Lemma 2.14. Let . Suppose we have

for all and all . Then

holds for all .

Proof: Let . By the last lemma, we have for all v < u. As , we have

(2.18)

To obtain a contradiction, assume . Then we have

by Definition 2.5, which contradicts (2.18). Therefore, must hold. Next assume for some w>u. Once again, Definition 2.5 implies

and we have a contradiction to (2.18). Hence, is proved. □

Corollary 2.15. Let . Suppose we have

for all and all . Then

holds for all .

Proof: The statement follows from Lemmas 2.12 and 2.14. □

Proposition 2.16. For all and all , we have

Proof: Together, Lemma 2.11 and Corollary 2.15 constitute a proof by induction over of the proposition’s statement. □

Proposition 2.17. Let the relation ∼ on be defined by

Then ∼ is an equivalence relation, and, for all , holds if and only if .

Proof: By Proposition 2.16, for all , the existence of a such that is equivalent to . The function yields a partition of with the equivalence classes given by , , i.e. the indices belonging to group l. Therefore, is equivalent to the statement that i and j belong to the same equivalence class given by some group . □

For the convenience of the reader, we restate Theorem 1.12. The theorem says that the problem of community detection has a solution which can be identified with high probability.

Theorem 2.18 (Group Identification and Reconstruction Procedure in the High Temperature Regime). Let and be as specified in Definition 2.5. Suppose that and n is a natural number. A sample of n observations allows us to recover the group partition of with high probability. The probability that we cannot recover the group partition is bounded above by the exponentially decaying function

Proof: Recall the definition of in (18) and that we have fixed . By Proposition 2.17, the set sequence G_u, , allows us to define an equivalence relation ∼ on which corresponds to the group structure of the model in the sense that for all holds if and only if i and j belong to the same group. Then the probability that recovery is impossible is bounded above by according to Remark 2.6. This concludes the proof of Theorem 1.12. □
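The recursive idea behind the reconstruction procedure can be illustrated by the following simplified sketch. It is not the construction of Definition 2.9 and is not taken from the authors' repository: the helper `peel_groups`, the tolerance `delta` and the connected-component step are assumptions made here for illustration. The sketch also assumes that the largest correlation among the indices not yet assigned is always attained by a pair from the same group, and that `delta` is smaller than the gaps between distinct correlation levels.

```python
def peel_groups(r, delta):
    """Greedy peeling sketch (hypothetical helper, not Definition 2.9 itself).

    r     : symmetric matrix (list of lists) of empirical correlations
    delta : positive tolerance; pairs within delta of the current maximum
            are treated as belonging to a common group
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    remaining = set(range(len(r)))
    groups = []
    while len(remaining) > 1:
        idx = sorted(remaining)
        # Largest off-diagonal correlation among indices not yet assigned.
        top = max(r[i][j] for i in idx for j in idx if i < j)
        peeled = set()
        for start in idx:
            if start in peeled:
                continue
            # Collect every index linked to `start` by near-maximal correlations.
            component, stack = {start}, [start]
            while stack:
                i = stack.pop()
                for j in idx:
                    if j not in component and r[i][j] > top - delta:
                        component.add(j)
                        stack.append(j)
            if len(component) > 1:
                groups.append(sorted(component))
                peeled |= component
        remaining -= peeled
    # A leftover index could not be matched to any group.
    groups.extend([i] for i in sorted(remaining))
    return sorted(groups)
```

Peeling level by level matters when a between-group correlation exceeds the within-group correlation of another group: a single global threshold would then merge groups, whereas removing the most strongly correlated groups first keeps them apart.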

2.3 Low temperature group identification

For the next definition, recall (1.7):

the notation for the minima of the function F, and Remark 1.10, and recall that we defined the subsets of Ω

Definition 2.19. We set

(note that is positive by Remark 1.10), and

(2.19)

Moreover, we choose a fixed natural number satisfying (cf. Theorem 1.11)

(2.20)

for every .

Remark 2.20. Notice that Proposition 2.2 implies that

Recall that Theorem 1.11 states

We fix for the rest of this section. Given ω, we obtain a sample . Recall the definition of the empirical correlations

for all with .

Lemma 2.21. Assume that . Set . Then the following inequality holds:

(2.21)

Proof: Since , it follows that (recall that and and (2.19), (2.3))

(2.22)

for all . Recalling Theorem 1.11,

we obtain

(2.23)

for every . □

Let be an enumeration of the set with

We define a partition of the set of all groups based on the values g_(u) above.

Definition 2.22. Let be a partition of induced by

Next we recursively define two set sequences which will be employed in the reconstruction algorithm of the group structure of the model.

Definition 2.23. Let . We define

and

Assume and have been defined for some . Set

and

Remark 2.24. It is a direct consequence of Definition 2.23 that

The proof of the next proposition follows the same steps as that of Proposition 2.17, using Lemma 2.21 instead of Lemma 2.7.

Proposition 2.25. Let the relation ∼ on be defined by

Then ∼ is an equivalence relation, and, for all , holds if and only if .

Theorem 2.26 (Group Identification and Reconstruction Procedure in the Low Temperature Regime). Let the model be in the low temperature regime and non-critical. Let be the minima of the function F, and assume there are no l and m with such that holds for every . There is a fixed natural number and a constant (cf. Definition 2.19) such that if and , a sample of n observations allows us to recover the group partition with high probability. The probability that we cannot recover it is bounded above by the exponentially decaying function

Proof: Recall the definition of in (2.19) and that we have fixed . By Proposition 2.25, the set sequence G_u, , allows us to define an equivalence relation ∼ on which corresponds to the group structure of the model in the sense that for all holds if and only if i and j belong to the same group. Then the probability that recovery is impossible is bounded above by according to Remark 2.20. This concludes the proof of Theorem 1.13. □

3 Proofs of Theorems 1.7 and 1.11

Recall that

We define for fixed with

(3.1)

and

(3.2)

Then the representation

(3.3)

holds by Theorem 4.1 in the Appendix.

3.1 Proof of Theorem 1.7

Lemma 3.1. In the high temperature regime, where H = J⁻¹−I>0, 0 is the only minimum of F and the lower bound

holds, where c_H>0 is half of the minimum eigenvalue of H.

Proof: Let be the Hessian matrix of F at . A direct calculation using hyperbolic trigonometric identities shows that

where is the Kronecker delta. This implies that is positive definite for all . Since , 0 is the only minimum of F, and, due to , F is strictly positive on the complement of {0}. Moreover, a Taylor series expansion of the function in t of order 1 with remainder of order 2 in integral form shows that

where in the above equation we used and the spectral theorem which yields strict positivity of the at most M eigenvalues of the self-adjoint H. □
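To make the constant c_H concrete, the following minimal sketch computes H = J⁻¹ − I and half of its smallest eigenvalue for a two-group example. The restriction to a symmetric, invertible 2×2 coupling matrix, the closed-form eigenvalue formula and the function names are assumptions made here for illustration; the numerical values of J are invented.

```python
import math


def inverse_2x2(m):
    """Inverse of an invertible 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def symmetric_eigenvalues_2x2(m):
    """Eigenvalues (smallest first) of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = m
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius


def high_temperature_constant(J):
    """Return c_H = half the smallest eigenvalue of H = J^{-1} - I,
    or None if H is not positive definite."""
    inv = inverse_2x2(J)
    H = [[inv[0][0] - 1.0, inv[0][1]], [inv[1][0], inv[1][1] - 1.0]]
    smallest, _ = symmetric_eigenvalues_2x2(H)
    return smallest / 2.0 if smallest > 0.0 else None


# Invented example: a weakly coupled pair of groups.
c_H = high_temperature_constant([[0.5, 0.2], [0.2, 0.4]])
```

For this example H = [[1.5, −1.25], [−1.25, 2.125]], whose smallest eigenvalue is approximately 0.524, so c_H is approximately 0.262.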

We restate Theorem 1.7 for the convenience of the reader:

Theorem 3.2. In the high temperature regime, i.e. for I − J > 0, set H := J⁻¹ − I. Then there is a positive constant such that for all

holds, where we use the notation

Proof: We will show the statement for . The case can be handled analogously. For any set , let be the complement. We will work with Z₂(N) and Z₀(N) defined in (3.1) and (3.2). We denote by

the ball with radius r, centred at z, and we set (see Lemma 3.1)

which is selected in order to fulfil . We split Z₂(N) and Z₀(N) into integrals over and over . Lemma 3.1 implies that

(3.4)

and likewise we conclude that

(3.5)

and

(3.6)

Notice that and . This implies that all even derivatives of are polynomials with even powers of and all odd derivatives of are polynomials with odd powers of . Therefore, all such derivatives are uniformly bounded on , and odd derivatives evaluated at 0 vanish (and the same holds for F). Taking this into consideration, and using that and the Hessian of F at 0 is H, the remainder formula in Taylor’s Theorem leads to

(3.7)

for some constant C, independent of x (for we use Taylor’s Theorem, for we use that F(x) and grow quadratically in ). Note that for positive s and t, , the mean value theorem implies that . This, together with the fact that is bounded and (3.7), implies that

(3.8)

Likewise, we get

(3.9)

The Taylor series of at 0 (up to second order, using that and that is uniformly bounded), implies that there is a constant C such that

and this leads to

(3.10)

Next, Proposition 4.2 implies that

(3.11)

It follows from (3.1), (3.4), (3.6), (3.8), (3.10), and (3.11) that

(3.12)

and it follows from (3.2), (3.5), (3.6), (3.9), and (3.11) that

(3.13)

Set

Then, using (3.12) and (3.13), we obtain

(3.14)

Displays (3.3) and (3.14) lead to the desired result. □

3.2 Proof of Theorem 1.11

Recall the definition of F given in (1.5) and Lemma 1.9.

Lemma 3.3. There exist constants and c > 0 such that the following estimate holds:

(3.15)

where α is the minimum value of F, and the balls are disjoint for . Also, for every k and every ,

(3.16)

Proof: The continuity of the Hessian of F shows that there are small enough numbers and such that is positive in for every k (and we take δ small enough that these balls are disjoint). A first order Taylor series expansion (with second order remainder in the integral form) for the function shows that

(3.17)

for every . For large enough , F(x) is dominated by , where c > 0 is any constant smaller than the smallest eigenvalue of J⁻¹. For large enough x, there is a constant c such that . We conclude that there is a constant c > 0 such that

(3.18)

for large enough x and every k. Eqs. (3.17) and (3.18) imply (3.15) for every x in the complement of a compact set not intersecting , for every k. On , attains its minimum, which we call (it cannot be smaller than or equal to 0 because F attains its minimum only at the points z). On the set , (3.15) holds true as well. This is verified with the following estimate:

(3.19)

for every . □

We restate Theorem 1.11 for the reader’s convenience:

Theorem 3.4. Assume the model is in the low temperature regime and non-critical as per Definition 1.8, i.e. the Hessian H_k of F at z_k is positive definite for all k. Then the function F defined in (1.5) has a finite number of minima, . There is a positive constant such that for all

Proof: Recall the definitions of Z₂(N) and Z₀(N) in (3.1) and (3.2), as well as the expression

given in Theorem 4.1.

We set (cf. Lemma 3.3)

which is selected in order to fulfil (and we assume that N is large enough such that ). Lemma 3.3 implies that

(3.20)

and likewise we conclude that

(3.21)

and (see (3.16))

(3.22)

Notice that and . This implies that all even derivatives of are polynomials with even powers of and all odd derivatives of are polynomials with odd powers of . Therefore, all such derivatives are uniformly bounded in . Next, we recall that and the Hessian of F at z is H_k. A second order Taylor series expansion (with third order remainder in the mean value form) for the function shows that, for every k,

(3.23)

on . Notice that for positive s and t with , the mean value theorem implies that . This, together with the fact that is bounded and (3.23), implies that

(3.24)

Likewise, we get

(3.25)

We denote by

the second order Taylor series of at . Using that all derivatives of are uniformly bounded, Taylor’s Theorem implies that

and this implies that

(3.26)

Next, since the function is odd and its integral vanishes, Proposition 4.2 implies that

(3.27)

It follows from (3.1), (3.20), (3.22), (3.24), (3.26), and (3.27) that

(3.28)

and it follows from (3.2), (3.21), (3.22), (3.25), and Proposition 4.2 that

(3.29)

Eqs. (3.28) and (3.29) imply that

(3.30)

Eqs. (3.30) and (3.3) lead to the desired result. □

4 Auxiliary results

Recall the definitions of Z₂(N) and Z₀(N) from (3.1) and (3.2) for fixed with

and

The following theorem is a special case of [26, Theorem 32], a result first proved in [30]. We present it here with a short proof for the reader’s convenience.

Theorem 4.1. The following identity holds

Proof: The Gaussian integral implies, expanding the square, that

Similarly, we obtain for any positive definite matrix

(4.1)

the latter equality being the result of a change of variables u = N⁻¹Aw. The first equality is straightforward for diagonal matrices A (using the one-dimensional version). Since A is self-adjoint, we can diagonalise it: A = U^TDU for an orthogonal matrix U and a diagonal matrix D. A change of variables gives the desired result (using the diagonal case with Uy instead of y).
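For orientation, the standard Gaussian identities of the kind used in this step can be sketched as follows. The one-dimensional identity follows by completing the square, and the multivariate version holds for any positive definite M × M matrix A. The normalisation and the scaling by N in (4.1) may differ from this sketch, which is stated here under our own conventions.

```latex
% One-dimensional case: -y^2/2 + a y = -(y - a)^2/2 + a^2/2
e^{a^2/2} = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-y^2/2 + a y} \, dy ,
\qquad a \in \mathbb{R}.

% Multivariate case, for positive definite A and x \in \mathbb{R}^M:
% -\tfrac12 w^T A^{-1} w + x^T w
%   = -\tfrac12 (w - A x)^T A^{-1} (w - A x) + \tfrac12 x^T A x
e^{\frac{1}{2} x^T A x}
  = \frac{1}{(2\pi)^{M/2} \sqrt{\det A}}
    \int_{\mathbb{R}^M} e^{-\frac{1}{2} w^T A^{-1} w + x^T w} \, dw .
```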

For every

we set

Recall Definition 1.5 and note that

Using (4.1) with A = J and and the change of variables u = N⁻¹Jw, we obtain

Noting that the partition function Z is given by , we get

(4.2)

We finish by proving that the numerator is Z₂(N) and the denominator Z₀(N). The equation

and the associativity and commutativity of addition yield

Then we note that

since for any . Similarly,

and for all

We have

where we again used that for all . Similarly, we obtain an expression for the denominator of (4.2):

The result follows by dividing numerator and denominator by 2. □

The following result pertaining to Gaussian integrals is used repeatedly throughout the proofs. It is an immediate consequence of the definition of a multivariate normal distribution. See, e.g., [31, pp. 176–177] for a reference.

Proposition 4.2. Fix . We have for all invertible

Supporting information

S1 Text.

https://doi.org/10.1371/journal.pone.0339060.s001

(PDF)

References

1. Berthet Q, Rigollet P, Srivastava P. Exact recovery in the Ising blockmodel. Ann Statist. 2019;47(4):1805–34.
2. Löwe M, Schubert K. Exact recovery in block spin Ising models at the critical line. Electron J Stat. 2020;14:1796–815.
3. Contucci P, Ghirlanda S. Modelling society with statistical mechanics: an application to cultural contact and immigration. Qual Quant. 2007;41:569–78.
4. Lauritzen S. Graphical models. Oxford: Clarendon Press; 1996.
5. Holland PW, Blackmond Laskey K, Leinhardt S. Stochastic blockmodels: first steps. Soc Netw. 1983.
6. Abbe E. Community detection and stochastic block models: recent developments. J Mach Learn Res. 2018;18(177):1–86.
7. Mossel E, Neeman J, Sly A. Reconstruction and estimation in the planted partition model. Probab Theory Relat Fields. 2014;162(3–4):431–61.
8. Mossel E, Neeman J, Sly A. Belief propagation, robust reconstruction and optimal recovery of block models. Ann Appl Probab. 2016;26:2211–56.
9. Gao C, Ma Z, Zhang AY, Zhou HH. Achieving optimal misclassification proportion in stochastic block models. J Mach Learn Res. 2017;18.
10. Amini AA, Levina E. On semidefinite relaxations for the block model. Ann Statist. 2018;46(1):149–79.
11. Geman D, Geman S. Bayesian image analysis. In: Bienenstock E, Soulié FF, Weisbuch G, editors. Disordered systems and biological organization. vol. 20 of NATO ASI Series. Berlin, Heidelberg: Springer; 1986.
12. Besag J. On the statistical analysis of dirty pictures. J Roy Statist Soc Ser B. 1986;48(3):259–302.
13. Lauritzen S, Sheehan NA. Graphical models for genetic analyses. Statist Sci. 2003;18:489–514.
14. Chen J, Yuan B. Detecting functional modules in the yeast protein-protein interaction network. Bioinformatics. 2006;22(18):2283–90. pmid:16837529
15. Ballesteros M, Garro G. A model and a numerical scheme for the description of distribution and abundance of individuals. J Math Biol. 2022;85(4):31. pmid:36114925
16. Linden G, Smith B, York J. Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput. 2003;7(1):76–80.
17. Manning C, Schütze H. Foundations of statistical natural language processing. Cambridge, MA: MIT Press; 1999.
18. Goldenberg A, Zheng AX, Fienberg SE, Airoldi EM. A survey of statistical network models. Foundations and Trends® in Machine Learning. 2010;2(2):129–233.
19. Brock WA, Durlauf SN. Discrete choice with social interactions. Review of Economic Studies. 2001;68(2):235–60.
20. Opoku AA, Edusei KO, Ansah RK. A conditional Curie-Weiss model for stylized multi-group binary choice with social interaction. J Stat Phys. 2018.
21. Löwe M, Schubert K, Vermet F. Multi-group binary choice with social interaction and a random communication structure – a random graph approach. Physica A. 2020;556.
22. Kirsch W, Toth G. Optimal weights in a two-tier voting system with mean-field voters. Soc Choice Welf. 2025.
23. Fedele M, Contucci P. Scaling limits for multi-species statistical mechanics mean-field models. J Stat Phys. 2011;144:1186–205.
24. Kirsch W, Toth G. Two groups in a Curie-Weiss model. Math Phys Anal Geom. 2020;23(2).
25. Kirsch W, Toth G. Two Groups in a Curie–Weiss model with heterogeneous coupling. J Theor Probab. 2019;33(4):2001–26.
26. Kirsch W, Toth G. Limit theorems for multi-group Curie–Weiss models via the method of moments. Math Phys Anal Geom. 2022;25(4).
27. Ellis R. Entropy, large deviations, and statistical mechanics. Whiley; 1985.
28. Knöpfel H, Löwe M, Schubert K, Sinulis A. Fluctuation results for general block spin Ising models. J Stat Phys. 2020;178:1175–200.
29. Dembo A, Zeitouni O. Large deviations techniques and applications. 2nd ed. New York: Springer; 1998.
30. Toth G. Correlated voting in multipopulation models, two-tier voting systems, and the democracy deficit. FernUniversität in Hagen; 2020.
31. Durrett R. Probability. Theory and Examples. 5th ed. Thomson; 2019.
