Abstract
We study the problem of community detection in a general version of the block spin Ising model featuring M groups, a model inspired by the Curie-Weiss model of ferromagnetism in statistical mechanics. We solve the general problem of identifying any number of groups with any possible coupling constants. Up to now, the problem was only solved for the specific situation with two groups of identical size and identical interactions, see [1, 2]. Our results can be applied to the most realistic situations, in which there are many groups of different sizes and different interactions. In addition, we give an explicit algorithm that permits the reconstruction of the structure of the model from a sample of observations based on the comparison of empirical correlations of the spin variables, thus unveiling easy applications of the model to real-world voting data and communities in biology.
Citation: Ballesteros M, Mena RH, Pèrez JL, Toth G (2026) Detection of an arbitrary number of communities in a block spin Ising model. PLoS One 21(3): e0339060. https://doi.org/10.1371/journal.pone.0339060
Editor: Pablo Martin Rodriguez, Federal University of Pernambuco: Universidade Federal de Pernambuco, BRAZIL
Received: February 22, 2025; Accepted: December 1, 2025; Published: March 17, 2026
Copyright: © 2026 Ballesteros et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: There are no experimental data related to the article. The simulated data in the supplementary material, together with the Python scripts used to generate them, are available at the public repository https://github.com/gabor-toth-ac/Group-Reconstruction-Example.
Funding: This study was financially supported by Sehciti (formerly Conahcyt) in the form of a grant (FORDECYT-PRONACES 429825/2020, recently renamed Project CF-2019/429825) received by MB. This study was also financially supported by Sehciti (formerly Conahcyt) in the form of a grant (PAPIIT-DGAPA-UNAM IN114925) received by MB. This study was also financially supported by Sehciti (formerly Conahcyt) in the form of a Beca postdoctoral grant (1203857) awarded to GT. No additional external funding was received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1 Introduction and results
In the influential article [1], Berthet, Rigollet, and Srivastava introduced the two-group block spin Ising model, previously defined in the statistical mechanics literature in [3], to the field of community detection. The authors showed that, for two groups of identical size and identical coupling constants within each group, exact recovery of the community structure is possible with high probability from a sample of observations assumed to be generated by the model. The number of observations required depends on the regime (or phase) of the model. Of the three regimes of the model, the high and low temperature regimes were analysed in [1]. Subsequently, Löwe and Schubert supplied the missing critical regime in [2], and demonstrated that exact recovery was possible there, too. However, the problem of community detection for more than two groups and/or different coupling constants remained open. We solve the problem for a wide variety of situations as far as the number of groups in the population, the sizes of the groups, and the interactions between voters within each group and between different groups are concerned, and show that reconstructing the group structure is possible given a certain number of observations. The procedure we present is completely constructive and easy to implement. Thus, our results in the present article allow for the application of community detection algorithms to real-world data. Possible applications are the identification of social classes, cultural groups, sympathisers of political parties, or other characteristics which may not be readily available in data. Using the community detection algorithm will potentially allow social scientists to better understand societal structures derived from common preferences or interdependencies. Biologists can employ the algorithms for the problem of distinguishing species based on the observation of characteristics of different specimens.
An important point that should be noted is that the techniques presented in this article can in principle also be applied to any other voting model with several communities, for which the pair correlations between votes can be deduced.
The literature on the detection of communities in stochastic models is extensive. Among the models analysed, graphical models (i.e. Markov random fields) are most prominent. See [4] for a description of how Markov random fields encode dependencies between random variables. Among graphical models, the most studied are the stochastic blockmodel and the block spin Ising model. The stochastic blockmodel (introduced in [5]) has been analysed thoroughly. See [6] for an up-to-date exposition and [7–10] for further important results about this model. In the stochastic blockmodel, we observe a single realisation of the random graph and assume that interactions between individuals are independent. This is very different from the block spin Ising model which is the subject of the present article.
Beyond the mathematical interest inherent to these models and the problem of community detection, there are also important applications in the field of image recognition [11,12], biology and genetics [13–15], recommendation systems [16], and natural language processing [17], among others. These applications are especially important given the ubiquitous use of large amounts of data. Other applications include the use of these models applied to problems in social sciences, such as sociological behaviour [18], migration [3], economic decision making [19–21], and political science [22]. Community detection algorithms applied to these problems allow researchers to understand the socioeconomic and political structure of societies and dependencies between different groups and classes, as well as the structures underlying species diversity and interdependence.
Compared to the stochastic blockmodel, there has been considerably less work on the community detection problem in block spin Ising models, also referred to as multi-species mean-field models [23] or multi-group Curie-Weiss models [24–26]. In these models, there is a heterogeneous population of individuals subdivided into $M$
groups and each individual casts a binary vote in an election. We can also interpret the binary choice as a biological characteristic of the individual which can be observed and measured. The votes or measurements are random variables which are allowed to depend on each other (with different degrees of dependencies within each group and across group boundaries). These dependencies are regulated by the coupling constants between votes. We will formally present the model in Sect 1.1. An observer has access to the votes of all individuals in a number of elections, referenda, or measurements. The population is assumed static, and the observer sees the patterns in the voting behaviour. These patterns manifest in the way certain subsets of voters frequently tend to vote alike or contrary to each other. The observer’s problem consists of reconstructing the group structure of the population from the observed votes or measurements alone. The method consists in calculating the empirical pair correlations between individual votes and grouping together those voters who present the highest correlations in the sample, i.e. those who tend to agree with the highest frequency are assumed to belong to the same community. See Sects 2.2 and 2.3 for a complete description of the algorithm involved in the reconstruction procedure.
Rather than taking a single realisation of a random graph as in the stochastic blockmodel, in the block spin Ising model we observe a sample of n realisations of the model, i.e. n voting configurations of the shape which are realisations of the random vector
assumed to be independent and distributed according to the probability measure of the block spin Ising model defined in Definition 1.5. Each realisation of
contains the votes or measurements of the entire population in a single election, referendum, or observation. The problem we study consists of reconstructing the structure of the model in terms of assigning each of the random variables Xi,
, (we will write
for any
) that represents the vote of one individual to one of the M groups. Having access to a certain number of observations permits us to recover the structure with high probability for the possible structures of practical importance the model can assume.
1.1 The block spin Ising model
The block spin Ising model is defined for $\{-1,1\}$-valued random variables $X_i$ indexed by
, commonly referred to as spins. Since the application to voting and the observation of biological characteristics of a heterogeneous population is the main motivation of the study in this article, we will instead speak of votes and voting configurations. The voters are sorted into
groups. Each group
is of size
, such that
. The group identification function
assigns each voter to their respective group
. Hence, the definition of the group sizes above implies
, where for any countable set A the cardinality of A is denoted by
.
Definition 1.1. We define the group size parameters for each group l
Remark 1.2. We will assume these constants ,
, exist and are positive. This implies in particular, that we can assume each group consists of at least two different individuals for large enough N, a fact we will use in the proofs of our results. While the population goes to infinity, the number of groups remains constant at M.
For any voting configuration , the Hamiltonian
is given by
This Hamiltonian allows for different interactions between pairs of votes, such that if one belongs to group l and the other to group m, they interact by a coupling constant $J_{l,m}$. These coupling constants subsume the inverse temperature parameter $\beta$ found in the single-group Curie-Weiss model (see e.g. [27, Chapters IV and V] for a thorough discussion of the classical Curie-Weiss model). In fact, in the special case of $M = 1$, the definition of
reduces to the Hamiltonian of the Curie-Weiss model. We note that, depending on the signs of the coupling parameters $J_{l,m}$, the value of
varies with the voting configuration. In the present context of voting, we interpret
as a measure of conflict in society surrounding a particular issue which gives rise to votes
. If all coupling parameters are positive, there are two configurations that have the lowest possible level of conflict: the unanimous configurations
and
. All other configurations receive higher values
. The highest levels are achieved when the votes are evenly split in each group (or closest to it in case of odd group sizes). We define the coupling matrix J to have entries equal to the constants above:
Definition 1.3. Let $A > 0$ stand for the statement that $A$ is a symmetric positive definite matrix, i.e. the Euclidean inner product
is positive for all
, and let
mean A is positive semi-definite, i.e.
holds for all
.
Remark 1.4. We will assume J > 0 throughout this article. This assumption can be interpreted as the existence of stronger coupling between voters belonging to the same group versus voter coupling between groups. The assumption also allows for negative coupling between different groups, which models an antagonistic relationship between groups.
Definition 1.5. Let with J > 0 and
as defined in (2). The block spin Ising model’s probability measure
, which gives the probability of each of the $2^N$ voting configurations, is defined by
for all , where Z is a normalisation constant which depends on N and J.
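For very small populations, the probability measure of Definition 1.5 can be tabulated by brute force, which may help build intuition. The following is a minimal sketch and not the authors' code. It assumes the Hamiltonian has the usual mean-field form $-\frac{1}{2N}\sum_{l,m} J_{l,m} S_l S_m$, with $S_l$ the sum of the votes in group l, and that the probability of a configuration is proportional to the exponential of minus the Hamiltonian. The displayed formulas are not reproduced in this text, so the normalisation is an assumption. The function names and the zero-based group labels are hypothetical.

```python
import itertools
import numpy as np

def hamiltonian(x, groups, J):
    """Assumed form: -(1/(2N)) * sum_{l,m} J[l, m] * S_l * S_m,
    where S_l is the sum of the votes in group l (groups labelled 0..M-1)."""
    x = np.asarray(x)
    groups = np.asarray(groups)
    M = J.shape[0]
    S = np.array([x[groups == l].sum() for l in range(M)])
    return -float(S @ J @ S) / (2 * len(x))

def gibbs_measure(groups, J):
    """Enumerate all 2^N configurations and return them with their probabilities."""
    N = len(groups)
    configs = np.array(list(itertools.product([-1, 1], repeat=N)))
    weights = np.exp([-hamiltonian(x, groups, J) for x in configs])
    return configs, weights / weights.sum()  # division by the normalisation constant Z

# Hypothetical toy instance: N = 4 voters in M = 2 groups of two voters each.
groups = [0, 0, 1, 1]
J = np.array([[0.6, 0.1],
              [0.1, 0.4]])
configs, probs = gibbs_measure(groups, J)
```

With positive couplings as in this toy instance, the two unanimous configurations receive the largest probability, in line with the interpretation of the Hamiltonian as a measure of conflict. The enumeration is exponential in N and is only meant for illustration.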
1.2 Results
The model has three distinct regimes, in each of which the model behaves in a distinct fashion. This is reflected in the limiting distribution of the vector of suitably normalised sums of votes , where $S_l$ is the sum of the votes in group
. This limiting distribution is distinct in each of the three regimes. These results can be found in several articles, e.g. [23,26,28]. The three regimes are called (in adaptation of the corresponding terms for the classical single-group Curie-Weiss model) the high temperature, the critical, and the low temperature regime. The high temperature regime is characterised by the difference between the identity matrix
and the coupling matrix J being a positive definite matrix, i.e. I–J > 0. The model is in the critical regime when
but
. Finally, the low temperature regime is equivalent to
. High temperature corresponds to high disorder, meaning the voters tend to have a mind of their own with weak dependence of the votes. Low temperature corresponds to strong couplings between votes. Thus the regime depends on the social cohesion present in a society. Culturally homogeneous societies would likely fall into the low temperature regime, whereas confederations with loosely related heterogeneous groups would be expected to fall into the high temperature regime.
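The regime of a given coupling matrix can be checked numerically from the spectrum of $I - J$. The sketch below is illustrative only. The high temperature criterion ($I - J$ positive definite) is taken from the text. The other two branches assume that the critical regime means $I - J$ is positive semi-definite but not positive definite, and that the low temperature regime means $I - J$ is not positive semi-definite. The function name `regime` and the tolerance `tol` are hypothetical.

```python
import numpy as np

def regime(J, tol=1e-10):
    """Classify a symmetric coupling matrix J by the smallest eigenvalue of I - J."""
    J = np.asarray(J, dtype=float)
    smallest = np.linalg.eigvalsh(np.eye(J.shape[0]) - J).min()
    if smallest > tol:
        return "high temperature"   # I - J positive definite
    if smallest >= -tol:
        return "critical"           # assumed: positive semi-definite but singular
    return "low temperature"        # assumed: I - J has a negative eigenvalue
```

In floating-point arithmetic the critical case can only be detected up to the tolerance, so a result of "critical" should be read as "numerically indistinguishable from critical".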
We will study the problem of detecting the M communities in the model in the high temperature and the low temperature regimes. In the latter case, we will make an assumption (see Definition 1.8) about the Hessian matrices at the minima of the function defined by
which plays a crucial role in the analysis of the block spin Ising model. It appears in the de Finetti representation of the probability measure (see [26, Theorem 32] for more details). The de Finetti representation expresses the probabilities (1.4) as an integral over
. The function F plays the role of exponentially weighting the points in the set
. A smaller value of
yields a higher weight in the integral. Thus, the minima of F are of particular importance, and their location depends on the regime the model is in.
Definition 1.6. We will use the symbol C for positive constants which are independent of N but may depend on the coupling matrix J and the group size parameters . We make no claim as to the precise value of these constants, and in fact they may change from one line of a calculation to the next.
Theorem 1.7. In the high temperature regime, i.e. for $I - J > 0$, set $H := J^{-1} - I$. Then there is a positive constant such that for all
holds, where we use the notation
This theorem is proved in Sect 3.
As in the high temperature regime, in the low temperature regime, i.e. , we address the non-critical case. Our definition of the low temperature non-critical case is similar to the corresponding definition for high temperature.
Definition 1.8 (Low temperature non-critical case). In the case that , we say that the model is in the low temperature regime. Moreover, if the Hessian of F at every point where the minimum is attained is positive definite, we say that the model is non-critical.
Lemma 1.9. In the low temperature non-critical case, the number of points where the minimum of F is attained is finite. We denote them by
and the corresponding Hessian of F at the point $z_k$ is denoted by $H_k$ and is invertible for every
.
Proof: Since F is dominated by the term for large
, it follows that all points where the minimum is attained must be contained in a compact set
(we recall that $J^{-1}$ is positive definite). Suppose that z is a point where the minimum is attained. The Hessian of F is positive definite at z by assumption. This implies that there is a neighbourhood
of z where F attains its minimum only at z. Hence the set of points where the minimum is attained consists of isolated points, and it is closed because F is continuous. This set cannot be infinite: otherwise, since B is compact, it would have an accumulation point, which would belong to the set by closedness and would not be isolated. □
We define
Remark 1.10. Notice that the map
is bijective. We assume that there are no l and m with such that
holds for every
. This is equivalent to assuming that
The above implies that for every either
or
and
, where the inequality derives from the non-collinearity of
and
and the equality condition in the Cauchy-Schwarz inequality. The vectors
play a role in the group identification procedure in the high temperature regime, where they will allow us to separate the pair correlations belonging to different groups.
Theorem 1.11. Assume the model is in the low temperature regime and non-critical as per Definition 1.8, i.e. the Hessian $H_k$ of F at $z_k$ is positive definite for all k. Then the function F defined in (5) has a finite number of minima, . There is a positive constant
such that for all
This theorem is proved in Sect 3.
The main results of this article, the proof that the community detection problem has a solution with high probability and the algorithms for the reconstruction of the group structure of the model are found in Sects 2.2 and 2.3, respectively. Theorems 1.7 and 1.11 allow us to calculate approximations for the correlations between votes for large N. We will use these approximations as a benchmark for the empirical correlations obtained from a sample of observations
of voting configurations from the model:
(cf. Formulae (2.13) and (2.14)). From the asymptotic analysis of the block spin Ising model in Theorems 1.7 and 1.11, we know the large N value of , and by the law of large numbers the empirical correlations converge to these values as n goes to infinity. The empirical correlations also satisfy a large deviations principle (see Lemmas 2.7 and 2.21 and Proposition 2.2) which allows us to upper bound the probability of a significant deviation of the empirical correlations from these values by a function which exponentially decays to 0 with the number of observations n. Thus, with high probability, we obtain a sample of observations that are typical in that there are no large deviations of the empirical correlations from their expected values. Then we use an iterative algorithm to define an equivalence relation on the set of voters
that corresponds to the underlying group structure represented by ι which is assumed to be unknown to the observer of the voting configurations
. We next describe the algorithm informally for the high temperature regime (the low temperature regime algorithm is structurally similar). See the proofs of Theorems 1.12 and 1.13 for a rigorous description and Sect 1.3 for an example of the application of this algorithm.
We take a correlation
which according to Corollary 2.4, in the high temperature regime, corresponds to voters , who both belong to the same group, namely group
with the largest value
(cf. Theorem 1.7). We then identify the largest empirical correlations, which according to Lemma 2.21 belong to those voters i and j who indeed belong to said group, i.e.
. Having identified the voters belonging to group l, we remove from our set of empirical correlations all those elements that correspond to voter pairs which include at least one of the indices just identified as belonging to group l. Then we pick the group
with the next largest value
, and repeat the last step. In each step, we identify at least one group of voters. Hence, the algorithm terminates after at most M steps. Thus, we provide an explicit algorithm of how to reconstruct the structure of the model.
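The informal description above can be sketched in a few lines. This is an illustration under stated assumptions rather than the rigorous procedure of Theorems 1.12 and 1.13. The function `peel_groups` and its arguments are hypothetical. `targets` stands in for the limiting within-group correlation levels (sorted from largest to smallest), and `tol` stands in for the half-width of the interval around each level. Both would have to be derived from Theorem 1.7 and Definition 2.5, whose formulas are not reproduced here.

```python
import numpy as np

def peel_groups(corr, targets, tol):
    """Greedy sketch: for each target level (largest first), assign to one group all
    still-unassigned voters occurring in a pair whose empirical correlation lies
    within tol of the target, then remove them before treating the next level."""
    N = corr.shape[0]
    labels = np.full(N, -1)          # -1 marks voters not assigned to any group
    remaining = np.arange(N)
    for g, target in enumerate(targets):
        sub = corr[np.ix_(remaining, remaining)]
        close = np.abs(sub - target) < tol
        np.fill_diagonal(close, False)   # ignore the trivial pairs (i, i)
        hit = close.any(axis=1)
        labels[remaining[hit]] = g
        remaining = remaining[~hit]
    return labels

# Synthetic, noise-free correlation matrix (not sampled from the model):
# three groups with within-group levels 0.5, 0.3, 0.1 and cross-group level 0.
truth = np.array([0, 0, 0, 1, 1, 2, 2, 2])
levels = np.array([0.5, 0.3, 0.1])
corr = np.where(truth[:, None] == truth[None, :], levels[truth][:, None], 0.0)
np.fill_diagonal(corr, 1.0)
labels = peel_groups(corr, targets=levels, tol=0.05)
```

On real samples the empirical correlations are noisy, so the choice of `tol` relative to the gaps between the levels decides whether the clusters separate. The theorems quantify how many observations are needed for that to happen with high probability.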
We use the phrase ‘Property f holds with high probability as n goes to infinity’ in the sense that for any constant there is a constant
such that for all
the probability that f holds is at least
.
The main theorems of this article are
Theorem 1.12 (Group Identification and Reconstruction Procedure in the High Temperature Regime). Let the model be in the high temperature regime. There is a fixed natural number and a constant
(cf. Definition 2.5) such that if
and
, a sample of n observations
allows us to recover the group partition with high probability. The probability that we cannot recover the group partition is bounded above by the exponentially decaying function
and
Theorem 1.13 (Group Identification and Reconstruction Procedure in the Low Temperature Regime). Let the model be in the low temperature regime and non-critical. Let be the minima of the function F, and assume there are no l and m with
such that
holds for every
. There is a fixed natural number
and a constant
(cf. Definition 2.19) such that if
and
, a sample of n observations
allows us to recover the group partition with high probability. The probability that we cannot recover it is bounded above by the following exponentially decaying function
Remark 1.14. The upper bounds for the probabilities that the group partition cannot be recovered are tight in the sense that Sanov’s Theorem provides a corresponding lower bound as well, and said lower bound features an exponential term
respectively, for some in the high temperature regime and some
in the low temperature regime. As such, in the general case of M groups, it is not possible to improve these bounds.
Remark 1.15. In the original articles [1,2] concerning the community detection problem in the block spin Ising model, the two groups considered were identical in all respects: group sizes and coupling constants. Therefore, community detection was only possible in terms of finding an equivalence relation on the set of voters such as those constructed in the proofs of Theorems 1.12 and 1.13. For the more general situation of several groups of arbitrary sizes and coupling constants, we can go beyond this solution and identify the equivalence classes detected with specific groups of the model. One such case is when all groups are of different sizes, i.e.
for all
. Another example is
for all
. In fact, there is a way to explicitly identify the groups in most cases. Also, the algorithm allows identifying the group structure in case there is imperfect information about the structure of the model in terms of group sizes and coupling constants. An example of the reconstruction algorithm using simulated data which illustrates that the group sizes are not strictly necessary to reconstruct the group structure of the model is included as Supplementary Material to this article.
In the next subsection, we present an example for the application of the community detection algorithm to the problem of extremist political movements. The rest of this article consists of Sect 2 where we formally state and prove the community detection results, Sect 3 in which we prove Theorems 1.7 and 1.11, and finally Sect 4 which includes some results we use in the proofs in order to make the article as self-contained as possible.
1.3 Detection of the members of extremist political parties
In this section, we describe how our community detection algorithms can be employed to detect members of political movements or parties. In many countries, extremist political parties are forbidden by the authorities. Membership is usually prohibited and heavily penalised. As such, admitting to membership risks prosecution and even lengthy prison sentences. Even if an extremist political party is not prohibited, or if it is not a formal party but rather a loose movement or association of like-minded people, members may still not openly identify as such. Censure from mainstream segments of the population leads members to self-censor and to avoid stating their membership openly. Thus, if we have the task of quantifying the membership numbers of extremist parties in a certain population of N people, asking for their political affiliation directly may not lead to accurate numbers, as extremist affiliations will likely be underrepresented among the responses. Instead, we take the indirect route of administering a questionnaire with n questions with binary responses. These can be yes/no questions but also questions asking about two possible solutions to current political or societal problems.
Suppose there are two political parties in a country. Party 1 is an extremist political movement shunned by sympathisers of party 2, which is a mainstream moderate party, and ‘party 3’, the remainder of the population which is not politically engaged. We endeavour to assign each respondent correctly to one of the three parties. Suppose we have a sample of n observations of voting configurations
obtained from the responses to our questionnaire for the entire population of voters. The assumption about the population structure implies there are three groups in our block spin Ising model (cf. Definition 1.5). The coupling matrix
defined in (1.3) is assumed to be given by
which is positive definite. Moreover, it is easily checked that the matrix
is positive definite. Therefore, the model exhibits weak interactions between voters by virtue of being in the high temperature regime. We set $H := J^{-1} - I$ (cf. Theorem 1.7), calculate
and order the diagonal entries of $H^{-1}$:
We have
and set
Next, we calculate the empirical correlations between two votes for each pair ,
,
Inspecting the empirical correlations, we notice that there is a cluster of around the value
, contained in the open interval
. We find that this cluster is composed of
empirical correlations of the total
correlations corresponding to the population as a whole. Since
is the largest diagonal entry, the $N_1$ indices i to be found in the cluster of
around the value
correspond to the members of group 1, i.e. members of the extremist political party. We now exclude all indices i belonging to extremists from our set of empirical correlations and thus obtain a reduced set of
of
. We inspect this set of correlations and find that there is a cluster of
around the value
, in an open interval
, which contains
correlations. The $N_2$ indices to be found in these correlations belong to members of the mainstream political party. We exclude all $N_2$ of these indices from our correlations. The remaining
correlations featuring $N_3$ indices belong to non-political members of the population. We note that the fortuitous clustering of the empirical correlations around the values
is, in fact, a high probability event, provided the sample size n is large enough. (Cf. Theorems 1.12 and 1.13.) This concludes the reconstruction of this model’s group structure from the voting data in the sample.
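The matrix computations in this example (checking the regime, forming $H = J^{-1} - I$, inverting it, and ordering the diagonal entries of $H^{-1}$) can be reproduced numerically. The sketch below uses a hypothetical $3 \times 3$ coupling matrix chosen for illustration. It is not the matrix of the example above, whose entries are not reproduced in this text.

```python
import numpy as np

# Hypothetical symmetric coupling matrix for three groups (illustration only).
J = np.array([[0.5, 0.1, 0.0],
              [0.1, 0.4, 0.1],
              [0.0, 0.1, 0.3]])
I = np.eye(3)

# High temperature regime with J > 0: both J and I - J must be positive definite.
J_positive = bool(np.all(np.linalg.eigvalsh(J) > 0))
high_temperature = bool(np.all(np.linalg.eigvalsh(I - J) > 0))

H = np.linalg.inv(J) - I             # H := J^{-1} - I
H_inv = np.linalg.inv(H)             # algebraically equal to J (I - J)^{-1}

# Group indices ordered by decreasing diagonal entry of H^{-1}.
order = np.argsort(-np.diag(H_inv))
sorted_diag = np.diag(H_inv)[order]
```

The array `order` gives the sequence in which the groups would be peeled off in the procedure described above, starting with the group whose diagonal entry of $H^{-1}$ is largest.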
2 Exact recovery of the group structure
In this section we prove the two main results of this article, Theorems 1.12 and 1.13. We will use Theorems 1.7 and 1.11, which are proved in Sect 3.
2.1 Large deviations theory
Here we will define the probability space on which the samples of any possible size generated by the model defined in Definition 1.5 live.
Let
be equipped with the σ-algebra of all subsets of
. Let
Moreover, we define
provided with the σ-algebra of all subsets. We set
with the σ-algebra generated by the cylinder sets of the form
,
and
.
We denote by
the elements of Ω. Here, for every t. We set the projections
given by
and
The random vector X corresponds to observation
. We define a product measure
on
using Kolmogorov’s extension theorem such that each X
has a Curie-Weiss distribution according to Definition 1.5: for all
and all
,
We define the random variables
We will next define a probability measure on Ω under which, for fixed i and j,
are independent and identically distributed (i.i.d.) random variables, whereas for fixed
,
are in general neither independent nor identically distributed. We identify
with the random variables whose joint distribution is given by
in Definition 1.5. First we define the probability measure
by
For every , we define the product measure on
by
Using Kolmogorov’s extension theorem, we define the measure on
such that
for all such cylinder sets described above.
We denote by the empirical measure associated to the process
, i.e.
We denote by the set of probability measures on
, equipped with the total variation topology. Let
be the set of real-valued functions on
and we identify it with
using the map
. We provide
with the topology of
We use the symbol
for the identity
and for every
and every
we define
Therefore, we identify every measure with a continuous real-valued function on
We set
the average of over all
. We set for every
which is a closed set that does not contain . Note that
. We set
For every two measures, we denote by
the relative entropy of with respect to θ. Notice that this function
is non-negative and convex as a function of
, and θ is its only minimum [29, Definition 2.1.5 and the following remark].
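On a finite set, the relative entropy is straightforward to compute. The following minimal sketch assumes the standard definition $H(\nu \mid \theta) = \sum_x \nu(x) \ln(\nu(x)/\theta(x))$ with the convention $0 \ln 0 = 0$, since the displayed definition is not reproduced in this text. The function name is hypothetical.

```python
import numpy as np

def relative_entropy(nu, theta):
    """Relative entropy of nu with respect to theta on a finite set.
    Returns infinity if nu charges a point to which theta assigns mass zero."""
    nu = np.asarray(nu, dtype=float)
    theta = np.asarray(theta, dtype=float)
    mask = nu > 0                     # terms with nu(x) = 0 contribute 0
    if np.any(theta[mask] == 0):
        return np.inf
    return float(np.sum(nu[mask] * np.log(nu[mask] / theta[mask])))
```

For example, `relative_entropy([0.5, 0.5], [0.5, 0.5])` evaluates to zero, while any other measure on two points gives a strictly positive value with respect to the uniform measure.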
Lemma 2.1. Let and suppose that
are such that
It follows that
Proof: Set and
. Inequality (2.4) implies that
Moreover, we have that
As stated above, the minimum of as a function of
is
. It follows that f attains its minimum at r, and
. A first-order Taylor expansion around r implies that
where lies between r and t, and therefore
. Displays (2.6)-(2.8) imply the desired result. □
We need a very precise upper bound for the tail probabilities under . Therefore, we will use results from the proof of Sanov’s Theorem to establish said upper bound instead of taking the statement from the theorem itself. The result given in the theorem is in terms of the
of a sequence in the number of observations n, but we will use an upper bound to be found in (2.9) which allows us to establish precisely how large n has to be in order to have an upper bound on the probability of atypical observations.
Proposition 2.2. For every and every
, we have the upper bound
Proof: The result follows by the proof of Sanov’s Theorem [29, Theorem 2.1.10]. We use Eq. (2.1.12) in [29] which in our case reads as the following equation (we denote by the probability induced by
):
where , see above Lemma 2.1.12 in [29]. The result follows by (2.9), the definition of
and Lemma 2.1, which implies
for every □
2.2 High temperature group identification
Lemma 2.3. Let A be a positive definite matrix. Then
holds.
Proof: Let el, , stand for the l-th vector of the canonical basis of
. Since A is positive definite, it has a square root $A^{1/2}$ which is itself positive definite. We have for all
The strict inequality above follows from the Cauchy-Schwarz inequality, since the invertibility of $A^{1/2}$ implies and
are not linearly dependent, and thus the inequality is strict. As
, we conclude that
□
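The Cauchy-Schwarz step used in this proof can be checked numerically. The sketch below verifies, for one randomly generated positive definite matrix, the strict inequality $|A_{lm}| < \sqrt{A_{ll} A_{mm}}$ for $l \neq m$, which is what Cauchy-Schwarz applied to $A^{1/2} e_l$ and $A^{1/2} e_m$ yields. The exact display of Lemma 2.3 is not reproduced in this text, so this checks only that intermediate step, together with its elementary consequence $|A_{lm}| < \max\{A_{ll}, A_{mm}\}$. The test matrix is an arbitrary illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))
A = B @ B.T + 0.1 * np.eye(4)        # symmetric positive definite by construction

d = np.diag(A)
off_diagonal = ~np.eye(4, dtype=bool)

# |A_lm| < sqrt(A_ll * A_mm) for all l != m (strict Cauchy-Schwarz).
cs_bound = np.sqrt(np.outer(d, d))
cauchy_schwarz_strict = bool(np.all(np.abs(A[off_diagonal]) < cs_bound[off_diagonal]))

# Consequence: |A_lm| < max(A_ll, A_mm), since sqrt(a * b) <= max(a, b).
max_bound = np.maximum.outer(d, d)
below_max_diagonal = bool(np.all(np.abs(A[off_diagonal]) < max_bound[off_diagonal]))
```

A single random instance is of course no proof. It only illustrates why the diagonal entries dominate the off-diagonal ones for a positive definite matrix.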
Recall that $H := J^{-1} - I > 0$ holds in the high temperature regime. The previous lemma implies
Corollary 2.4. We have for all with
Using this corollary, we define the constants and
, which are used in Theorem 1.12.
Definition 2.5. Let
We define
which is a strictly positive constant by Corollary 2.4, and
Moreover, we choose a fixed natural number satisfying (cf. Theorem 1.7)
for every .
Remark 2.6. Notice that Proposition 2.2 implies
In the following, we use the standard notation in which a random sample (or an observation) is denoted by lowercase letters corresponding to the random variables represented by uppercase letters. More precisely, for a , we denote by
We fix for the rest of this section. Given ω, we obtain a sample
and calculate the empirical correlations (the realisation of (2.2) given ω)
for all with
.
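Computing all empirical correlations from a sample amounts to a single matrix product. The sketch below assumes that the empirical correlation of voters i and j is the sample average $\frac{1}{n}\sum_{t=1}^{n} x_i^{(t)} x_j^{(t)}$, which is the natural estimator here but is an assumption, since display (2.2) is not reproduced in this text. The function name and the layout of the input (one row per observation, one column per voter) are hypothetical.

```python
import numpy as np

def empirical_correlations(samples):
    """samples: array of shape (n, N) with entries in {-1, +1}.
    Returns the N x N matrix with entries (1/n) * sum_t x_i^(t) * x_j^(t)."""
    x = np.asarray(samples, dtype=float)
    return x.T @ x / x.shape[0]

# Hypothetical toy sample: n = 4 observations of N = 3 voters.
samples = np.array([[ 1,  1, -1],
                    [ 1,  1,  1],
                    [-1, -1,  1],
                    [ 1, -1, -1]])
corr = empirical_correlations(samples)
```

In this toy sample, voters 1 and 2 (the first two columns) agree in three of the four observations, so their empirical correlation is 0.5. The diagonal entries always equal 1 and carry no information.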
Lemma 2.7. Assume that . Let
. Then we have
Proof: Recall and
. It follows that
Theorem 1.7 and (2.12) imply
so we obtain
□
Let be an enumeration of the set
with
We define a partition of the set of all groups based on the values h(u) above.
Definition 2.8. Let be a partition of
induced by
Next we recursively define two set sequences which will be employed in the reconstruction algorithm of the group structure of the model.
Definition 2.9. Let . We define
and
Assume and
have been defined for some
. Set
and
Remark 2.10. It is a direct consequence of Definition 2.9 that
In what follows, we will state and prove a series of lemmas, which taken together will allow us to construct an equivalence relation on (see Proposition 2.17) based only on information contained in the sample of observations, and more specifically the empirical correlations
. This equivalence relation corresponds to the partition into the sets
,
, and thus constitutes the solution to the community detection problem stated in Theorem 1.12.
Lemma 2.11. Let . Then we have
Proof: Recall that we fixed and hence
is fixed as well. Suppose
. By Definition 2.9,
holds. Lemma 2.7 supplies the upper bound
The two inequalities yield
Note that holds if and only if
. By Definition 2.5,
and
The above and inequality (2.16) give
Thus, we have shown that implies
. The converse is proved analogously. □
Lemma 2.12. Let . Suppose we have
for all and all
. Then
holds for all .
Proof: Let . To obtain a contradiction, suppose
. Then there is some
such that
or
for some v < u. Without loss of generality, suppose
. By assumption, this implies
. However, this contradicts
because of
. Therefore,
must hold. Since
, Lemma 2.7 yields
and thus holds by definition of Gu:
□
Lemma 2.13. Let . Suppose we have
for all and all
. Then
Proof: Assume . By Definition 2.9, there is no
such that
for some v < u, as the latter would imply and thus
in contradiction to our assumption.
To obtain a contradiction, assume that for some v < u. Since each group has at least two individuals (cf. Remark 1.2), there is an
with
. By assumption, this implies
as v < u. This contradicts (2.17), and hence we conclude that
.
follows by the same arguments. □
Lemma 2.14. Let . Suppose we have
for all and all
. Then
holds for all .
Proof: Let . By the last lemma, we have
for all v < u. As
, we have
To obtain a contradiction, assume . Then we have
by Definition 2.5, which contradicts (2.18). Therefore, must hold. Next assume
for some w>u. Once again, Definition 2.5 implies
and we have a contradiction to (2.18). Hence, is proved. □
Corollary 2.15. Let . Suppose we have
for all and all
. Then
holds for all .
Proof: The statement follows from Lemmas 2.12 and 2.14. □
Proposition 2.16. For all and all
, we have
Proof: Together, Lemma 2.11 and Corollary 2.15 constitute a proof by induction over of the proposition’s statement. □
Proposition 2.17. Let the relation ∼ on be defined by
Then ∼ is an equivalence relation, and, for all ,
holds if and only if
.
Proof: By Proposition 2.16, for all , the existence of a
such that
is equivalent to
. The function
yields a partition of
with the equivalence classes given by
,
, i.e. the indices
belonging to group l. Therefore,
is equivalent to the statement that i and j belong to the same equivalence class given by some group
. □
For the convenience of the reader, we restate Theorem 1.12. The theorem says that the problem of community detection has a solution which can be identified with high probability.
Theorem 2.18 (Group Identification and Reconstruction Procedure in the High Temperature Regime). Let and
be as specified in Definition 2.5. Suppose that
and n is a natural number. A sample of n observations
allows us to recover the group partition of
with high probability. The probability that we cannot recover the group partition is bounded above by the exponentially decaying function
Proof: Recall the definition of in (18) and that we have fixed
. By Proposition 2.17, the set sequence Gu,
, allows us to define an equivalence relation ∼ on
which corresponds to the group structure of the model in the sense that for all
holds if and only if i and j belong to the same group. Then the probability that recovery is impossible is bounded above by
according to Remark 2.6. This concludes the proof of Theorem 1.12. □
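The level-by-level idea behind the set sequences can be illustrated with a simplified greedy procedure. The Python sketch below is not the construction of Definition 2.9. It assumes an empirical correlation matrix `corr` and a hypothetical tolerance `tol` supplied by the user, and it relies on the assumption, suggested by Corollary 2.4, that the largest correlation among the indices not yet assigned is attained by a pair from the same group. Pairs whose correlation lies within `tol` of the current maximum are linked, linked indices form groups, and the procedure repeats on the remaining indices.

```python
import numpy as np


def recover_groups(corr, tol):
    """Greedy, level-by-level grouping of indices by correlation."""
    corr = np.asarray(corr, dtype=float)
    remaining = list(range(corr.shape[0]))
    groups = []
    while len(remaining) > 1:
        sub = corr[np.ix_(remaining, remaining)].copy()
        np.fill_diagonal(sub, -np.inf)      # ignore self-correlations
        top = sub.max()                     # current highest level
        adjacent = sub >= top - tol         # pairs at that level
        seen, assigned = set(), set()
        for start in range(len(remaining)):
            if start in seen:
                continue
            # Depth-first search for the connected component of `start`.
            seen.add(start)
            stack, component = [start], []
            while stack:
                a = stack.pop()
                component.append(a)
                for b in np.flatnonzero(adjacent[a]):
                    b = int(b)
                    if b not in seen:
                        seen.add(b)
                        stack.append(b)
            if len(component) >= 2:         # groups have at least two members
                group = sorted(remaining[a] for a in component)
                groups.append(group)
                assigned.update(group)
        remaining = [i for i in remaining if i not in assigned]
    groups.extend([i] for i in remaining)   # a leftover index, if any
    return sorted(groups)


# Idealised block matrix: within-group levels 5, 3, 3 (illustrative values).
labels = np.array([0, 0, 0, 1, 1, 2, 2])
levels = np.array([[5.0, 3.5, 1.0], [3.5, 3.0, 1.0], [1.0, 1.0, 3.0]])
corr = levels[np.ix_(labels, labels)]
print(recover_groups(corr, tol=0.2))
```

In this example the correlation between the first and second group (3.5) exceeds the within-group level of the second group (3.0), so a single global threshold would merge the two. Processing the highest level first and removing the identified indices avoids this.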
2.3 Low temperature group identification
For the next definition, recall (1.7):
the notation for the minima of the function F, and Remark 1.10, and recall that we defined the subsets of Ω
Definition 2.19. We set
(note that is positive by Remark 1.10), and
Moreover, we choose a fixed natural number satisfying (cf. Theorem 1.11)
for every .
Remark 2.20. Notice that Proposition 2.2 implies that
Recall that Theorem 1.11 states
We fix for the rest of this section. Given ω, we obtain a sample
. Recall the definition of the empirical correlations
for all with
.
Lemma 2.21. Assume that . Set
. Then the following inequality holds:
Proof: Since , it follows that (recall that
and
and (2.19), (2.3))
for all . Recalling Theorem 1.11,
we obtain
for every . □
Let be an enumeration of the set
with
We define a partition of the set of all groups based on the values g(u) above.
Definition 2.22. Let be a partition of
induced by
Next we recursively define two set sequences which will be employed in the reconstruction algorithm of the group structure of the model.
Definition 2.23. Let . We define
and
Assume and
have been defined for some
. Set
and
Remark 2.24. It is a direct consequence of Definition 2.23 that
The proof of the next proposition follows the same steps as that of Proposition 2.17, using Lemma 2.21 instead of Lemma 2.7.
Proposition 2.25. Let the relation ∼ on be defined by
Then ∼ is an equivalence relation, and, for all ,
holds if and only if
.
Theorem 2.26 (Group Identification and Reconstruction Procedure in the Low Temperature Regime). Let the model be in the low temperature regime and non-critical. Let be the minima of the function F, and assume there are no l and m with
such that
holds for every
. There is a fixed natural number
and a constant
(cf. Definition 2.19) such that if
and
, a sample of n observations
allows us to recover the group partition with high probability. The probability that we cannot recover it is bounded above by the exponentially decaying function
Proof: Recall the definition of in (2.19) and that we have fixed
. By Proposition 2.25, the set sequence Gu,
, allows us to define an equivalence relation ∼ on
which corresponds to the group structure of the model in the sense that for all
holds if and only if i and j belong to the same group. Then the probability that recovery is impossible is bounded above by
according to Remark 2.20. This concludes the proof of Theorem 1.13. □
3 Proofs of Theorems 1.7 and 1.11
Recall that
We define for fixed with
and
Then the representation
holds by Theorem 4.1 in the Appendix.
3.1 Proof of Theorem 1.7
Lemma 3.1. In the high temperature regime, where $H = J^{-1} - I > 0$, 0 is the only minimum of F and the lower bound
holds, where $c_H > 0$ is half of the smallest eigenvalue of H.
Proof: Let be the Hessian matrix of F at
. A direct calculation using hyperbolic trigonometric identities shows that
where is the Kronecker delta. This implies that
is positive definite for all
. Since
, 0 is the only minimum of F, and, due to
, F is strictly positive on the complement of {0}. Moreover, a Taylor series expansion of the function
in t of order 1 with remainder of order 2 in integral form shows that
where in the above equation we used and the spectral theorem which yields strict positivity of the at most M eigenvalues of the self-adjoint H. □
We restate Theorem 1.7 for the convenience of the reader:
Theorem 3.2. In the high temperature regime, i.e. for $I - J > 0$, set $H := J^{-1} - I$. Then there is a positive constant such that for all
holds, where we use the notation
Proof: We will show the statement for . The case
can be handled analogously. For any set
, let
be the complement. We will work with Z2(N) and Z0(N) defined in (3.1) and (3.2). We denote by
the ball with radius r, centred at z, and we set (see Lemma 3.1)
which is selected in order to fulfil . We split Z2(N) and Z0(N) into integrals over
and over
. Lemma 3.1 implies that
and likewise we conclude that
and
Notice that and
. This implies that all even derivatives of
are polynomials with even powers of
and all odd derivatives of
are polynomials with odd powers of
. Therefore, all such derivatives are uniformly bounded on
, and odd derivatives evaluated at 0 vanish (and the same holds for F). Taking this into consideration, and using that
and the Hessian of F at 0 is H, the remainder formula in Taylor’s Theorem leads to
for some constant C, independent of x (for we use Taylor’s Theorem, for
we use that F(x) and
grow quadratically in
). Note that for positive s and t,
, the mean value theorem implies that
. This, together with the fact that
is bounded and (3.7), implies that
Likewise, we get
The Taylor series of at 0 (up to second order, using that
and that
is uniformly bounded), implies that there is a constant C such that
and this leads to
Next, Proposition 4.2 implies that
It follows from (3.1), (3.4), (3.6), (3.8), (3.10), and (3.11) that
and it follows from (3.2), (3.5), (3.6), (3.9), and (3.11) that
Set
Then, using (3.12) and (3.13), we obtain
Displays (3.3) and (3.14) lead to the desired result. □
3.2 Proof of Theorem 1.11
Recall the definition of F given in (1.5) and Lemma 1.9.
Lemma 3.3. There exist constants and c > 0 such that the following estimate holds:
where α is the minimum value of F, and the balls are disjoint for
. Also, for every k and every
,
Proof: The continuity of the Hessian of F shows that there are small enough numbers and
such that
is positive in
for every k (and we take δ small enough that these balls are disjoint). A first order Taylor series expansion (with second order remainder in the integral form) for the function
shows that
for every . For large enough
, F(x) is dominated by
, where c > 0 is any constant smaller than the smallest eigenvalue of $J^{-1}$. For large enough x, there is a constant c such that
. We conclude that there is a constant c > 0 such that
for large enough x and every k. Eqs. (3.17) and (3.18) imply (3.15) for every x in the complement of a compact set not intersecting
, for every k. On
,
attains its minimum, which we call
(it cannot be smaller than or equal to 0 because F attains its minimum only at the points zk
). On the set
, (3.15) holds true as well. This is verified with the following estimate:
for every . □
We restate Theorem 1.11 for the reader’s convenience:
Theorem 3.4. Assume the model is in the low temperature regime and non-critical as per Definition 1.8, i.e. the Hessian Hk of F at zk is positive definite for all k. Then the function F defined in (1.5) has a finite number of minima, . There is a positive constant
such that for all
Proof: Recall the definitions of Z2(N) and Z0(N) in (3.1) and (3.2), as well as the expression
given in Theorem 4.1.
We set (cf. Lemma 3.3)
which is selected in order to fulfil (and we assume that N is large enough such that
). Lemma 3.3 implies that
and likewise we conclude that
and (see (3.16))
Notice that and
. This implies that all even derivatives of
are polynomials with even powers of
and all odd derivatives of
are polynomials with odd powers of
. Therefore, all such derivatives are uniformly bounded in
. Next, we recall that
and the Hessian of F at zk
is Hk. A second order Taylor series expansion (with third order remainder in the mean value form) for the function
shows that, for every k,
on . Notice that for positive s and t with
the mean value theorem implies that
. This, together with the fact that
is bounded and (3.23), implies that
Likewise, we get
We denote by
the second order Taylor series of at
. Using that all derivatives of
are uniformly bounded, Taylor’s Theorem implies that
and this implies that
Next, since the function is odd and its integral vanishes, Proposition 4.2 implies that
It follows from (3.1), (3.20), (3.22), (3.24), (3.26), and (3.27) that
and it follows from (3.2), (3.21), (3.22), (3.25), and Proposition 4.2 that
Eqs. (3.28) and (3.29) imply that
Eqs. (3.30) and (3.3) lead to the desired result. □
4 Auxiliary results
Recall the definitions of Z2(N) and Z0(N) from (3.1) and (3.2) for fixed with
and
The following theorem is a special case of [26, Theorem 32], a result first proved in [30]. We present it here with a short proof for the reader’s convenience.
Theorem 4.1. The following identity holds
Proof: The Gaussian integral implies, expanding the square, that
Similarly, we obtain for any positive definite matrix
the latter equality being the result of a change of variables $u = N^{-1}Aw$. The first equality is straightforward for diagonal matrices $A$ (using the one-dimensional version). Since $A$ is self-adjoint, we can diagonalise it: $A = U^{T}DU$ for an orthogonal matrix $U$ and a diagonal matrix $D$. A change of variables gives the desired result (using the diagonal case with $Uy$ instead of $y$).
For every
we set
Recall Definition 1.5 and note that
Using (4.1) with A = J and and the change of variables $u = N^{-1}Jw$, we obtain
Noting that the partition function Z is given by , we get
We finish by proving that the numerator is Z2(N) and the denominator Z0(N). The equation
and the associativity and commutativity of addition yield
Then we note that
since for any
. Similarly,
and for all
We have
where we again used that for all
. Similarly, we obtain an expression for the denominator of (4.2):
The result follows by dividing numerator and denominator by 2. □
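The Gaussian integral identity at the start of the proof can be checked numerically. The Python sketch below assumes the identity in the normalisation $\exp(\tfrac{1}{2}x^{T}Ax) = (2\pi)^{-M/2}\det(A)^{-1/2}\int_{\mathbb{R}^M}\exp(-\tfrac{1}{2}w^{T}A^{-1}w + x^{T}w)\,dw$ for positive definite $A$, which follows from completing the square and may differ from (4.1) by a rescaling of the integration variable. It treats the case M = 2 with a plain Riemann sum on a grid; the function name, grid parameters, and test values are illustrative.

```python
import numpy as np


def gaussian_identity_rhs(A, x, half_width=12.0, points=801):
    """Right-hand side of the identity for a 2x2 positive definite A."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    grid = np.linspace(-half_width, half_width, points)
    step = grid[1] - grid[0]
    w1, w2 = np.meshgrid(grid, grid, indexing="ij")
    w = np.stack([w1, w2], axis=-1)                 # shape (points, points, 2)
    a_inv = np.linalg.inv(A)
    quadratic = np.einsum("...i,ij,...j->...", w, a_inv, w)
    linear = w @ x
    integrand = np.exp(-0.5 * quadratic + linear)
    normalisation = (2.0 * np.pi) ** (-1.0) / np.sqrt(np.linalg.det(A))
    return normalisation * integrand.sum() * step ** 2


A = np.array([[1.0, 0.3], [0.3, 0.8]])
x = np.array([0.5, -0.2])
lhs = np.exp(0.5 * x @ A @ x)
rhs = gaussian_identity_rhs(A, x)
print(lhs, rhs)
```

The integrand is a Gaussian density up to a constant, centred at $Ax$ with covariance $A$, so the grid only needs to cover a few standard deviations around that point.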
The following result pertaining to Gaussian integrals is used repeatedly throughout the proofs. It is an immediate consequence of the definition of a multivariate normal distribution. See, e.g., [31, pp. 176–177] for a reference.
Proposition 4.2. Fix . We have for all invertible
References
- 1. Berthet Q, Rigollet P, Srivastava P. Exact recovery in the Ising blockmodel. Ann Statist. 2019;47(4):1805–34.
- 2. Löwe M, Schubert K. Exact recovery in block spin Ising models at the critical line. Electron J Stat. 2020;14:1796–815.
- 3. Contucci P, Ghirlanda S. Modelling society with statistical mechanics: an application to cultural contact and immigration. Qual Quant. 2007;41:569–78.
- 4. Lauritzen S. Graphical models. Oxford: Clarendon Press; 1996.
- 5. Holland PW, Blackmond Laskey K, Leinhardt S. Stochastic blockmodels: first steps. Soc Netw. 1983.
- 6. Abbe E. Community detection and stochastic block models: recent developments. J Mach Learn Res. 2018;18(177):1–86.
- 7. Mossel E, Neeman J, Sly A. Reconstruction and estimation in the planted partition model. Probab Theory Relat Fields. 2014;162(3–4):431–61.
- 8. Mossel E, Neeman J, Sly A. Belief propagation, robust reconstruction and optimal recovery of block models. Ann Appl Probab. 2016;26:2211–56.
- 9. Gao C, Ma Z, Zhang AY, Zhou HH. Achieving optimal misclassification proportion in stochastic block models. J Mach Learn Res. 2017;18.
- 10. Amini AA, Levina E. On semidefinite relaxations for the block model. Ann Statist. 2018;46(1):149–79.
- 11. Geman D, Geman S. Bayesian image analysis. In: Bienenstock E, Soulié FF, Weisbuch G, editors. Disordered systems and biological organization. vol. 20 of NATO ASI Series. Berlin, Heidelberg: Springer; 1986.
- 12. Besag J. On the statistical analysis of dirty pictures. J Roy Statist Soc Ser B. 1986;48(3):259–302.
- 13. Lauritzen S, Sheehan NA. Graphical models for genetic analyses. Statist Sci. 2003;18:489–514.
- 14. Chen J, Yuan B. Detecting functional modules in the yeast protein-protein interaction network. Bioinformatics. 2006;22(18):2283–90. pmid:16837529
- 15. Ballesteros M, Garro G. A model and a numerical scheme for the description of distribution and abundance of individuals. J Math Biol. 2022;85(4):31. pmid:36114925
- 16. Linden G, Smith B, York J. Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput. 2003;7(1):76–80.
- 17. Manning C, Schütze H. Foundations of statistical natural language processing. Cambridge, MA: MIT Press; 1999.
- 18. Goldenberg A, Zheng AX, Fienberg SE, Airoldi EM. A survey of statistical network models. Foundations and Trends® in Machine Learning. 2010;2(2):129–233.
- 19. Brock WA, Durlauf SN. Discrete choice with social interactions. Review of Economic Studies. 2001;68(2):235–60.
- 20. Opoku AA, Edusei KO, Ansah RK. A conditional Curie-Weiss model for stylized multi-group binary choice with social interaction. J Stat Phys. 2018.
- 21. Löwe M, Schubert K, Vermet F. Multi-group binary choice with social interaction and a random communication structure – a random graph approach. Physica A. 2020;556.
- 22. Kirsch W, Toth G. Optimal weights in a two-tier voting system with mean-field voters. Soc Choice Welf. 2025.
- 23. Fedele M, Contucci P. Scaling limits for multi-species statistical mechanics mean-field models. J Stat Phys. 2011;144:1186–205.
- 24. Kirsch W, Toth G. Two groups in a Curie-Weiss model. Math Phys Anal Geom. 2020;23(2).
- 25. Kirsch W, Toth G. Two Groups in a Curie–Weiss model with heterogeneous coupling. J Theor Probab. 2019;33(4):2001–26.
- 26. Kirsch W, Toth G. Limit theorems for multi-group Curie–Weiss models via the method of moments. Math Phys Anal Geom. 2022;25(4).
- 27. Ellis R. Entropy, large deviations, and statistical mechanics. Wiley; 1985.
- 28. Knöpfel H, Löwe M, Schubert K, Sinulis A. Fluctuation results for general block spin Ising models. J Stat Phys. 2020;178:1175–200.
- 29. Dembo A, Zeitouni O. Large deviations techniques and applications. 2nd ed. New York: Springer; 1998.
- 30. Toth G. Correlated voting in multipopulation models, two-tier voting systems, and the democracy deficit. FernUniversität in Hagen; 2020.
- 31. Durrett R. Probability. Theory and Examples. 5th ed. Thomson; 2019.