Enhanced storage capacity with errors in scale-free Hopfield neural networks: An analytical study

The Hopfield model is a pioneering neural network model with associative memory retrieval. The analytical solution of the model in mean field limit revealed that memories can be retrieved without any error up to a finite storage capacity of O(N), where N is the system size. Beyond the threshold, they are completely lost. Since the introduction of the Hopfield model, the theory of neural networks has been further developed toward realistic neural networks using analog neurons, spiking neurons, etc. Nevertheless, those advances are based on fully connected networks, which are inconsistent with recent experimental discovery that the number of connections of each neuron seems to be heterogeneous, following a heavy-tailed distribution. Motivated by this observation, we consider the Hopfield model on scale-free networks and obtain a different pattern of associative memory retrieval from that obtained on the fully connected network: the storage capacity becomes tremendously enhanced but with some error in the memory retrieval, which appears as the heterogeneity of the connections is increased. Moreover, the error rates are also obtained on several real neural networks and are indeed similar to that on scale-free model networks.


Introduction
Human neuroscience has attracted increasing attention through various studies. Among such activities, the retrieval or recall of associative memory in neural networks is a historically noticeable issue [1,2]. Associative memory means the ability to link, learn and remember the relationship between independent and unrelated items such as one man's name (e.g., Albert Einstein) and his previous achievement (e.g., E = mc 2 ). Neural network models of associative memory have been used to explain how the brain stores and recalls long-term memories. These models incorporate the so-called Hebbian rule for a cell assembly, a group of excitatory neurons mutually coupled by strong synapses [3]: Memory storage occurs when a cell assembly is created by Hebbian synaptic plasticity, and memory retrieval or recall occurs when the neurons in the cell assembly are activated by a stimulus. Neural network models of associative memory assume that information exists alternatively as neural activity or as synaptic PLOS  is the degree exponent. Such networks are referred to as scale-free (SF) networks. Recent electrophysiological data [11] also revealed that the distribution of synaptic connections strength follows a log-normal distribution, which has a heavy tail similar to that in the powerlaw distribution. Moreover, it is known that the brain damages of schizophrenia [12,13] or comatose patients [14] are caused by the malfunctioning of hub neurons. Thus, hub neurons that have many connections is likely to exist in a brain network. This paper aims to investigate the properties of the retrieval patterns created by the Hopfield model on SF networks at finite temperature analytically. We also compare obtained results with previous numerical results obtained from fully connected networks and diverse SF networks such as the Barabási-Albert model with γ = 3 [15] and the Molloy-Reed model with several values of γ [16]. We also use the Chung-Lu model [17,18] to construct uncorrelated SF networks over the entire range of γ. The details of the Chung-Lu model are presented in the Method section. Particularly we consider the limit γ ! 2, which is the case observed in the fMRI data [19][20][21].
The main results of our studies are presented in Figs 1 and 2. In Fig 1, we present properties of the retrieval pattern for various degree exponents γ as a fuction of temperature T and storage rate a. When γ < 2.04 (Fig 1(e) and 1(f)), the retrieval phase spans most of the low-temperature (noise) region; thus memory retrieval in the system is appreciably enhanced compared with the one of the original Hopfield model [6,7]. In Fig 2, the error rate, a fraction of neurons which fails memory retrieval, is obtained as a function of storage rate a p/N, the number of stored patterns per neuron, at zero temperature T = 0. Remarkably when γ = 2.01, the error rate is almost zero when the number of stored patterns p is small, and gradually increases but is less than 0.3 even when p is increased to the total number of neurons N. This implies that the storage capacity becomes tremendously enhanced as the SF network becomes extremely heterogeneous in structure, but there occur some errors. We remark that the previous solution of the Hopfield model on fully connected networks [6,7] revealed that the error rate is zero up to a certain threshold, but beyond which it becomes one-half and the system falls into a total confusion state. Based on these results, we think that the solution of the Hopfield model on SF networks reflects more a normal brain in the point that it provides the case of imperfect memory retrieval with some error even when its storage capacity is small. Moreover, the result is timely in accord with a recent experimental discovery that storage capacity of brain is in the petabyte range, as much as entire web, ten times more than previously thought [22]. We hope that our result will provide some theoretical development for modeling associative memory networks in neuroscience.

Model development
Hopfield model on a given network. In the Hopfield model, each neuron at node i of a given neural network (denoted as G) has an Ising spin S i with two states, S i = +1 and S i = −1 representing excited and rest states for transmitting or not transmitting a signal, respectively [5]. The Hamiltonian (corresponding to energy of the system) is introduced as where the coupling strength J ij between two connected nodes (i, j) 2 G, known as the synapse Enhanced storage capacity with errors in scale-free Hopfield neural networks Here, m, q, and r are given by Eqs (29)(30)(31) of the S1 File, respectively. SG does the spinglass phase, in which m = 0, q > 0, and r > 0. In the P and SG phases, the retrieval of stored patterns is impossible. Thus, they are often referred to as the confusion phase. The retrieval phase is denoted as R, in which m > 0, q > 0, and r > 0. The retrieval of stored memory is possible. Finally, M does the mixed phase, in which the features of both the retrieval and the spin-glass phases coexist. As the degree exponent γ is decreased from infinity in a through γ = 2.01 in f, the retrieval phase not only intrudes into the region of the SG phase, but also raises the boundary of the phase P to a higher temperature region. Eventually the SG phase remains on the T = 0 axis when γ = γ c ' 2.04, in which the phase R spans most of the low-temperature region. Thus, memory retrieval is enhanced. The phase boundary was obtained by performing numerical calculations for the Chung-Lu SF networks with the system size N = 1000 and mean degree K = 5.0. Solid and dotted lines or curves indicate the second-order and first-order transitions, respectively. We note that the case a on ER network is nearly the same as that in mean field limit obtained in the original Hopfield model. efficacy, takes the Hebbian form, K is the mean degree, i.e., the average number of edges of the network G of size N. x m i is another quantity assigned to node i, which also has either +1 or −1. A collective quantity fx m i g represents a memory pattern denoted by μ that is stored in the system. The index μ runs μ = 1, . . ., p, which means that the number of memory patterns stored is p. Whereas x m i is fixed throughout the dynamics. Starting from some initial values of {S i (t = 0)}, the state of each spin is updated asynchronously as When ∑ j J ij S j (t) becomes zero, S i (t + 1) = +1 is assigned definitely. Enhanced storage capacity with errors in scale-free Hopfield neural networks As the updating is repeated, the energy given by (1) reduces and the system reaches a local minumum state. In this state, each spin S i becomes equivalent to x m i for a given pattern μ, and the stored memory is retrieved. The underlying mechanism for such converging behavior is explained in Section 2 of the S1 File. In real-world systems, however, the repeated dynamics may not be deterministic as Eq 3, but it may include some noise. To take into account of noise effect, the original study [6] of Hopfield model invoked the formalism developed in equilibrium statistical mechanics at finite temperature, in which temperature represents noise strength.
Ensemble average of the Hopfield model over different network configurations and stored patterns. Thus far, we consider the Hopfield model on a given network. However, connection profiles of SF networks can be different from sample to sample even though they follow the same degree distribution. Thus, we need to take average of physical quantities over the ensemble of different network configurations. To proceed, we consider the probability that a given SF network G exists in the ensemble. That is given as where f ij is the probability to connect a link between two nodes i and j. It was derived that [17,18]. The factor w i is a weight of node i reflecting heterogeneous degrees of a SF network. The explicit form of the weight factor is presented in the Method section. Then the ensemble average over different networks for any given physical quantity A is taken as where hÁ Á Ái K denotes the average over different graph configurations. A(G) represents any physical quantity obtained in a graph G. Next, we consider a situation in which stored patterns are not deterministic, but stochastically generated. This case is considered for the purpose of testing the efficiency of the algoritumc. Specifically, each pattern μ is created with the population of x m i ¼ 1 with probability 1/2 and that of x m i ¼ À 1 with probability 1/2 for each node i. Then, the probability that a pattern μ is created is given as We take the average of any given physical quantity A over the ensemble of different stored patterns as where hÁ Á Ái ξ is an average over the quenched disorder of x m i .

Analytic solutions
The order parameters. We characterize the phases of the Hopfield model by three quantities, which are often called the order parameters in statistical physics as follows: i) The overlap parameter defined as m m a P N i w i hx m i S a i i, which represents the extent of which the μ-th pattern of memory ξ μ and the α-th state of the system S α overlap with each other. Thus, this quantity measures the retrieval success rate of the μ-th memory. We remark that the factor w i is required to take into account heterogeneous degrees. w i is proportional to the degree k i of neuron i [23][24][25][26]. Actually this weight w i is the same as that used in the Chung-Lu model introduced in Methods. ii) The so-called spin glass order parameter q ab P N i w i hS a i S b i i representing the extent of which the two states α and β of the replica overlap each other. The replica is a quantity introduced in the spin-glass theory to resolve the technical difficulties in taking the ensemble average hÁ Á Ái K and hÁ Á Ái ξ introduced in Eqs (5) and (7). iii) A new quantity is introduced as r ab ðN=pÞ P p m¼2 m m a m m b . This quantity is necessary to derive the first two order parameters, and is interpreted as the sum of the effects of each non-retrieval pattern.
Next, we take the replica-symmetric solution by setting q αβ = q and r αβ = r for all α 6 ¼ β and m 1 a ¼ m for all α. Then the order parameters m, q and r at zero temperature are obtained as which are used for obtaining the error rate. Here, storage rate a is defined as the number of existing memory patterns p divided by the total number of neurons in a given network, i.e., the network size N. Therefore, the storage capacity a c is the maximum value of storage rate a. Detailed calculations for those parameters are presented in Section 4 of the S1 File. Phase diagram and error rates. The phase diagram we obtained are shown in Fig 1 in the T − a plane for different degree exponents γ but with fixed N = 1000 and K = 5.0. First, in Fig 1(a), we consider the case of the Erdős and Rényi (ER) network, equivalent to the limit γ ! 1. This phase diagram is nearly the same as that obtained on fully connected network in Ref. [7]. There exist four different phases: i) The paramagnetic phase denoted as P, in which m = 0, q = 0, and r = 0 because of thermal fluctuations. ii) The spin-glass phase denoted as SG, in which m = 0, q > 0, and r > 0. In those phases P and SG, the retrieval of stored patterns is impossible. Thus, they are referred to as the confusion phases. iii) The retrieval phase denoted as R, in which m > 0, q > 0, and r > 0, and in which the retrieval of stored memory is possible. Finally, the mixed phase denoted as M, in which the properties of both the retrieval and the spin-glass phases coexist.
The free energy of the system in each phase has also been investigated in Ref. [7]. In phase R, a stored pattern, for instance, fx m i g i is matched to a state of the system fS a i g i at the global minimum of free energy. However, in phase M, a stored pattern is matched to a state of the system in a metastable state, while the SG state lies at the global minimum.
For a = 0, which occurs when p $ OðNÞ in the limit N ! 1, there exist two phases, P and R. As a is increased slightly from a = 0, the SG and M phases appear between the phases P and R. The transition between the P and the SG phases is a second-order transition, whereas that between SG and M is a first-order transition. Likewise, the transition between M and R is also first order. At T = 0, the transition between R and M occurs at a m ' 0.114, and the transition between M and SG occurs at a c ' 0.143. Therefore, as temperature is lowered from a sufficiently high value, successive transitions occur following the steps P ! SG ! M ! R for a < a m . For a m < a < a c , successive transitions occur following the steps P ! SG ! M, and for a > a c , a transition from P ! SG occurs.
Second, we obtain the phase diagram for finite γ values in Fig 1(b)-1(e). That undergoes drastic changes depending on γ. Remarkably, the R phase intrudes into the region of the SG phase, but it also raises the boundary of the P phase to a higher-temperature region. As γ approaches 2.04 in Fig 1(e), we observe that the R phase prevails, whereas the SG phase shrinks until it only exists at T = 0 in the region a > 0.6. Such changes in the T − a diagrams can be understood analytically by examining the phase boundary between the P and SG phases. The details are presented in Section 5 of the S1 File.
Third, we consider a particular case, the noiseless case T = 0 in Fig 2. The order parameters m, q and r are given in Eqs (8)(9)(10). The figures show the behavior of the error rates n e as a function of a with N = 1000 and K = 5.0. The error rate means the relative error of the neural networks and is defined as n e (1 − m)/2. This figure is obtained analytically for various degree exponent values including γ ' 2.0. We first consider the case in which γ ! 1, i.e., the ER limit. The result of this case is almost the same as that obtained in Ref. [7]. When a is less than a c ' 0.143, n e is very small, such that the error rate is negligible and the system is almost in the error-free state. The obtained value a c is close to a c ' 0.138, which was obtained on the fully connected network in Ref. [7]. As a reaches a c , n e suddenly jumps to 0.5 as shown in Fig 2(b). This means that for a > a c , the system is in the error-full state (i.e., the state of complete confusion). As γ decreases, this behavior persists up to γ c ' 2.7, and no longer holds for γ γ c . Note that the value of a c slightly decreases with decreasing γ, but their dependences are almost negligible.
Next, when γ approaches 2.0, n e noticeably changes from the step-function-like shape to a monotonously increasing one as shown in Fig 2(d). As γ is lowered further and approaches 2.0, the range of a for the state of complete confusion, with n e = 0.5, disappears and a c cannot be defined anymore. For instance, at γ = 2.01, the error rate becomes less than 0.3 for the entire range of a. Therefore, when γ is lowered to as small as 2.0, while the range of a for the state of n e = 0 (the state of perfect retrieval without error) is reduced, the range of a for the state of n e = 0.5, the region of the state of complete confusion disappears. The details are presented in Section 6 of the S1 File. These behaviors have never been observed yet in previous studies [15,16,27]. Fig 2 thus implies that as the network changes from the ER network to the extremely heterogeneous SF network with degree exponent two, the storage capacity becomes tremendously enhanced, but some error occurs. This result suggests that hubs play central roles in memory retrieval.

Conclusion and discussion
We obtained the results that as the network changes from a hub-absent network to a SF network with degree exponent just above two, the storage capacity becomes tremendously enhanced, but some error occurs. These features seem to be in accordance with what we experience in everyday life and with a recent discovery of enormously high storage capability in the human brain. Thus hubs, i.e., neurons with a large number of synapses, and other nodes with heterogeneous degrees in neural networks play a central role in enhancing the storage capacity. It is interesting that a normal human brain has such a structure, even though detailed structural properties such as modularity and degree correlation are not yet known. We consider the effect of degree correlation near γ = 2.0 by performing a similar analysis of the static model [23][24][25][26], the degree correlation of which is disassortative between 2 < γ < 3. We find that correlated SF networks near γ = 2.0 + have lower values of n e than the uncorrelated ones. The details are presented in Section 8 of the S1 File. We also checked the error rate n e on several Enhanced storage capacity with errors in scale-free Hopfield neural networks real neural networks, having heavy-tailed degree distributions but with undetermined degree exponents due to the small network size. The obtained patterns of the error rates are indeed similar to the theoretical prediction. The details are presented in Section 9 of the S1 File.
Our results might provide some guidelines for constructing an artificial neural network, provided that it is constructed on the basis of the Hopfield model: If we want to construct an artificial neural network that is capable of perfect memory recall with a large value of a c , as a basic topology of the artificial neural network, it may be more appropriate to choose an ERtype random network or a fully connected one. However, if we prefer to construct an artificial neural network that supports an extended range of storage rate and tolerate a small range of errors, it may be more appropriate to choose an SF-type network with the degree exponent γ ' 2.0. In this case, we can construct an artificial neural network with relatively low cost compared with a fully-connected one which needs very many synapses of O(N 2 ).

Construction of scale-free networks
To construct uncorrelated SF networks, we use the Chung-Lu (CL) model [17,18]: We start with a fixed number of N vertices. Each vertex i (i = 1, 2, . . ., N) is assigned a weight where ν is a control parameter in the range [0, 1), and i 0 is constant given by ð0 n < 1=2Þ: A pair of vertices (i, j) is chosen with the probabilities w i and w j , respectively, and they are connected with an edge, unless the pair is already connected. This process is repeated NK/2 times. Then, the resulting network becomes an uncorrelated SF network following a power-law degree distribution, P d (k) * k −γ , where k denotes the degree and γ is the degree exponent with γ = 1 + 1/ν. In such random networks, the probability that a given pair of vertices (i, j) (i 6 ¼ j) is not connected by an edge, as denoted by 1 − f ij , is given by (1 − 2w i w j ) NK/2 ' exp(−NKw i w j ), while the connection probability f ij = 1 − exp(−NKw i w j ).
As a particular case, when we choose i 0 as 1 for all the values of ν, the CL model reduces to the static model [23][24][25][26], which has correlations for the range of 1/2 < ν < 1 [23,24]. Therefore, the weights for the static model are given by where ν is a control parameter in the range [0, 1), and z N ðnÞ P N j¼1 j À n ' N 1À n =ð1 À nÞ. Note that f ij ' NKw i w j for finite K, however, f ij ' 1 for 2 < γ < 3 and ij ( N 3−γ . For the Erdős-Rényi (ER) graph [28][29][30], ν becomes 0 and the weights of both models become w i = 1/N, independent of the index i. Since w i w j = 1/N 2 , the fraction of bonds present becomes f ij ' K/N and the total number of the connected edges L is NK/2. So K becomes the mean degree in the ER graph.
When K approaches N, this network becomes a fully-connected one or a regular lattice with infinite-range interaction. Note that previous studies on the Hopfield model focused mainly on such extreme cases of K ! N [2,6,7,31,32].
Supporting information S1 File. Supporting information for the main paper. This supporting information contains the detailed calculations of the free energy and order parameters using the replica analysis. (PDF) S1 Fig. A simple example of simulation of the Hopfield model. (a) A network of size N = 9 and mean degree K ¼ 5 3 . (b) Two patterns ξ 1 and ξ 2 are encoded by the Hebbian rule. In (c) and (d), we start from two random patterns. After some updates using Eq (1) of the S1 File, the patterns ξ 1 and ξ 2 are retrieved. and 1(f) of the main paper for the CL model, respectively. P represents paramagnetic phase, SG spin glass phase, and R retrieval phase. The SG phase remains only on the axis T = 0 when γ = γ c ' 2.35, then the R phase spans the entire region of a at the lower temperatures. Thus, the static model shows better memory retrieval than the CL model. To obtain the phase boundary, numerical calculations were performed for the static model with plugging N = 1000 and K = 5.0 into the formulas. Solid and dotted curves indicate the second-order and the firstorder transitions, respectively. Note that black dotted line in each panel near the T = 0 line represents the AT line (Eq (32) of the S1 File). Thus, replica-symmetric solution is valid over almost the entire region of the phase space. are drawn to compare the simulation data (•) with the analytic solution (red curve) using Eq (40) of the S1 File under the same conditions of N and L. Here, the γ values we used for w i in Eq (40) of the S1 File were 2.0035 (b), 2.0050 (d), and 2.0035 (f), respectively. (EPS) S1 Data Set. Supporting information for the " S7 Fig". This is another supporting information which contains the data set necessary to replicate the " S7 Fig". (XLS)