Abstract

We analyze the effect of network topology on the pattern stability of the Hopfield neural network in the case of general graphs. The patterns are randomly selected from a uniform distribution. We start the Hopfield procedure from some pattern v. An error in an entry e of v is the situation where, if the procedure is started at e, the value of e flips. Such an entry is called an instability point. Note that we disregard the value at e by the end of the procedure, as well as what happens if we start the procedure from another pattern or another entry of v. We measure the instability of the system by the expected total number of instability points over all the patterns.

Our main result is that the instability of the system does not depend on the exact topology of the underlying graph, but rather only on its degree sequence. Moreover, for a large number of nodes, the instability can be approximated by $m \sum_{i=1}^{n} \Phi\big(-\sqrt{d_i/(m-1)}\big)$, where m is the number of stored patterns, $\Phi$ is the standard normal distribution function, and $d_1, \ldots, d_n$ are the degrees of the nodes.

1  Introduction

The Hopfield model is one of the classic mathematical models of neuron functionality, playing an important role in computational neuroscience. The capacity of the Hopfield network is measured by the number of bit vectors that can be stored and retrieved in the neural network while preserving a certain level of stability, namely, a certain bound on retrieval errors for the stored vectors. Previous investigations of the capacity of neural networks examined the relation between the (neural) network topology and the capacity and stability of the network. Analyses of the capacity of the Hopfield network in a complete graph appeared in the seminal works (Amit, Gutfreund, & Sompolinsky, 1985; McEliece, Posner, Rodemich, & Venkatesh, 1987). In addition, Löwe and Vermet (2011, 2014) and Bovier and Gayrard (1992) found the capacity of different types of random graphs. However, very little theoretical work has been done on other, more realistic, topologies. The only works that have considered other topologies are based on computer simulations rather than theoretical calculations of the capacity and stability. In this work, we calculate exactly the Hopfield network instability for a network of general topology and prove that it depends solely on the graph's degree sequence.

The Hopfield network is an autoassociative neural network, introduced by Hopfield (1982). In this letter, we refer to the original Hopfield model: a graph G with n nodes that saves m vectors (patterns) in $\{-1,1\}^n$ such that $W_{ij} = \sum_u u_i u_j$ for $i \neq j$ and $W_{ii} = 0$, where $W_{ij}$ is the weight of the edge between node i and node j; $u_i, u_j$ are the values of the ith and jth entries in u; and $\sum_u$ indicates summation over all saved vectors u.

The recall starts with an input vector and ends after reaching a stable state, where no node $v_i$ changes its value according to the update rule $v_i \leftarrow \operatorname{sign}\big(\sum_{j} W_{ij}\, v_j\big)$.
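To make the procedure concrete, here is a minimal Python sketch of the storage rule and the asynchronous recall described above (the code and helper names are ours, not part of the original model description; we assume the common convention that a node keeps its current value when its input sums to zero):

import numpy as np

def hebbian_weights(patterns):
    # patterns: an m x n array with entries in {-1, +1}.
    W = patterns.T @ patterns      # W_ij = sum over saved vectors u of u_i * u_j
    np.fill_diagonal(W, 0)         # no self-connections: W_ii = 0
    return W

def recall(W, v, max_sweeps=100):
    # Asynchronous updates in a fixed order until a stable state is reached.
    v = v.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(v)):
            h = W[i] @ v                                   # input to node i
            s = v[i] if h == 0 else (1 if h > 0 else -1)   # sign, keeping ties
            if s != v[i]:
                v[i] = s
                changed = True
        if not changed:
            break                  # stable state: no node changed its value
    return v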

As in other mechanisms of data storage, the Hopfield network has a certain capacity. In this case, the capacity is defined in terms of the number of vectors that can be saved without errors when recalling any saved vector. Another definition of capacity is error tolerant: how many vectors can be saved so that the number of errors remains below some predetermined threshold. As can be understood from the definition of capacity, the probability of an error depends on the number of vectors saved in the net.

There are several ways of counting errors in the Hopfield network. One way is to start the Hopfield algorithm from a random vector and measure the Hamming distance (Hamming, 1950) of the result from the closest initially stored vector. Another way is to start from an initially saved vector, run the Hopfield algorithm, and measure the Hamming distance of the result from the initial vector. McEliece et al. (1987) showed that, asymptotically, no more than $n/(4\ln n)$ vectors can be saved so that all of them may be retrieved perfectly for the complete graph. Amit et al. (1985) showed, using the replica trick, that for the complete graph, the level of error (as defined there, different from our notion of error) increases very fast when the number of saved vectors is greater than approximately $0.138n$. Following that, other work was done to find lower bounds with rigorous proofs. Newman (1988) first obtained a lower bound of $0.056n$. This was improved by Loukianova (1997) to $0.071n$, further improved by Talagrand (1998), and finally by Feng, Shcherbina, and Tirozzi (2001) to $0.113n$. One of the interesting aspects of the study of the Hopfield network is the effect of graph topology when dealing with a noncomplete graph. Graphs with the same number of vertices and edges but different topologies may differ substantially in their capacity under the Hopfield model. Simulations performed by Jianquan, Juan, Jinde, and Zhiqiang (2006) and McGraw and Menzinger (2003) estimated the capacities for random, regular, small-world (Watts & Strogatz, 1998), and scale-free (Barabási & Albert, 1999) graph topologies. Löwe and Vermet calculated the capacity of Erdős–Rényi random graphs G(n,p) in ranges of p that yield sparse graphs (Löwe & Vermet, 2011), as well as the case of zero errors in various random graph topologies (Löwe & Vermet, 2014).

In this work we focus on the measure of pattern stability. An instability point is an entry $v_i$ in a stored vector v for which the value of the vertex is different from $\operatorname{sign}\big(\sum_j W_{ij} v_j\big)$ when starting the system from v. Denoting by I the number of instability points, we provide a convenient approximation of its expected value $E[I]$ for large n when the number of saved vectors m grows linearly with n, which includes the range of m where the model functions as an associative memory with a small number of errors as well as the range where the number of errors increases and the model no longer functions as an associative memory (Hertz, Palmer, & Krogh, 1991). Note that $E[I]$ depends only on the degree sequence of G.

2  Instability Solely Depends on the Degree Sequence

We calculate the expected number of instability points first in the complete graph case and then for a general graph. Denote by $\mathcal{E}_i$ the event that the ith entry $v_i$ of one of the initial vectors v is an instability point. This event occurs if
$$v_i \sum_{j\neq i} W_{ij}\, v_j \;=\; v_i \sum_{j\neq i}\sum_{u} u_i u_j v_j \;<\; 0. \qquad (2.1)$$
It will be convenient to separate in equation 2.1 the contribution of v itself from that of all other patterns:
$$v_i \sum_{j\neq i}\sum_{u} u_i u_j v_j \;=\; (n-1) + \sum_{u\neq v}\sum_{j\neq i} u_i u_j v_i v_j. \qquad (2.2)$$
Thus, $v_i$ is an instability point if the right-hand side of equation 2.2 is negative. Namely, the probability that $v_i$ is an instability point is
$$P(\mathcal{E}_i) \;=\; P\Big((n-1) + \sum_{u\neq v}\sum_{j\neq i} u_i u_j v_i v_j < 0\Big).$$
For $u \neq v$ and $j \neq i$, define random variables $T_{uj}$ by $T_{uj} = u_i u_j v_i v_j$. The four variables in the product are independent, and each assumes the values 1 and $-1$ with probabilities $\tfrac{1}{2}$ each. Hence, so does the variable $T_{uj}$. Note that while some of the four factors in each $T_{uj}$ appear in other $T_{u'j'}$s as well, the factor $u_j$ appears only in $T_{uj}$. Thus, the $T_{uj}$s are independent. Due to symmetry,
$$P(\mathcal{E}_i) \;=\; P\Big(\sum_{u\neq v}\sum_{j\neq i} T_{uj} < -(n-1)\Big) \;=\; P\Big(\sum_{u\neq v}\sum_{j\neq i} T_{uj} > n-1\Big).$$
Now the variables $T'_{uj} = (T_{uj}+1)/2$ are $\operatorname{Bernoulli}\big(\tfrac12\big)$-distributed, and it will be more convenient to express the probabilities in terms of these variables:
$$P(\mathcal{E}_i) \;=\; P\bigg(\sum_{u\neq v}\sum_{j\neq i} T'_{uj} < \frac{(m-1)(n-1)-(n-1)}{2}\bigg).$$
Put $T = \sum_{u\neq v}\sum_{j\neq i} T'_{uj}$. Since the variables $T'_{uj}$ are independent, $T$ is $\operatorname{Bin}\big((m-1)(n-1), \tfrac12\big)$-distributed. Since n is large, we may use the central limit theorem to obtain
$$P(\mathcal{E}_i) \;=\; P\bigg(T < \frac{N-(n-1)}{2}\bigg) \;\approx\; P\Bigg(Z < -\sqrt{\frac{n-1}{m-1}}\,\Bigg),$$
where $N = (m-1)(n-1)$ and Z is a standard normal. Hence:
$$P(\mathcal{E}_i) \;\approx\; \Phi\Bigg(-\sqrt{\frac{n-1}{m-1}}\,\Bigg),$$
which is equal to the probability of an error in Hertz et al. (1991). Since all the entries of all the vectors behave in the same way, the expected number of instability points is
$$E[I] \;=\; m\,n\,P(\mathcal{E}_1),$$
where $\mathcal{E}_1$ is the event for the first entry of (say) the first random vector. For large n, where m and n are of the same order,
$$E[I] \;\approx\; m\,n\,\Phi\Big(-\sqrt{n/m}\Big).$$
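As a quick sanity check of this approximation, the expected number of instability points on the complete graph can be estimated by direct simulation. The following sketch (our code, with an assumed seed and helper name) counts the entries whose input disagrees with their stored value and compares the average with the formula above:

import numpy as np
from scipy.stats import norm

def count_instability_points(W, patterns):
    # An entry v_i is an instability point when v_i times its input is negative.
    fields = patterns @ W.T                  # fields[v, i] = sum_j W_ij * v_j
    return int(np.sum(patterns * fields < 0))

rng = np.random.default_rng(0)
n, m, trials = 100, 25, 100
counts = []
for _ in range(trials):
    patterns = rng.choice([-1, 1], size=(m, n))
    W = patterns.T @ patterns
    np.fill_diagonal(W, 0)
    counts.append(count_instability_points(W, patterns))

print(np.mean(counts))                                  # simulated E[I]
print(m * n * norm.cdf(-np.sqrt((n - 1) / (m - 1))))    # CLT approximation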
For a general graph G, the calculations repeat verbatim, except that the relevant j's for node i are only its neighbors in G. That is, denoting by $N(i)$ the set of neighbors of i and by $d_i = |N(i)|$ the degree of i, the probability of error when starting from $v_i$ is
$$P(\mathcal{E}_i) \;=\; P\Big(d_i + \sum_{u\neq v}\sum_{j\in N(i)} T_{uj} < 0\Big).$$
As before,
$$P(\mathcal{E}_i) \;=\; P\bigg(T^{(i)} < \frac{(m-2)\,d_i}{2}\bigg),$$
where $T^{(i)} = \sum_{u\neq v}\sum_{j\in N(i)} T'_{uj}$ is $\operatorname{Bin}\big((m-1)\,d_i, \tfrac12\big)$-distributed. Thus,
$$P(\mathcal{E}_i) \;=\; P\bigg(\operatorname{Bin}\Big((m-1)\,d_i, \tfrac12\Big) < \frac{(m-2)\,d_i}{2}\bigg). \qquad (2.3)$$
Assuming that $d_i$ is large, we have
$$P(\mathcal{E}_i) \;\approx\; \Phi\Bigg(-\sqrt{\frac{d_i}{m-1}}\,\Bigg).$$
Thus, the expected number of instability points is
$$E[I] \;=\; m \sum_{i=1}^{n} P(\mathcal{E}_i).$$
Therefore,
$$E[I] \;\approx\; m \sum_{i=1}^{n} \Phi\Bigg(-\sqrt{\frac{d_i}{m-1}}\,\Bigg). \qquad (2.4)$$
We can conclude that, surprisingly, the expected number of instability points depends solely on the set of graph node degrees and is not a function of other parameters of the graph.
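Evaluating equation 2.4 therefore requires nothing beyond the degree sequence. A direct transcription into Python (the function name is ours) is:

import numpy as np
from scipy.stats import norm

def expected_instability(degrees, m):
    # Equation 2.4: E[I] is approximately m * sum_i Phi(-sqrt(d_i / (m - 1))).
    d = np.asarray(degrees, dtype=float)
    return m * norm.cdf(-np.sqrt(d / (m - 1))).sum()

By construction, any two graphs with the same degree sequence yield the same value, whatever their topology.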
Note that in the case of a scale-free graph topology, the expected number of nodes with degree k is determined by only three parameters: for large n, it is approximately $c\,n\,k^{-\gamma}$, where n is the total number of nodes and c and $\gamma$ are two constants determined by the method of constructing the scale-free network (Barabási & Albert, 1999). Thus, in this case we can approximate the expected number of instability points by
$$E[I] \;\approx\; m \sum_{k=k_0}^{n-1} c\,n\,k^{-\gamma}\,\Phi\Bigg(-\sqrt{\frac{k}{m-1}}\,\Bigg), \qquad (2.5)$$
where $k_0$, the minimum degree in the graph, is a small constant that is also determined by the construction method.
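In the same spirit, equation 2.5 can be evaluated from the three construction parameters alone; the following sketch (ours, taking c, gamma, and k0 as inputs) sums the power-law approximation over all possible degrees:

import numpy as np
from scipy.stats import norm

def expected_instability_scale_free(n, m, c, gamma, k0):
    # Equation 2.5: sum over degrees k of the expected number of nodes of
    # degree k (approximately c * n * k**(-gamma)) times Phi(-sqrt(k/(m-1))).
    ks = np.arange(k0, n, dtype=float)
    return m * np.sum(c * n * ks ** (-gamma) * norm.cdf(-np.sqrt(ks / (m - 1))))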

One may ask whether, in fact, the degree sequence determines not only $E[I]$ but the exact distribution of I. The following example shows that this is not the case. Consider saving three vectors on the 2-regular graphs $G_1$ and $G_2$ depicted in Figure 1. A simple calculation of the probability function of the random variable I shows that the two distributions are quite different, even though both graphs are 2-regular and, in particular, have the same degree sequence. The expected values of both distributions are easily calculated to be 1.125.
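Since a 2-regular graph on six nodes is either a 6-cycle or two disjoint triangles, the distribution of I in this example can be tabulated by brute force over all $2^{18}$ choices of three patterns. A sketch of such a computation (our code; as in the derivation above, an entry counts as an instability point only when its input is strictly misaligned):

import itertools
from collections import Counter
import numpy as np

def adjacency(edges, n=6):
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A

def instability_distribution(adj, m=3):
    # Tabulate the distribution of I over all pattern choices (tiny graphs only).
    n = len(adj)
    dist = Counter()
    for bits in itertools.product([-1, 1], repeat=m * n):
        patterns = np.array(bits).reshape(m, n)
        W = (patterns.T @ patterns) * adj     # keep only the edges of the graph
        dist[int(np.sum(patterns * (patterns @ W.T) < 0))] += 1
    return dist

cycle6 = adjacency([(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])
two_triangles = adjacency([(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])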

Figure 1: Two 2-regular graphs of size 6.

3  Experimental Results

To test the accuracy of equation 2.4, we conducted computer simulations and compared them with our theoretical results; Table 1 summarizes the comparison. We used three groups of graphs, all with the same total number of nodes and edges but with distinct degree sequences. Each group contained five randomly selected graphs with 100 nodes and an average degree of 10. For each graph, 100 experiments were made with 5 stored vectors (m = 5) and 100 more experiments with 25 stored vectors (m = 25). We also compared simulations on scale-free graphs to the result of equation 2.4: five scale-free graphs with 5 stored vectors (m = 5) and five scale-free graphs with 25 stored vectors (m = 25), with 100 experiments for each of the 10 graphs. All the scale-free graphs were built according to the Barabási-Albert model (1999) with the same model parameters. Note that we have not compared the experimental results to equation 2.5, since the degree distribution becomes close to $c\,n\,k^{-\gamma}$ only for n much larger than 5000.
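For reference, one group of the experiment can be reproduced along the following lines (a sketch under assumptions of ours: graphs are drawn with networkx's configuration model, and self-loops and multi-edges are discarded, which may perturb the degree sequence slightly):

import networkx as nx
import numpy as np

def run_group(degrees, m, n_graphs=5, n_experiments=100):
    # Average number of instability points over random graphs with (roughly)
    # the given degree sequence, to be compared with equation 2.4.
    rng = np.random.default_rng(0)
    totals = []
    for g in range(n_graphs):
        G = nx.Graph(nx.configuration_model(degrees, seed=g))
        G.remove_edges_from(nx.selfloop_edges(G))
        A = nx.to_numpy_array(G) > 0
        for _ in range(n_experiments):
            patterns = rng.choice([-1, 1], size=(m, len(degrees)))
            W = (patterns.T @ patterns) * A
            totals.append(int(np.sum(patterns * (patterns @ W.T) < 0)))
    return float(np.mean(totals))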

Table 1: Average Number of Instability Points: Experimental versus Theoretical Results.

                               m = 5                      m = 25
Node Degrees            Simulation  Equation 2.4    Simulation  Equation 2.4
Group 1                    29.73       28.46          652.41      648.25
Group 2                    47.30       45.35          689.19      685.81
Group 3                    42.31       41.88          691.70      689.59
Barabási-Albert model      23.35       24.09        10,414.6   10,412.81

4  On the Importance of Instability Points

This work deals with initial instability points, that is, instability points in the originally saved vectors. One can also consider the capacity of the Hopfield model from the point of view of general instability points: points in nonoriginal vectors that will change their value if selected by the algorithm. Moreover, by considering the route an original vector with initial instability points takes until it reaches a stable state, one can learn about the connection between instability points and the final Hamming distance that the originally saved vector reaches. For example, if m is close to the critical value $0.138n$, there will be, on average, approximately $0.0036n$ instability points per saved vector. However, the average Hamming distance between the saved and the final vector is approximately $0.015n$ (Hertz et al., 1991). This is due to an avalanche effect that causes new instability points to emerge while the values of the current instability points are flipped. Without this avalanche, the final Hamming distance between an originally saved vector and the final stable vector would equal the number of initial instability points. Understanding the dynamics of this avalanche may offer more insight into the capacity of the Hopfield model.

As an example of the dynamics of instability points, our simulations show a strong linear correlation between the sum of inputs of a point (in a complete graph, if this sum is less than zero, the point is an instability point) and the change in this sum after flipping the point with the lowest input in the entire vector. If we sort all the points in a saved vector according to their input and flip the one with the lowest input, $p_0$, then the input to every other point changes, because it is affected by the new value of the flipped point. The change in the input of each point $p_i$ is linearly correlated with the original input of $p_i$. This means that a point that is close to being an instability point has a higher probability of actually becoming a new instability point after an original instability point is flipped on the way to the final stable vector, which can cause the avalanche.

It is known that topology influences the final Hopfield capacity (Jianquan et al., 2006; McGraw & Menzinger, 2003). However, given the importance of instability points in the process, the fact that their initial number is identical in graphs with the same degree sequences, regardless of topology, may provide a better understanding of the process.
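To observe this avalanche directly, one can repeatedly flip the entry with the lowest input and track how many instability points remain after each flip. The sketch below (our code, illustrating the experiment described in this section) records that trace; since every such flip strictly decreases the Hopfield energy, the loop always terminates:

import numpy as np

def avalanche_trace(W, v):
    # Flip the entry with the lowest input v_i * (sum_j W_ij v_j) until the
    # vector is stable, recording the number of instability points after
    # each flip; points turning unstable mid-run constitute the avalanche.
    v = v.copy()
    trace = []
    while True:
        inputs = v * (W @ v)
        trace.append(int(np.sum(inputs < 0)))
        i = int(np.argmin(inputs))
        if inputs[i] >= 0:           # stable: no negative inputs remain
            return trace
        v[i] = -v[i]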

Acknowledgments

This work was partially supported by the Rita Altura Trust Chair in Computer Sciences, Lynne and William Frankel Center for Computer Sciences, and Israel Science Foundation (grant number 428/11).

References

Amit, D., Gutfreund, H., & Sompolinsky, H. (1985). Storing infinite numbers of patterns in a spin-glass model of neural networks. Phys. Rev. Lett., 55, 1530–1533.
Barabási, A., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509–512.
Bovier, A., & Gayrard, V. (1992). Rigorous bounds on the storage capacity of the dilute Hopfield model. J. Stat. Phys., 69, 597–627.
Feng, J., Shcherbina, M., & Tirozzi, B. (2001). On the critical capacity of the Hopfield model. Commun. Math. Phys., 216, 139–177.
Hamming, R. W. (1950). Error detecting and error correcting codes. Bell System Technical Journal, 29, 147–160.
Hertz, J., Palmer, R. G., & Krogh, A. S. (1991). Introduction to the theory of neural computation. Cambridge: Perseus Publishing.
Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. U.S.A., 79, 2554–2558.
Jianquan, L., Juan, H., Jinde, C., & Zhiqiang, G. (2006). Topology influences performance in the associative memory neural networks. Phys. Lett. A, 354, 335–343.
Loukianova, D. (1997). Lower bounds on the restitution error in the Hopfield model. Probab. Theory Relat. Fields, 107, 161–176.
Löwe, M., & Vermet, F. (2011). The Hopfield model on a sparse Erdős–Rényi graph. J. Stat. Phys., 143, 205–214.
Löwe, M., & Vermet, F. (2014). Capacity of an associative memory model on random graph architectures. CoRR, arXiv:1303.4542.
McEliece, R., Posner, E. C., Rodemich, E. R., & Venkatesh, S. (1987). The capacity of the Hopfield associative memory. IEEE Trans. Inf. Theory, 33, 461–482.
McGraw, P., & Menzinger, M. (2003). Topology and computational performance of attractor neural networks. Phys. Rev. E, 68, 047102.
Newman, C. (1988). Memory capacity in neural networks. Neural Netw., 1, 223–238.
Talagrand, M. (1998). Rigorous results for the Hopfield model with many patterns. Probab. Theory Relat. Fields, 110, 177–276.
Watts, D., & Strogatz, S. (1998). Collective dynamics of "small-world" networks. Nature, 393, 440–442.