Abstract

Hopfield neural networks (HNNs) have proven useful in solving optimization problems that require fast response times. However, the original analog model has an extremely high implementation complexity, making discrete implementations more suitable. Previous work has studied the convergence of discrete-time and quantized-neuron models but has limited the analysis to either two-state neurons or serial operation mode. Nevertheless, two-state neurons have poor performance, and serial operation modes lose the fast convergence that is characteristic of analog HNNs. This letter is the first in the field to analyze the convergence and stability of quantized Hopfield networks (QHNs) with more than two states operating in fully parallel mode. Moreover, this letter presents some further analysis on the energy minimization of this type of network. The main conclusion drawn is that QHNs operating in fully parallel mode always converge to a stable state or a cycle of length two and that any stable state is a local minimum of the energy.

1.  Introduction

The main feature of iterative algorithms is that new solutions are obtained from the result of a previous computation, and with each update, the solution moves closer to the final optimum. This behavior is useful in minimization problems, where the algorithm continuously approaches the minimum. Algorithms like Newton's method or steepest descent (Moon & Stirling, 2000) have proven their good performance in finding the minimum of functions in $\mathbb{R}^N$.
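As a minimal illustration of this iterative behavior (our own sketch, not taken from the cited references; the function and parameter names are hypothetical), a steepest-descent iteration can be written as follows:

```python
import numpy as np

def steepest_descent(grad, x0, step=0.1, n_iter=100):
    """Repeatedly move against the gradient so that each update brings the
    iterate closer to a (local) minimum."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = x - step * grad(x)
    return x

# Example: minimize f(x) = ||x||^2, whose gradient is 2x.
x_min = steepest_descent(lambda x: 2.0 * x, x0=[1.0, -2.0])
```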

Hopfield proposed the use of recurrent neural networks for solving optimization problems via the minimization of an energy function (Hopfield & Tank, 1985). From a hardware point of view, the Hopfield neural network (HNN) is an analog network composed of several operational amplifiers interconnected by resistors (Hopfield, 1984). Each neuron output is bounded between a maximum and a minimum value, which reduces the search space to an N-dimensional hypercube. Neurons are considered active or inactive depending on whether their outputs exceed a threshold. The stability of this kind of network is guaranteed when the energy function can be described as a Lyapunov function (Haykin, 1999).

Starting from Hopfield's work, HNNs have been successfully employed in several practical problems because of their fast convergence, which is a direct consequence of the parallel interworking of neurons (see the examples in Lázaro & Girma, 2000, and Ahn & Ramakrishna, 2004, although many more can be found in the literature). In these applications, HNNs can replace other heuristics with worse performance whose use is motivated only by the need for a fast response time.

1.1.  Quantized Hopfield Networks.

The dynamics of the discrete-time and quantized-neuron model, also known as quantized Hopfield networks (QHNs), is, in general, different from that of the continuous-neuron model (Bousoño-Calzón & Salcedo-Sanz, 2004). Although the original HNN is a hardware model using analog circuits, the enormous size of the hardware network for a large number of neurons and the difficulty of accurately implementing the resistor values, which can change the network behavior, have made QHNs implemented on digital devices, like field-programmable gate arrays (FPGAs), the best HNN implementation option. Therefore, QHNs are used in computer simulations and, most of all, in realistic hardware implementations of HNNs, since the quantization can be conveniently adjusted to the limits of the digital device. As the quantization step is reduced, QHNs tend to the digital implementation of continuous HNNs (CHNs). Section 2 presents a comparison of QHNs and CHNs. First, let us formally define QHNs.

Let Q be a QHN of N neurons uniquely defined by the pair (T, i), where $T = [T_{ij}]$ is an $N \times N$ symmetric matrix and $i = [I_i]$ is a vector of N elements,1 where N is the number of neurons. The network state at iteration t is defined by the neuron outputs $v(t) = [V_i(t)]$, which are updated by $v(t+1) = v(t) + \Delta v(t)$, that is, $V_i(t+1) = V_i(t) + \Delta V_i(t)$. Neuron outputs are quantized over the interval [0, 1] in steps of $1/p$; thus, $V_i(t) \in S = \{0, \tfrac{1}{p}, \tfrac{2}{p}, \ldots, 1\}$, where S is the set of all possible states of a neuron of Q. An energy function defined as
$$E(t) = -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} T_{ij}\, V_i(t)\, V_j(t) - \sum_{i=1}^{N} I_i\, V_i(t) \tag{1.1}$$
is associated with each QHN Q. If neuron i is updated at iteration t, then
$$\Delta V_i(t) = \begin{cases} +\tfrac{1}{p} & \text{if } \dfrac{\partial E(t)}{\partial V_i} < 0 \text{ and } V_i(t) < 1,\\[4pt] -\tfrac{1}{p} & \text{if } \dfrac{\partial E(t)}{\partial V_i} > 0 \text{ and } V_i(t) > 0,\\[4pt] 0 & \text{otherwise,} \end{cases} \tag{1.2}$$
where the energy gradient is
$$\frac{\partial E(t)}{\partial V_i} = -\sum_{j=1}^{N} T_{ij}\, V_j(t) - I_i, \tag{1.3}$$
and $\Delta V_i(t) = 0$ if the neuron is not updated. Depending on how neurons are updated, several modes of operation have been identified (Bruck & Goodman, 1988). Q operates in a serial mode if one neuron is updated at each iteration t, in a parallel mode if n < N neurons are updated, or in fully parallel mode if all N neurons are updated at each iteration.
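As an illustration of these dynamics, the following sketch implements the energy of equation 1.1 and one fully parallel update step following equations 1.2 and 1.3 as reconstructed above (the NumPy-based implementation and the function names are ours, not the authors'):

```python
import numpy as np

def energy(V, T, I):
    """Hopfield energy of equation 1.1 (V is a float vector of neuron outputs)."""
    return -0.5 * V @ T @ V - I @ V

def qhn_step(V, T, I, p):
    """One fully parallel QHN update: every neuron moves one quantization
    step (1/p) against the sign of its energy gradient, staying inside [0, 1]."""
    grad = -(T @ V + I)                  # dE/dV_i, equation 1.3
    dV = np.zeros_like(V)
    dV[(grad < 0) & (V < 1)] = 1.0 / p   # negative gradient -> increase the output
    dV[(grad > 0) & (V > 0)] = -1.0 / p  # positive gradient -> decrease the output
    return V + dV                        # equation 1.2 applied to all neurons at once
```

In serial mode, only one selected component of dV would be applied per iteration; in parallel mode, a subset of components.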

1.2.  State of the Art on Convergence and Stability of QHNs.

In Bruck and Goodman (1988), the behavior of binary Hopfield networks (BHNs) was studied: two-state neurons with p=1, operating in serial and fully parallel modes. The main conclusions were that BHNs operating in serial mode always converge to a stable state if the elements of the diagonal of T are nonnegative and that BHNs operating in fully parallel mode converge to a stable state or to a cycle of length two, that is, the BHN oscillates between two states, independently of T. Moreover, Bruck and Goodman demonstrated that the energy is a monotonically decreasing function in the serial operation mode, again provided the diagonal of T is nonnegative. These conclusions have important implications. First, the fact that BHN stability is guaranteed when operating in serial mode solves one of the main drawbacks of CHNs, which were criticized precisely because of their problems of instability (Wilson & Pawley, 1988; Forti, Manetti, & Marini, 1992). Second, the final conclusion of Bruck and Goodman entails that BHNs in serial operation mode evolve toward a local minimum of the energy function. Therefore, they exhibit the same ability as CHNs for solving optimization problems with a cost function to be minimized, which is the end purpose of these mathematical tools. However, this conclusion is valid only for the serial operation mode, which entails a significant deterioration of the system response time, since only one neuron is updated per iteration.

Two-state neurons are a simplification of the continuous-neuron model and assume that neurons can take only two values in their evolution. Hopfield (1984) proposed these for solving convergence problems that he detected in his original model. Nevertheless, the use of BHNs has severe consequences, as shown in Calabuig, Monserrat, Gómez-Barquero, and Lázaro (2006), and their outcomes are very poor compared with CHNs (Joya, Atencia, & Sandoval, 2002). When CHNs and QHNs are compared, the performance of a QHN depends on the shape of the energy function, the number of neurons, and the value of p. Obviously, with a greater p, QHNs are more similar to CHNs. For these reasons, QHNs with p>1 are preferable, although high values of p require longer response times.

The generalization of BHNs to QHNs was originally proposed by Matsuda (1999a), and many others have used them since then (Bousoño-Calzón & Salcedo-Sanz, 2004; Matsuda, 1999b). The QHNs proposed by Matsuda operate in serial mode. Considering this mode of operation, Matsuda (1999a) demonstrated the convergence of QHNs to a local minimum of the energy, which reinforces the same statement made in Bruck and Goodman (1988) for BHNs. Moreover, using a modified dynamics, not exactly the one shown in equation 1.2, the convergence to a minimum was ensured independent of the diagonal of T.

1.3.  Objectives of This Work.

Although this letter does not focus on demonstrating the good performance of QHNs, since this issue has already been addressed in Calabuig et al. (2006) and Joya et al. (2002), this letter demonstrates, in section 2, the power of QHNs operating in fully parallel mode. Our main goal is to prove that this type of network can be implemented in a way such that it is fast and stable. With this aim, this letter extends the work of Bruck and Goodman (1988) and Matsuda (1999a) and analyzes the convergence and stability of QHNs with p>1 and fully parallel operation mode. Section 3 demonstrates that this type of network converges to a stable state or to a cycle of length two. Although the energy can increase from one iteration to the next, section 4 shows that if the network converges, it does so toward a local minimum.

2.  Advantages of QHNs Operating in Fully Parallel Mode

One of the main drawbacks of the serial-operation mode is that it loses the fast convergence of HNNs, characterized by the parallel and simultaneous evolution of neurons, making serial operation useless for applications requiring fast response times.

Another drawback is that the convergence to a stable state is ensured only if the elements of the diagonal of T are nonnegative, which is not always true. If this condition is not satisfied, then the QHN may converge to cycles of unknown length, which makes detecting and escaping these cycles practically impossible. Matsuda (1999a) tried to solve this by slightly modifying the updating criterion of equation 1.2. Nevertheless, this approach was never compared with a fully parallel operation mode.

Another solution is to force zeros in the diagonal. HNNs are usually used in optimization problems that can be described with a set of binary variables.2 In fact, the use of binary variables represented by neurons is a common practice in the application of HNNs to engineering problems (see the examples in Hopfield & Tank, 1985; Lázaro & Girma, 2000; Calabuig, Monserrat, Gómez-Barquero, & Cardona, 2008). Therefore, the desired solutions are at the hypercube corners, since only the corners have a physical meaning in the original problem. At the hypercube corners, $V_i = 0$ or $V_i = 1$ and, consequently, $V_i^2 = V_i$. Thus, the quadratic term $T_{ii}$ can be integrated into the linear term $I_i$ without changing the energy value at the corners. Moreover, the corner with minimum energy remains the same. Therefore, with a simple modification of the energy function, the elements of the diagonal of T can be forced to be equal to zero. Nevertheless, this is not always a good option. Although the energy at the corners is not changed, the shape of the energy function inside the hypercube changes drastically and may produce incorrect outcomes.
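A minimal sketch of this zero-forcing modification (our own illustration, using the notation of equation 1.1): since $V_i^2 = V_i$ at the corners, the diagonal term can be folded into the linear term without changing any corner energy.

```python
import numpy as np

def force_zero_diagonal(T, I):
    """Return (T', I') with a zero diagonal and the same energy at every
    hypercube corner: at a corner V_i is 0 or 1, so V_i**2 == V_i and the
    diagonal contribution -0.5 * T_ii * V_i**2 can be moved into the linear
    term as -(I_i + 0.5 * T_ii) * V_i."""
    T_prime = T.copy()
    np.fill_diagonal(T_prime, 0.0)
    I_prime = I + 0.5 * np.diag(T)
    return T_prime, I_prime
```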

This section compares these two serial approaches, Matsuda's proposal (SQHN-Mat) and zero forcing (SQHN), with the fully parallel operation mode (PQHN) and CHNs. For this analysis, the energy function defined in Le and Pham (2005) for the M-queens problem (MQP) was used (see the appendix for further details about the simulations). In this case, all the elements of the diagonal of T are negative. All results depicted in Figures 1 and 2 are averaged over 1000 independent simulations with different initial states. Figure 1 shows the performance of the four approaches for an increasing number of queens. The CHN always reaches lower energy than the other three approaches, as shown in Figure 1a. The QHN approaches use 64 neuron states, that is, p = 63.

Figure 1: Performance for different numbers of queens.

Figure 2: Performance for different numbers of neuron states.

For the SQHN, the diagonal was cancelled to guarantee its convergence. In spite of having the same energy values at the corners as the PQHN, the change in the shape of the energy drastically affects the convergence of SQHNs. Indeed, the results show that the SQHN is far from the optimum.

Following from the previous example, the results of Figure 1b demonstrate that PQHNs need significantly fewer iterations to converge than the other three approaches. The time to converge is more than one order of magnitude below that of CHNs and, for large numbers of queens, more than two orders of magnitude below that of SQHN-Mat. Nevertheless, theorem 1 will prove that PQHNs may converge to a stable state or a cycle of length two, whereas the other three approaches always converge to a stable state. This convergence to cycles slightly increases the average energy with respect to CHN and SQHN-Mat, as shown in Figure 1a.

In order to have a measurement more suitable for comparing the performance of the four approaches, Figure 1c depicts the average number of iterations until a good solution is reached or, in other words, until the network converges to a valid solution of the problem. After a network ends its evolution, on detecting a stable state or a cycle of length two, neuron outputs are rounded to the extremes, 0 or 1. If that state is not a valid solution of the MQP, the algorithms are run again with a different initial state. The results of Figure 1c show the total number of iterations needed to reach a good solution. This figure shows that the PQHN has the best behavior, improving on the CHN and SQHN-Mat by a factor similar to that of Figure 1b. This means that despite the average energy being slightly higher, the final states of the PQHN, even after detecting a cycle,3 are not far from a good solution, and rounding solves the problem in many cases. In fact, the probability of reaching a valid solution is very similar for the PQHN and SQHN-Mat: 45.7% and 45.8%, respectively. CHNs reach good solutions 49.0% of the time, which is only slightly higher. Nevertheless, SQHNs could not converge to a valid solution in any of the trials. This is due to the change performed in the diagonal of T.

Finally, the behavior of PQHNs depends strongly on the number of states. Figure 2 compares different numbers of states for the 12-queens case. The SQHN is barely affected by the number of states in terms of average energy, because of the modification of the diagonal of T, whereas the PQHN and SQHN-Mat require a certain number of states to show good performance. This figure also demonstrates why BHNs are not a good option, since a QHN's performance worsens for such a low number of states. The average number of iterations required to reach a good solution is also depicted in Figure 2c. This figure shows that, at its best, SQHN-Mat needs many more iterations than the PQHN.

To sum up, this section has demonstrated, with a simple example, the advantages of QHNs operating in fully parallel mode. They exhibit fast convergence and good behavior compared with CHNs and SQHN-Mat, and much better performance than SQHNs. The rest of the letter studies the convergence and stability of PQHNs.

3.  Convergence and Stability

Bruck and Goodman (1988) proved that BHNs operating in a fully parallel mode converge to a stable state or a cycle of length two. The following theorem shows that any QHN with p>1 can be reformulated in the form of a BHN; hence, the same conclusion remains valid.

Theorem 1. 

Let Q be a QHN with p > 1 and N neurons and v(t) the neuron outputs at iteration t. Let S be the set of possible states of a neuron of Q and $\hat{N} = pN$.

Then there exists a BHN $\hat{Q}$ of $\hat{N}$ neurons and an injective function $F: S^N \to \{0,1\}^{\hat{N}}$ so that $\hat{v}(t) = F(v(t))$ for all t if $\hat{v}(0) = F(v(0))$.

Proof.

In order to prove the theorem, at least one specific BHN and a function F satisfying the above conditions must be found.

First, consider the following transformation. Divide each neuron of Q into p ordered binary neurons in such a way that if the original neuron is in state $s_n = n/p$, the first n binary neurons are in state 1 and the rest are in state 0. For example, if p = 4, the original neurons can be in any of the following states: $\{0, \tfrac{1}{4}, \tfrac{1}{2}, \tfrac{3}{4}, 1\}$. Now each neuron is divided into four binary neurons, and their states are selected according to this rule: $0 \to (0,0,0,0)$, $\tfrac{1}{4} \to (1,0,0,0)$, $\tfrac{1}{2} \to (1,1,0,0)$, $\tfrac{3}{4} \to (1,1,1,0)$, and $1 \to (1,1,1,1)$. Assume this transformation is the injective function F.
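A minimal sketch of this transformation, that is, of the injective function F (the implementation and names are ours):

```python
import numpy as np

def thermometer_encode(V, p):
    """Map each quantized output V_i = n/p to p ordered binary neurons: the
    first n are set to 1 and the remaining p - n to 0 (for p = 4,
    1/2 -> (1, 1, 0, 0))."""
    n = np.rint(np.asarray(V) * p).astype(int)               # state index n in {0, ..., p}
    return (np.arange(1, p + 1) <= n[:, None]).astype(int)   # shape (N, p)

def thermometer_decode(B, p):
    """Inverse mapping: count the 1s in each row and divide by p."""
    return B.sum(axis=1) / p
```

Decoding simply counts the ones in each row, so F is injective by construction.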

Since each neuron is divided into p binary neurons, the BHN $\hat{Q}$ has $\hat{N} = pN$ neurons. In order to simplify the nomenclature in this proof, and as a direct consequence of this transformation, rewrite $\hat{Q}$ as a 2D-BHN with $N \times p$ neurons, where the p neurons of row i are related to the ith neuron of Q. It is worth noting that 2D-HNNs do not differ from HNNs; they are just a way to order the neurons and group them according to their specific meaning. Therefore, if the initial states are the same, $\hat{v}(0) = F(v(0))$, and the dynamics of Q and $\hat{Q}$ are equivalent, then $\hat{v}(t) = F(v(t))$, $\forall t$. Thus, if at iteration t the ith neuron of Q increases, then the first neuron at state 0 of the ith row of $\hat{Q}$ must change to state 1. Also, if the ith neuron of Q decreases, the last neuron at state 1 of the ith row of $\hat{Q}$ must change to state 0. Consider the following dynamics:
formula
3.1
formula
3.2
formula
3.3
where $\hat{E}$ is the energy function of $\hat{Q}$ and $\hat{V}_{i,j}$ is the neuron in the ith row and jth column of $\hat{Q}$. The neurons (i, 0) and (i, p + 1) do not exist, and consequently, $\hat{V}_{i,0}$ and $\hat{V}_{i,p+1}$ are defined in equation 3.2. Moreover, A is a positive constant greater than the absolute value of any $\partial E(t)/\partial V_i$. From equation 3.1, if both $\hat{V}_{i,j-1}(t)$ and $\hat{V}_{i,j+1}(t)$ are in state 1, the energy gradient of the neuron (i, j) is negative, and, according to equation 1.2, $\hat{V}_{i,j}(t+1) = 1$. Similarly, if both $\hat{V}_{i,j-1}(t)$ and $\hat{V}_{i,j+1}(t)$ are in state 0, the energy gradient of the neuron (i, j) is positive, and in the next iteration, $\hat{V}_{i,j}(t+1) = 0$. If $\hat{V}_{i,j-1}(t) = 1$ and $\hat{V}_{i,j+1}(t) = 0$, then the energy gradient of the neuron (i, j) is exactly the energy gradient of the ith neuron of Q, $\partial E(t)/\partial V_i$. If $\partial E(t)/\partial V_i < 0$, then $V_i(t)$ increases in the next iteration. This fact is completely reflected in the ith row of the 2D-BHN since the first neuron in state 0 will change to state 1. Similarly, if $\partial E(t)/\partial V_i > 0$, then $V_i(t)$ decreases, and the last neuron of the ith row of the 2D-BHN in state 1 will change to state 0.
The pair $(\hat{T}, \hat{i})$ can be obtained by comparing equations 1.3 and 3.1. They are
formula
3.4
formula
3.5
Finally, from equations 3.4 and 3.5, $\hat{T}$ and $\hat{i}$ are
formula
3.6
formula
3.7
formula
3.8
where $\lceil x \rceil$ is the ceiling function, that is, the function that returns the smallest integer not less than x. Note that the neurons of the ith row of the 2D-BHN correspond to the neurons $p(i-1)+1$ to $pi$ of $\hat{Q}$.

As a result, the BHN defined in equations 3.6 and 3.7 and obtained from the original QHN Q satisfies theorem 1.
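As an illustrative sanity check of this construction (a sketch based only on the behavior described in this proof; the gradient expression, the choice of A, and all names are our reconstruction rather than the paper's exact equations 3.1 to 3.8):

```python
import numpy as np

def bhn2d_step(B, T, I, p):
    """One fully parallel step of the 2D-BHN described in the proof.
    B has shape (N, p); row i is the thermometer code of V_i.  The gradient of
    neuron (i, j) is taken as dE/dV_i + A * (1 - left - right), with the phantom
    neighbors V_hat_{i,0} = 1 and V_hat_{i,p+1} = 0, so that only the boundary
    between the run of 1s and the run of 0s in each row can flip (our
    reconstruction of the behavior described around equations 3.1 to 3.3)."""
    N = B.shape[0]
    V = B.sum(axis=1) / p                          # decoded QHN state
    g = -(T @ V + I)                               # dE/dV_i, as in equation 1.3
    A = np.abs(g).max() + 1.0                      # any constant larger than every |dE/dV_i|
    padded = np.hstack([np.ones((N, 1)), B, np.zeros((N, 1))])
    g_hat = g[:, None] + A * (1.0 - padded[:, :-2] - padded[:, 2:])
    return np.where(g_hat < 0, 1, np.where(g_hat > 0, 0, B))
```

Encoding a QHN state with the thermometer code from the previous sketch, applying bhn2d_step, and decoding it again should reproduce one fully parallel QHN step, which is the equivalence stated by theorem 1.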

Consequently, and from Bruck and Goodman (1988), if v(t) = v(t−1), then it can be concluded that the QHN has reached equilibrium. On the other hand, if $v(t) \neq v(t-1)$ and v(t) = v(t−2), the QHN is oscillating between v(t) and v(t−1).
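In practice, this result yields a simple stopping rule for PQHN simulations (a sketch under our notation; the step function passed in could be, for example, the hypothetical qhn_step from the section 1.1 sketch with T, I, and p bound to it):

```python
import numpy as np

def run_pqhn(step, V0, max_iter=10000):
    """Iterate a fully parallel network until a stable state (v(t) = v(t-1))
    or a cycle of length two (v(t) = v(t-2)) is detected, as theorem 1
    guarantees.  `step` maps the current state vector to the next one."""
    prev2, prev1 = None, np.asarray(V0, dtype=float)
    for t in range(1, max_iter + 1):
        curr = step(prev1)
        if np.allclose(curr, prev1):            # distinct quantized states differ by >= 1/p
            return curr, t, "stable state"
        if prev2 is not None and np.allclose(curr, prev2):
            return curr, t, "cycle of length two"   # oscillating between prev1 and curr
        prev2, prev1 = prev1, curr
    return prev1, max_iter, "iteration limit reached"
```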

4.  Minimization of the Energy Function

Section 3 showed that any QHN operating in a fully parallel mode reaches a stable state or a cycle of length two. Cycles are not as problematic as they may seem, as shown in section 2. The biggest problem is that QHNs do not always reduce the energy from one iteration to the next. Therefore, although a QHN reaches a stable state, the energy at that state could, in principle, be greater than the energy at the initial state: a stable state is always a local minimum of the energy, but not necessarily the global one. This section proves that the energy at any stable state is never greater than the energy of the initial state that led to it.

First, the next theorem reveals what happens in the QHN evolution when the energy increases:

Theorem 2. 
Let Q be a QHN. If the energy increases from iteration t to iteration t + 1, at least one component of the energy gradient changes sign:
$$E(t+1) - E(t) > 0 \tag{4.1}$$
$$\Longrightarrow \; \exists\, i \in \{1, \ldots, N\} : \frac{\partial E(t)}{\partial V_i}\,\frac{\partial E(t+1)}{\partial V_i} < 0. \tag{4.2}$$

Proof.
The increment of the energy is
$$\Delta E(t) = E(t+1) - E(t) = \frac{1}{2}\sum_{i=1}^{N} \Delta V_i(t)\left[\frac{\partial E(t)}{\partial V_i} + \frac{\partial E(t+1)}{\partial V_i}\right]. \tag{4.3}$$
If $\Delta E(t) > 0$, then at least one term of the summation of equation 4.3 must be positive. If the ith term is positive, then
$$\Delta V_i(t)\,\frac{\partial E(t+1)}{\partial V_i} > -\Delta V_i(t)\,\frac{\partial E(t)}{\partial V_i} \geq 0. \tag{4.4}$$

Thus, from equation 4.4, the energy gradient of neuron i changes sign.

Some additional conclusions can be drawn from the analysis of the proof of theorem 2. From equation 4.3, some neurons can be identified as guilty of the energy increase. As has been proven, all components of the energy gradient that correspond to the guilty neurons change sign. Thus, the QHN tends to correct the cause of the energy increment since, for those neurons, $\Delta V_i(t+1)$ opposes the new gradient and thus reverses the previous change (recall equation 1.2). This relevant result hints that the QHN has a good evolution despite sporadic increments of the energy. The following theorem states this fact formally:
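The decomposition of the energy increment used in this proof can be checked numerically (a self-contained sketch assuming the quadratic energy of equation 1.1 and our reconstructed form of equation 4.3; all names are hypothetical):

```python
import numpy as np

def check_energy_increment_identity(N=8, p=7, seed=0):
    """Numerically verify that E(t+1) - E(t) equals
    0.5 * sum_i dV_i * (dE/dV_i at t  +  dE/dV_i at t+1) for a symmetric T."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(N, N))
    T = (A + A.T) / 2                              # symmetric weight matrix
    I = rng.normal(size=N)
    energy = lambda v: -0.5 * v @ T @ v - I @ v    # equation 1.1
    grad = lambda v: -(T @ v + I)                  # equation 1.3
    V = rng.integers(0, p + 1, size=N) / p         # random quantized state
    g = grad(V)
    dV = np.zeros(N)                               # one fully parallel step (equation 1.2)
    dV[(g < 0) & (V < 1)] = 1.0 / p
    dV[(g > 0) & (V > 0)] = -1.0 / p
    V_next = V + dV
    lhs = energy(V_next) - energy(V)
    rhs = 0.5 * np.sum(dV * (grad(V) + grad(V_next)))
    return np.isclose(lhs, rhs)
```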

Theorem 3. 

Let Q be a QHN. Then if Q reaches a stable state, the energy at this state is less than or equal to the energy of any previous state.

Proof.
Let us define $t_e$ as the iteration at which Q reaches the stable state, that is, $v(t) = v(t_e)$ for all $t \geq t_e$, and $t_a < t_e$ as any previous iteration. Similarly to equation 4.3, the increment of the energy from $t_a$ to $t_e$ is
$$E(t_e) - E(t_a) = \frac{1}{2}\sum_{i=1}^{N}\left[\Delta V_i(t_a)\frac{\partial E(t_a)}{\partial V_i} + \sum_{t=t_a+1}^{t_e-1}\big(\Delta V_i(t-1) + \Delta V_i(t)\big)\frac{\partial E(t)}{\partial V_i} + \Delta V_i(t_e-1)\frac{\partial E(t_e)}{\partial V_i}\right]. \tag{4.5}$$

Therefore, the energy increment can be divided into N terms—one per neuron. Each term is further split into products of neuron variations and energy gradients. Now this proof will focus on each one of these products to show that all of them are nonpositive, which proves the validity of the theorem.

The first product, $\Delta V_i(t_a)\,\partial E(t_a)/\partial V_i$, is always nonpositive due to equation 1.2. For the last product, $\Delta V_i(t_e-1)\,\partial E(t_e)/\partial V_i$, two different cases can be identified. The first case is when the equilibrium is reached with $\partial E(t_e)/\partial V_i = 0$ or with $\Delta V_i(t_e-1) = 0$. Then the last product is, obviously, nonpositive. If $\partial E(t_e)/\partial V_i \neq 0$, that means that the ith neuron is at one of the hypercube extremes, that is, $V_i(t_e) = 0$ or $V_i(t_e) = 1$. Moreover, if $\Delta V_i(t_e-1) \neq 0$, the ith neuron has reached the extreme exactly at iteration $t_e$. Then $V_i(t_e) = 0$ and $\Delta V_i(t_e-1) = -1/p$, or $V_i(t_e) = 1$ and $\Delta V_i(t_e-1) = +1/p$. In both cases, $\Delta V_i(t_e-1)$ and $\partial E(t_e)/\partial V_i$ must have opposite signs; if not, $\Delta V_i(t_e)$ would not be zero, and Q would not be at the stable state.

For the rest of the products, three different cases must be studied. The first case is when $\Delta V_i(t-1)$ and $\Delta V_i(t)$ have the same sign, or $\Delta V_i(t-1) = 0$; the second is when $\Delta V_i(t-1)$ and $\Delta V_i(t)$ have different signs, and hence their sum is zero; and the third is when $\Delta V_i(t) = 0$. Since $\Delta V_i(t)$ and $\partial E(t)/\partial V_i$ have opposite signs, all the products of the first case are nonpositive. For the second case, $\Delta V_i(t-1) + \Delta V_i(t) = 0$, and thus these products are also nonpositive. Finally, for the last case, if $\Delta V_i(t) = 0$ because $\partial E(t)/\partial V_i = 0$, then the products are nonpositive. Second, if $\partial E(t)/\partial V_i \neq 0$, then the ith neuron is at one of the extremes at iteration t, and $\Delta V_i(t-1)$ and $\partial E(t)/\partial V_i$ must have opposite signs; if not, $\Delta V_i(t)$ would not be zero, as demonstrated for the last product. Table 1 summarizes all possible combinations of this second term.

Table 1: Possible Combinations of the Second Term of Equation 4.5.

[The table lists, for each combination of the signs of $\Delta V_i(t-1)$, $\Delta V_i(t)$, and $\partial E(t)/\partial V_i$, the resulting sign of $\big(\Delta V_i(t-1) + \Delta V_i(t)\big)\,\partial E(t)/\partial V_i$ and the corresponding case of the proof; combinations ruled out by equation 1.2 are marked as impossible, and every feasible combination yields a nonpositive product.]

Moreover, from this proof, theorem 3 can be generalized to any iteration t that satisfies the following condition:
$$\Delta V_i(t-1)\,\frac{\partial E(t)}{\partial V_i} \leq 0, \qquad \forall\, i \in \{1, \ldots, N\}. \tag{4.6}$$

5.  Conclusion

This letter has extended previous studies, analyzing the convergence and stability of QHNs operating in a fully parallel mode. This analysis is interesting because these neural networks are easily implementable and take advantage of the original parallelism of HNNs.

Moreover, this letter has proved that QHNs operating in fully parallel mode always converge to a stable state or to a cycle of length two. In addition, cycles are not very problematic, as shown in section 2, and this type of network requires many fewer iterations to find a good solution of the MQP compared with CHNs and other QHNs operating in serial mode.

Finally, although the energy does not always decrease from one iteration to the next, the QHN dynamics always tend to decrease the energy, obtaining a stable state with less energy than the initial state of the neural network.

As future work, a deep analysis of cycles would be very interesting. Although the results of section 2 show that cycles do not damage the performance of QHNs for the MQP, they could have severe consequences in other applications. Additionally, a step forward after this letter is the implementation—or an implementation study—of QHNs in digital devices, like FPGAs.

Appendix

This appendix presents some technical details about the simulations performed in section 2. The energy function used is that of Le and Pham (2005):
formula
A.1
with A = B = 1000. All the elements of the main diagonal of T are exactly −2A and are therefore negative, so the nonnegativity condition on the diagonal is not satisfied. The CHNs were not implemented with the analog circuit but simulated on a computer. The evolution of the analog circuit can be characterized by the following differential equation:
$$C_i\,\frac{du_i(t)}{dt} = -\frac{u_i(t)}{R_i} + \sum_{j=1}^{N} T_{ij}\, V_j(t) + I_i, \qquad V_i(t) = g\big(u_i(t)\big), \tag{A.2}$$
where $C_i$ and $R_i$ are constants that depend on the elements of the analog circuit and $g(\cdot)$ is the neuron activation function. This evolution was simulated using Euler's technique:
$$u_i(t+1) = u_i(t) + \frac{\Delta t}{C_i}\left[-\frac{u_i(t)}{R_i} + \sum_{j=1}^{N} T_{ij}\, V_j(t) + I_i\right], \tag{A.3}$$
where the value of $\Delta t$ was selected by a trial-and-error procedure. Lower values make the CHNs require more iterations to converge, whereas greater values make CHNs diverge. Note that the time variable t in equation A.2 differs from that in the rest of the letter, including equation A.3. In equation A.2, t is continuous, since this equation corresponds to the actual evolution of CHNs, which are continuous in time. In the rest of the letter, t is an integer that represents the current iteration.
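A minimal sketch of this simulation procedure (our own code, assuming the standard analog HNN dynamics of Hopfield, 1984, with a sigmoid activation; all constants and defaults are illustrative placeholders rather than the values used in the letter):

```python
import numpy as np

def simulate_chn(T, I, u0, dt=1e-3, tau=1.0, gain=50.0, n_iter=10000):
    """Euler integration of analog Hopfield dynamics of the form
    du/dt = -u / tau + T v + I, with graded outputs v = g(u) in [0, 1]."""
    u = np.asarray(u0, dtype=float)
    g = lambda x: 0.5 * (1.0 + np.tanh(gain * x))   # sigmoid activation
    for _ in range(n_iter):
        u = u + dt * (-u / tau + T @ g(u) + I)      # one Euler step
    return g(u)
```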

Acknowledgments

Part of this work has been performed in the framework of the CELTIC project CP5-013 ICARUS. This work was partially supported by the Spanish Ministry of Industry, Tourism and Trade and the FEDER program of the European Commission under the project TSI-020400-2008-113 and the Universidad Politécnica de Valencia (PAID-06-08/3301).

Notes

1

Further details on the physical meaning of the HNN parameters can be found in Hopfield (1984).

2

Although variables in the original problem are binary, neurons characterizing those variables should have more than two states to improve the outcomes.

3

Note that PQHN may converge to cycles. In that case, the final state is one of the two states of the cycle.

References

Ahn, C. W., & Ramakrishna, R. S. (2004). QoS provisioning dynamic connection-admission control for multimedia wireless networks using a Hopfield neural network. IEEE Transactions on Vehicular Technology, 53, 106–117.

Bousoño-Calzón, C., & Salcedo-Sanz, S. (2004). A discrete-time quantized-state Hopfield neural network. Annals of Mathematics and Artificial Intelligence, 42, 345–367.

Bruck, J., & Goodman, J. W. (1988). A generalized convergence theorem for neural networks. IEEE Transactions on Information Theory, 34, 1089–1092.

Calabuig, D., Monserrat, J., Gómez-Barquero, D., & Lázaro, O. (2006). User bandwidth usage-driven HNN neuron excitation method for maximum resource utilization within packet-switched communication networks. IEEE Communications Letters, 10, 766–768.

Calabuig, D., Monserrat, J. F., Gómez-Barquero, D., & Cardona, N. (2008). A delay-centric dynamic resource allocation algorithm for wireless communication systems based on HNN. IEEE Transactions on Vehicular Technology, 57, 3653–3665.

Forti, M., Manetti, S., & Marini, M. (1992). A condition for global convergence of a class of symmetric neural circuits. IEEE Transactions on Circuits and Systems I, 39, 480–483.

Haykin, S. (1999). Neural networks: A comprehensive foundation. Upper Saddle River, NJ: Prentice Hall.

Hopfield, J. J. (1984). Neurons with graded response have collective computational properties like those of two-state neurons. Proceedings of the National Academy of Sciences, 81, 3088–3092.

Hopfield, J. J., & Tank, D. W. (1985). "Neural" computation of decisions in optimization problems. Biological Cybernetics, 52, 141–152.

Joya, G., Atencia, M. A., & Sandoval, F. (2002). Hopfield neural networks for optimization: Study of the different dynamics. Neurocomputing, 43, 219–237.

Lázaro, O., & Girma, D. (2000). A Hopfield neural-network-based dynamic channel allocation with handoff channel reservation control. IEEE Transactions on Vehicular Technology, 49, 1578–1587.

Le, T. N., & Pham, C. K. (2005). A new N-parallel updating method of the Hopfield-type neural network for N-queens problem. In Proceedings of the IEEE International Joint Conference on Neural Networks (Vol. 2, pp. 788–791). Piscataway, NJ: IEEE Press.

Matsuda, S. (1999a). Quantized Hopfield networks for integer programming. Systems and Computers in Japan, 30, 1–12.

Matsuda, S. (1999b). Theoretical analysis of quantized Hopfield network for integer programming. In Proceedings of the IEEE International Joint Conference on Neural Networks (Vol. 1, pp. 568–571). Piscataway, NJ: IEEE Press.

Moon, T. K., & Stirling, W. C. (2000). Mathematical methods and algorithms for signal processing. Englewood Cliffs, NJ: Prentice Hall.

Wilson, G. V., & Pawley, G. S. (1988). On the stability of the travelling salesman problem algorithm of Hopfield and Tank. Biological Cybernetics, 58, 63–70.