## Abstract

The idea that there is an *edge of chaos*, a region in the space of dynamical systems having special meaning for complex living entities, has a long history in artificial life. The significance of this region was first emphasized in cellular automata models when a single simple measure, λ_{CA}, identified it as a transitional region between order and chaos. Here we introduce a parameter λ_{NN} that is inspired by λ_{CA} but is defined for recurrent neural networks. We show through a series of systematic computational experiments that λ_{NN} generally orders the dynamical behaviors of randomly connected/weighted recurrent neural networks in the same way that λ_{CA} does for cellular automata. By extending this ordering to larger values of λ_{NN} than has typically been done with λ_{CA} and cellular automata, we find that a second edge-of-chaos region exists on the opposite side of the chaotic region. These basic results are found to hold under different assumptions about network connectivity, but vary substantially in their details. The results show that the basic concept underlying the lambda parameter can usefully be extended to other types of complex dynamical systems than just cellular automata.

## 1 Introduction

The notion that there is an *edge of chaos*, a transitional region between order and disorder in the space of dynamical systems that has special meaning for complex living entities, has a long history in the field of artificial life. During the 1980s, extensive computational experiments and theoretical analysis established that the dynamics of cellular automata models fall into four broad classes [23]. These classes can be characterized by the behavior of a cellular space when started from an initial random configuration (state):

- I. *uniformly quiescent*: all cells quickly become quiescent, with all activity in the space dying out;
- II. *fixed or periodic activity*: local activity patterns that either stabilize (fixed-point attractors) or exhibit oscillatory behavior having a relatively short period (limit cycles);
- III. *chaotic activity*: widespread random-appearing activity that persists indefinitely, exhibiting sensitivity to initial conditions and unpredictability; and
- IV. *complex activity*: persistent complex activity patterns that are often localized but persistent and sometimes propagate across the cellular space.

Subsequent work established a simple single measure λ_{CA} that correlates with the dynamical behavior and Wolfram class of cellular automata [11].^{1} The lambda value is defined as the fraction of rules in a cell's transition function that result in the cell being assigned a non-quiescent state at the next time step. Thus, λ_{CA} always lies in the real interval [0.0, 1.0], with 0 indicating that all transition rules lead to the quiescent cell state, and 1 that all rules lead to a non-quiescent state. Of central interest here is that lambda values roughly order the four classes of cellular automata as illustrated in Figure 1, giving a clear structure to the space of cellular automata transition functions. From this perspective, the class IV cellular automata models supporting complex localized activity patterns most reminiscent of living systems can be viewed as a phase transition—the *edge of chaos*—between simple, fixed-point or limit-cycle configurations (order) and widespread chaotic configurations or disorder [11].

More recently, the significance of the edge-of-chaos regime has been further investigated, explicitly or implicitly, in other types of non-cellular-automata computational models. For example, examination of models inspired by ant colonies has revealed a variety of dynamical regimes as an activity gain parameter and an ant density parameter are varied, finding that the parameter values defining the border region between ordered and chaotic dynamics indicate where significant computational properties can emerge [22]. Other studies have determined that Boolean networks originally inspired by gene regulatory networks have an edge-of-chaos regime that appears when the average incoming connectivity has a critical value of 2, and have explored its relationship to information propagation and adaptation [6, 9].

Most recently, an increasing amount of interest has focused on investigating the edge of chaos in randomly connected recurrent neural networks. Similar to cellular automata and random Boolean networks, recurrent neural networks can exhibit fixed-point, limit cycle, and chaotic behaviors. While the occurrence of different dynamic behaviors and the phase transitions between them as network parameters change has long been of interest [3, 7, 16, 19], this interest has increased dramatically over the last few years, in large part due to advances in reservoir computing methods for processing time series data [8, 14]. In this paradigm, a randomly connected recurrent neural network—the reservoir—is used as a “hidden layer” that is driven by a temporal sequence of input signals, and that in turn drives the network's adaptive outputs. Such networks have been found to excel at learning to process time series data during prediction and classification tasks while only requiring learning on reservoir-to-output connections. Several studies have now shown that operating randomly connected or weighted recurrent neural networks in the edge-of-chaos regime typically improves the network's computational properties and ability to process information effectively [1, 2, 12, 15, 21]. While details vary, these results have proven to be true for networks composed of analogue rate-encoding neurons, binary-valued threshold neurons, and spiking neurons. Thus there is now a substantial body of work showing that the edge of chaos, originally studied in cellular automata and other aspects of artificial life, is an important operating regime for information processing in recurrent neural networks. 
However, the parameter spaces examined have generally been multidimensional, where the individual dimensions (parameters) used include the number of incoming connections, the variance of weights, the fraction of excitatory connections, the mean node bias values, the spatial scale of local connectivity, and weight magnitude gain factors. None of this past work has examined whether there is a single simple measure for neural networks that is analogous to Langton's λ_{CA} for cellular automata.

In this context, here we introduce a parameter λ_{NN} that is explicitly inspired by and analogous to λ_{CA} but is defined for recurrent neural networks. Our intent in doing this is not just to document the different regions in the space of neural networks that we consider, but also to explore whether a single measure like this can help identify critical regions that will yield maximum computational performance when randomly connected neural networks are used during information-processing tasks, such as with reservoir networks. We show through a series of systematic computational experiments that λ_{NN} generally orders the dynamical behaviors of randomly connected/weighted recurrent neural networks composed of linear threshold units in the same way that λ_{CA} does for cellular automata (Figure 1). Further, by extending this ordering to larger values of λ_{NN} than has typically been done with λ_{CA} and cellular automata (Figure 1, horizontal axis), we find that a second edge of chaos exists on the opposite side of the chaotic region that corresponds to larger λ_{NN} values. These basic results are found to hold under different assumptions about network connectivity, such as using fully connected versus sparsely connected networks, but vary substantially in their details.

## 2 Methods

We first describe the neural networks and experimental methods used in this work, and then define the measure λ_{NN} used to order the space of neural networks.

### 2.1 Network Description

We examine computationally the long-term dynamical behavior of a variety of recurrent neural networks having randomly connected/weighted connectivity between *N* nodes. Four different types of connectivity were used in different computational experiments: full connectivity (every node is connected bidirectionally to every other node), random partial connectivity (every node receives *k* < *N* input connections from other randomly chosen nodes), local connectivity (every node receives *k* < *N* input connections from its *k* nearest neighbor nodes), and local-plus-random connectivity (every node receives *k* < *N* input connections from other randomly chosen nodes and also from its *k* < *N* nearest neighbor nodes). We include the study of networks with random partial connectivity because they can be related to the sparsely connected networks used in some historically significant random neural network investigations [3, 7] and they are also reminiscent of the sparse connectivity often used in contemporary reservoir computing. We also study networks with local connectivity, both with and without associated random “long range” connections, because this is reminiscent of the connectivity in the cerebral cortex and several past brain-inspired cortical models. Such local connectivity is also most similar to what occurs in cellular automata, where a cell's transition function is based on local neighborhoods [11, 20, 23].

Network weights in general are a mixture of randomly assigned excitatory and inhibitory values, but in any given simulation are uniform in the sense that all excitatory weights have the same positive value and all inhibitory weights have the same negative value. Two parameters *fp* and *wp* determine how weights are assigned. Whether each weight is excitatory or inhibitory is randomly and independently determined so that a connection is excitatory with probability *fp*, and inhibitory with a probability 1.0 − *fp*. Thus *fp* represents the expected value of the *f*raction of connections in a network that are *p*ositive (excitatory). Each connection randomly selected to be inhibitory has a weight of −1.0, while each excitatory connection in a given simulation has a weight *wp*, where *wp* can take on positive values smaller than, equal to, or larger in magnitude than 1.0 in different simulations in order to adjust the relative amount of excitation and inhibition in any given network. In the simulations involving networks having local connections, the *N* neurons in the network are viewed as forming a one-dimensional lattice having periodic boundary conditions where each neuron has *k*/2 connections to the closest neighbor nodes on each side. In this situation, the excitatory connections, whose number was determined stochastically based on *fp*, are allotted to the closest neighbors, and the more distal neighbors are allotted the remaining inhibitory connections. This arrangement is inspired by the *Mexican hat* pattern of lateral connectivity observed in the mammalian cerebral cortex and often used in computational models of the cortex.
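
The weight assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `random_partial_weights` and `local_weights` are hypothetical, the choice of distinct input sources with no self-connections is an assumption, and the ordering of neighbors by distance in the local case is one plausible reading of the Mexican hat arrangement.

```python
import random

def random_partial_weights(n, k, fp, wp, rng):
    """Return an n x n matrix w where w[i][j] is the weight from node j to node i.

    Each node receives k inputs from distinct, randomly chosen other nodes
    (assumption: no self-connections, no duplicate sources). A connection is
    excitatory (weight wp) with probability fp, otherwise inhibitory (-1.0).
    Absent connections keep weight 0.0.
    """
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for j in rng.sample(others, k):
            w[i][j] = wp if rng.random() < fp else -1.0
    return w

def local_weights(n, k, fp, wp, rng):
    """Ring lattice with periodic boundaries: k/2 input neighbors on each side.

    The number of excitatory inputs is drawn stochastically from fp; these go
    to the nearest neighbors, and the more distal neighbors are inhibitory.
    """
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Neighbors ordered by distance: +1, -1, +2, -2, ...
        neighbors = []
        for d in range(1, k // 2 + 1):
            neighbors.append((i + d) % n)
            neighbors.append((i - d) % n)
        n_exc = sum(1 for _ in range(k) if rng.random() < fp)
        for rank, j in enumerate(neighbors):
            w[i][j] = wp if rank < n_exc else -1.0
    return w
```

Passing an explicit `random.Random(seed)` instance as `rng` makes it straightforward to give each simulation its own random number stream.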

Each node *i* has an activity level designated *a*_{i} of either 0 (off) or 1 (on). The input *in*_{i} to node *i* at any point in time *t* is given by

$$in_i(t) = \sum_{j} w_{ij} \, a_j(t) + b_i$$

where *w*_{ij} is the weight on the connection from another node *j* to node *i*, and *b*_{i} is a fixed bias value. The activation level of each node *i* at time step *t* + 1 is then defined as *a*_{i}(*t* + 1) = 0 if *in*_{i}(*t*) < 0, and *a*_{i}(*t* + 1) = 1 otherwise; that is, a threshold of zero is always used. We use a very small, fixed bias *b*_{i} = −0.0001 in all simulations so that nodes will be off if they have a zero or negligible input.
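
As an illustration of this update rule, the following minimal sketch computes one time step for all nodes at once. It assumes that each node's input is the weighted sum of the current activities of its source nodes plus the bias, and that all nodes update synchronously; the function name `step` is hypothetical.

```python
def step(w, a, bias=-0.0001):
    """Return the next activity vector for a network of linear threshold units.

    w[i][j] is the weight from node j to node i (0.0 if no connection),
    a is the current list of 0/1 activities, and bias is the fixed b_i.
    A node turns off if its input is negative and on otherwise.
    """
    n = len(a)
    new_a = []
    for i in range(n):
        net = sum(w[i][j] * a[j] for j in range(n)) + bias
        new_a.append(0 if net < 0 else 1)
    return new_a
```

Because the bias is slightly negative, a node whose weighted input is exactly zero ends up off, which is what allows activity to die out completely.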

### 2.2 Experimental Methods

We conducted a systematic series of computational experiments where, in each single experiment, one of the four types of connectivity (full, random partial, local, or local-plus-random) and one set of specific *fp* and *wp* parameter values are used. Across different experiments, the parameter *fp*, which determines the probability that a connection is positive or negative, is systematically varied from 0.0 to 1.0 in steps of 0.025. For each *fp* value, the ratio of positive to negative weight magnitudes *wp* is also systematically varied from 0.0 to 2.0 in steps of 0.05; the negative weights are always equal to −1.0, as we are primarily interested in the *relative* amounts of excitation and inhibition in a network. In any single simulation, the activation value *a*_{i} is randomly initialized to 1 or 0 with equal probability, so the expected number of on nodes initially is *N*/2 (using other initial states might sometimes produce different results). Because of this and the random nature of connection and weight assignments during network construction, each computational experiment (i.e., each specific combination of type of connectivity, *fp* value, and *wp* value) is run 50 times with different random number streams for each simulation. All simulations are generally run for a maximum of *t*_{max} = 1000 time steps using *N* = 100 neurons in each simulation. If a network's activity during a simulation reaches a fixed point or limit cycle, this is automatically recorded and the simulation is terminated early at that point solely for computational efficiency. When networks have partial or local connectivity, we always use *k* = 10 connections per node.^{2}

Every time a single neural network is run, generally up through 1000 time steps, its final attractor state is determined. If, by the final time step, all of the neurons are off, the network's activity is classified as *extinguished* (corresponds to Wolfram class I). If all of the neurons are turned on, the network's activity is classified as *saturated* (also a uniform outcome reminiscent of class I behavior). If, during a run, the network's activity state no longer changes from time step to time step, it is considered to have reached a *fixed point* where its activity has been neither extinguished nor saturated (class II). If the network's activity state repeats (i.e., it is cycling through a sequence of states), it is classified as having reached a *limit cycle* (class II). If the network's dynamics did not match any of these types of attractors by time step *t*_{max} = 1000, it is classified as exhibiting a *chaotic* dynamics (class III).^{3} To approximately identify the edge-of-chaos regions (class IV) in our results, we generally designate a particular connectivity-*fp-wp* combination as being part of an edge-of-chaos region if (i) it lies in between regions where activity quickly reaches a fixed point or limit cycle (class II) in all 50 runs and regions where activity never does (class III) during all 50 runs; (ii) it takes a long time for transient complex activity patterns to reach a fixed point or limit cycle (but not longer than *t*_{max}); and (iii) among the 50 simulation runs for this specific connectivity-*fp-wp* combination, some runs reach a fixed point or limit cycle by *t*_{max}, while some do not and thus are classified as chaotic. As will be seen, these criteria generally lead to well-defined contiguous regions in the space of neural networks that we examine.
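
The classification of a single run can be sketched as follows. This is a hedged illustration rather than the procedure actually used: it assumes synchronous updates, detects a limit cycle by remembering every state visited so far, and stops as soon as an attractor is found. The name `classify_run` and the returned labels are hypothetical.

```python
def classify_run(w, a0, t_max=1000, bias=-0.0001):
    """Run one network and return (label, time step at which it was detected).

    w[i][j] is the weight from node j to node i; a0 is the initial 0/1 state.
    """
    n = len(a0)

    def step(a):
        # Zero-threshold update with a small negative bias.
        return [0 if sum(w[i][j] * a[j] for j in range(n)) + bias < 0 else 1
                for i in range(n)]

    a = list(a0)
    seen = {tuple(a)}
    for t in range(1, t_max + 1):
        nxt = step(a)
        if nxt == a:
            # The state no longer changes: a uniform state or another fixed point.
            if not any(nxt):
                return "extinguished", t
            if all(nxt):
                return "saturated", t
            return "fixed point", t
        if tuple(nxt) in seen:
            # A previously visited state recurs, so the dynamics is periodic.
            return "limit cycle", t
        seen.add(tuple(nxt))
        a = nxt
    # No attractor found within t_max steps.
    return "chaotic", t_max
```

Since the network is deterministic and has finitely many states, "chaotic" here only means that no repeat was observed within `t_max` steps.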

### 2.3 Defining a Lambda Parameter for Recurrent Neural Networks

How can we define a lambda parameter λ_{NN} for neural networks that is analogous to λ_{CA} for cellular automata? The intent of the original measure λ_{CA} was to provide a single parameter that would naturally order the space of cellular automata transition functions (rule sets), separating this space into regions that have similar dynamics, and in particular identifying the conditions under which one might expect a complex dynamics to emerge [11]. As noted earlier, defining λ_{CA} to be the fraction of rules (transitions) that lead to a non-quiescent cell state was found to be a remarkably simple measure that, while not perfect, largely accomplished this task for 0 ≤ λ_{CA} ≤ 1 − 1/*s*, where *s* is the number of possible individual cell states. The task here is to define an analogous simple measure λ_{NN} for recurrent neural networks. Doing so is nontrivial in the sense that, unlike the individual cells in most cellular automata models, nodes in randomly connected/weighted neural networks like those we consider here each have a different local transition function: In some cases (due to the random connectivity), local neighborhoods do not vary or overlap in a systematic way between adjacent nodes as they do in cellular automata, and even if this is not the case, the fraction of each node's input connections that are excitatory versus inhibitory can also vary due to the random assignment of excitatory versus inhibitory weights, even for fully connected networks where all nodes have the same number of incoming connections from all other nodes. Thus, the neural networks that we study are closer to more general discrete dynamical networks such as random Boolean networks than they are to typical cellular automata.

To define λ_{NN}, for any given connectivity constraint (full, random partial, etc.), consider a 2D space of neural networks whose dimensions are *fp*, the probability that a randomly selected connection is excitatory rather than inhibitory, and *wp*, the weight value assigned to excitatory connections. We hypothesize that a simple measure λ_{NN} based on the relative amounts of excitation and inhibition in a neural network can provide an analogous ordering of neural networks in this space to that provided by λ_{CA} for cellular automata. Specifically, let

$$S(\mathit{fp}, \mathit{wp}) = \sum_{i} \sum_{j} w_{ij}$$

be the sum of all weights, both excitatory and inhibitory, on the connections between nodes in a recurrent network that is based on specific *fp* and *wp* values. We assume that *w*_{ij} = 0 whenever there is no connection from node *j* to node *i*, as occurs in partially connected networks. The sum *S* can range from large negative to large positive values, but for finite network size *N* and finite weights, it is bounded by minimum and maximum values that we will designate *S*_{min} and *S*_{max}, respectively. Then for an individual recurrent neural network having specific *fp* and *wp* values, we define

$$\lambda_{NN} = \frac{S(\mathit{fp}, \mathit{wp}) - S_{\mathrm{min}}}{S_{\mathrm{max}} - S_{\mathrm{min}}}$$

as our candidate for λ_{NN}. Like λ_{CA}, the value of λ_{NN} lies in the interval [0.0, 1.0] and dictates a partial ordering on the space of recurrent neural networks.

This definition of λ_{NN} is reminiscent of λ_{CA} in that λ_{NN} roughly measures the fraction of local transitions that would lead a node to be non-quiescent (i.e., to have a value *a*_{i} = 1) rather than to be quiescent (*a*_{i} = 0). This is because the linear threshold neurons as defined above determine their activation state to be non-quiescent iff their local excitatory input exceeds their local inhibitory input, which is in turn governed on average by the relative amounts of excitatory and inhibitory weights in the network. Further, we can be more specific about λ_{NN} for the types of networks considered here, regardless of which of the four types of connectivity (full, random partial, etc.) are involved, by giving more concrete characterizations of *S*_{min} and *S*_{max} for the range of *fp* and *wp* values used. Specifically, *S* will take on the value *S*_{min} when all connections are inhibitory (*fp* = 0.0) and all excitatory weights are zero (*wp* = 0.0), so *S*_{min} = *S*(0, 0). In contrast, *S*_{max} will occur when all weights are excitatory (*fp* = 1.0) and these weights all take on their maximum value (*wp* = 2.0 for the range of weights we consider), so *S*_{max} = *S*(1, 2). Thus, we can rewrite our definition of λ_{NN} as

$$\lambda_{NN} = \frac{S(\mathit{fp}, \mathit{wp}) - S(0, 0)}{S(1, 2) - S(0, 0)}$$

for the four types of network architecture and range of *fp* and *wp* values that are examined in the following computational experiments.
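
To make the definition concrete, the sketch below computes λ_{NN} for a given weight matrix. It assumes that *S*(0, 0) equals minus the number of connections (every connection inhibitory with weight −1.0) and that *S*(1, 2) equals twice the number of connections (every connection excitatory with weight 2.0). The names `weight_sum`, `lambda_nn`, `n_connections`, and `wp_max` are hypothetical.

```python
def weight_sum(w):
    """S: the sum of all weights, with absent connections stored as 0.0."""
    return sum(sum(row) for row in w)

def lambda_nn(w, n_connections, wp_max=2.0):
    """Return (S - S_min) / (S_max - S_min) for weight matrix w.

    n_connections is the total number of connections in the network,
    for example N * k when every node has exactly k inputs.
    """
    s = weight_sum(w)
    s_min = -1.0 * n_connections    # S(0, 0): all connections inhibitory
    s_max = wp_max * n_connections  # S(1, 2): all excitatory at maximum weight
    return (s - s_min) / (s_max - s_min)
```

For example, a network with four connections, two of weight 1.0 and two of weight −1.0, has *S* = 0 and therefore λ_{NN} = 4/12, or about 0.33.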

## 3 Results

Our main results are that variations in λ_{NN} values order regions of varying dynamics in the space of neural networks in a fashion analogous to those observed with λ_{CA} in cellular automata. Further, edge-of-chaos regions can often be found, although their details can vary substantially depending on the type of connectivity that is present. We also describe the results of examining how a network's Wolfram class correlates with its ability to serve successfully as a reservoir in a very simple time series learning task.

### 3.1 Lambda Behavior

Figure 2a shows λ_{NN} values, multiplied by 100 and rounded for display purposes, for the *fp*-by-*wp* space of neural networks described above when random partial connectivity (*k* = 10) is being used. For example, when *fp* = 0.3 and *wp* = 0.3, then λ_{NN} = 0.13. As can be seen, λ_{NN} monotonically increases as one moves roughly diagonally from the upper left corner of this figure, where λ_{NN} is 0.0, to the lower right corner, where λ_{NN} is 1.0. Figure 2b shows a contour plot (generated using MATLAB) of these same λ_{NN} values that demonstrates this more clearly. Any straight or mildly concave or convex line running from the upper left corner to the lower right corner that is roughly perpendicular to each of the contours as it crosses them provides a path of gradually increasing λ_{NN} values.

Thus, as it was designed to do, λ_{NN} provides a single simple measure that imposes a partial ordering on the neural networks represented by each point in this *fp*-by-*wp* space. Neural networks gradually transition from those dominated by inhibitory connections (upper left corner in both parts of Figure 2) to those dominated by excitatory influences (lower right corner), passing through intermediate regions where excitation and inhibition are roughly balanced (e.g., when *fp* = 0.5 and *wp* = 1.0, producing λ_{NN} = 0.33). While Figure 2 illustrates this for network architectures having partial random connectivity, qualitatively similar results are obtained for other network architectures (fully connected, locally connected, or locally-plus-randomly connected), but are not shown here for brevity.
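
The two λ_{NN} values quoted in this section can be checked by hand. The following is a sketch under an assumption that the text does not state explicitly: that the tabulated λ_{NN} is based on the expected weight sum of a network with *C* connections in total (*C* = *Nk* when every node has *k* inputs), each of which is excitatory with probability *fp*.

$$
\begin{aligned}
E[S(\mathit{fp}, \mathit{wp})] &= C\,[\mathit{fp}\cdot\mathit{wp} - (1 - \mathit{fp})], \qquad S(0, 0) = -C, \qquad S(1, 2) = 2C, \\
\lambda_{NN} &\approx \frac{C\,[\mathit{fp}\cdot\mathit{wp} - (1 - \mathit{fp})] + C}{2C + C} = \frac{\mathit{fp}\,(1 + \mathit{wp})}{3}.
\end{aligned}
$$

With *fp* = 0.3 and *wp* = 0.3 this gives 0.3 × 1.3 / 3 = 0.13, and with *fp* = 0.5 and *wp* = 1.0 it gives 0.5 × 2.0 / 3 ≈ 0.33, matching the values above. Under the same assumption the expression does not depend on *N* or *k*, and it increases with both *fp* and *wp*.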

### 3.2 Characterizing Network Dynamics

We systematically characterized the dynamics of randomly connected/weighted neural networks represented by points throughout this *fp-wp* space for each of the four types of network connectivity (full, random partial, local, or local-plus-random). Given the quantization of the *fp* and *wp* scales that we used, and running 50 simulations with each having different random number streams for each *fp-wp* value pair, this represents a total of 336,200 independent network simulations (4 × 41 × 41 × 50).
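
The run count can be reproduced by enumerating the parameter grid. The sketch below is illustrative only; the helper `parameter_grid` and the variable names are hypothetical, and the grid is built from integer indices to avoid accumulating floating-point error.

```python
def parameter_grid(start, stop, step):
    """Return an inclusive, evenly spaced grid from start to stop."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(count)]

fp_values = parameter_grid(0.0, 1.0, 0.025)  # 41 values
wp_values = parameter_grid(0.0, 2.0, 0.05)   # 41 values
connectivities = ["full", "random partial", "local", "local-plus-random"]
runs_per_point = 50

total_runs = (len(connectivities) * len(fp_values)
              * len(wp_values) * runs_per_point)
```

Here `total_runs` evaluates to 4 × 41 × 41 × 50 = 336,200.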

Figure 3 shows the results obtained when fully connected networks are studied in this fashion. Figure 3a plots the network dynamics obtained as *fp* and *wp* are systematically varied, showing the different regions that are observed by labeling each entry with the result seen most frequently during the 50 runs at that point in *fp-wp* space. As one moves progressively from the upper left to the lower right (i.e., from λ_{NN} = 0 to λ_{NN} = 1), one sequentially encounters a region where all activity dies out (labeled with o's), a region where complex/chaotic dynamics dominates (#'s), regions where limit cycles (c's) and fixed-point attractors (f's) predominate, and finally a region where the network is saturated (1's). Comparing this with how λ_{NN} varies in Figure 2, it is seen that, just as occurs with λ_{CA} and cellular automata, λ_{NN} provides a single measure that roughly indicates how regions with differing dynamics occur in these randomly weighted neural networks.

Figure 3b makes the edge-of-chaos regions more evident (labeled with +). In Figure 3a, each label indicates the most frequent dynamics that occurs at its specific *fp* and *wp* values out of all 50 runs for that point in *fp-wp* space. In contrast, here in Figure 3b the same symbols o, #, and 1 label locations where for *all* 50 runs, activity is extinguished, chaotic, or saturated, respectively. Sandwiched in between these regions are other curving regions where a mixture of fixed-point and limit cycle attractors is observed (labeled b) or a mixture of these attractors and chaotic behaviors is seen (labeled +) during different runs. Taking the latter regions (+'s) to roughly indicate the edge of chaos, we see that the edge of chaos lies on both sides of the region in which complex/chaotic dynamics occurs, that is, for both smaller and larger values of λ_{NN} than those occurring with complex/chaotic dynamics. This observation is reinforced by Figure 3c, which indicates via grayscale how long it takes for networks to reach a fixed-point or limit cycle attractor in *fp-wp* space. Regions corresponding to the edge of chaos (shaded gray) often have networks that, while they ultimately result in fixed-point or limit cycle attractors, take progressively longer on average to reach these simple-dynamics final states as one gets closer to the chaotic region of dynamics. Finally, Figure 3d uses a grayscale to show the fraction of final states that occur most often for different values of λ_{NN}.

Figure 4 shows the analogous results for partially randomly connected networks having *k* = 10 input connections per node. While the λ_{NN} values continue to identify a similar progression of dynamical regimes as one moves from upper left to lower right in Figure 4a and b, there are substantial changes in the location and sizes of the different regions compared to Figure 3. The regions of complex and chaotic dynamics have broadened and shifted significantly, areas that we view as edge of chaos regions (+'s) are greatly expanded at the expense of the region where solely chaotic dynamics occur (Figure 4b), and there is a clearer indication that regions where fixed-point attractors occur and those where limit cycles occur are more disjoint. As Figure 4c and d illustrate, much less of the *fp-wp* space is dedicated to uniform final activity states (extinguished or saturated regions) and more to broader regions with marginal or complex dynamics.

Figure 5 shows the corresponding results for locally connected networks, again with *k* = 10. The values of λ_{NN} continue to show a progression of dynamical regions in *fp-wp* space, but there is now no region in which chaotic dynamics occurs at all, and hence no edge-of-chaos regions are identifiable. As seen in Figure 5a and b, all simulations terminate with either a uniform activity pattern (extinguished, saturated), other fixed-point attractor states, or limit cycles prior to reaching *t*_{max} = 1000 time steps. Still, the central regions of the space continue to take longer to reach their final simple attractor states (Figure 5c). Limit cycles have especially become more common (Figure 5d).

Finally, Figure 6 shows the same set of results for networks composed of both local connections (*k* = 10) and partial random connectivity (*k* = 10), having 20 input connections per node. These results are largely in between those illustrated in Figures 4 and 5 for random partial and solely local connectivity networks considered separately. Regions in which complex and chaotic dynamics occur have returned, although chaotic behaviors never occur most frequently for any specific *fp* and *wp* values. A region that resembles the edge-of-chaos (class IV) regions in the results of Figures 3 and 4 described above (+'s) has greatly expanded (Figure 6b). Compared to when local connectivity is used alone, when random connections are added like this, in many cases it took much longer to reach final fixed-point or limit cycle attractors, consistent with the idea that much of the dynamics is more similar to the edge-of-chaos regime.

### 3.3 Influence of Dynamics on Learning Effectiveness

Much of the resurgence of interest in the dynamics of randomly connected/weighted recurrent neural networks during the last decade has arisen because of the use of such networks as *reservoirs* in the processing of time series data [8, 14]. The problem in this situation is to determine a priori what conditions a reservoir should have to function optimally. As a small step in examining this problem in the context of λ_{NN} and the edge-of-chaos dynamics, we undertook a separate systematic set of computational experiments in which each of the four types of network architectures is used, for varying *fp* and *wp* values, as a reservoir for a very simple time series learning task. The recurrent network structures used as reservoirs are unchanged from those described above (100 linear threshold neurons, etc.), and the procedures used are largely unchanged, except as follows. Now each run has two additional input nodes fully connected to each reservoir node, and three output nodes that receive incoming connections from every reservoir node. The weights between the input nodes and the reservoir are arbitrarily assigned random values from 0.0 to 1.0, while the weights from the reservoir to the output nodes are assigned initial values from −1.0 to 1.0. Further, some preliminary runs indicated that for networks where activity would normally die out to zero at every node (extinguished) in the absence of external inputs, incoming activity to the reservoir nodes from the input nodes now prevents this from occurring. For this reason, a stronger fixed −1.0 bias is used at all reservoir nodes in all simulations during learning. Each output node also has a bias value initialized to be a random value between 0.0 and 1.0.

During learning, the change Δ*w*_{ij} in the weight *w*_{ij} on a connection to an output node is given by

$$\Delta w_{ij} = \eta \, (t_i - a_i) \, a_j$$

where *i* indexes an output node, *j* indexes a reservoir node (or the bias), η is a fixed learning rate (0.1 in our runs), *t*_{i} is the target (correct output value), and *a*_{i} (*a*_{j}) is the corresponding actual value of the output (reservoir) node. Starting at time step *t* = 1, all runs are allowed to run longer than in the previous experiments, for 6400 time steps per run (at which time the final Wolfram class was determined), to allow extra time for learning. For this reason and because of the increased computations per run required for weight changes, only one run is done for each pair of *fp* and *wp* values (rather than 50) to minimize computational costs, resulting in a total of 6,724 additional independent simulation runs (4 × 41 × 41). Weight changes were made incrementally, that is, after each time step of a run.
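
A minimal sketch of one incremental weight update follows. It assumes an error-correction (delta-rule) update of the form Δ*w*_{ij} = η(*t*_{i} − *a*_{i})*a*_{j}, which is consistent with the symbols defined above but is an assumption here, and it treats the bias as an extra weight attached to a constant input of 1. The function name `delta_rule_update` and its argument layout are hypothetical.

```python
def delta_rule_update(weights, reservoir, targets, outputs, eta=0.1):
    """Update reservoir-to-output weights in place and return them.

    weights[i] holds the incoming weights of output node i, one per reservoir
    node, followed by a final entry for the bias. reservoir is the current
    list of reservoir activities, targets the desired outputs t_i, and
    outputs the actual outputs a_i.
    """
    inputs = list(reservoir) + [1]  # constant input of 1 for the bias weight
    for i, (t_i, a_i) in enumerate(zip(targets, outputs)):
        err = t_i - a_i
        for j, a_j in enumerate(inputs):
            weights[i][j] += eta * err * a_j
    return weights
```

Calling this once per time step corresponds to the incremental schedule described above; weights feeding an output that is already correct are left unchanged because the error term is zero.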

Figure 7 summarizes the results of training the four types of networks examined in this study. The grayscale here indicates the length of time required to reach an end state classification of each network's dynamics, determined as in the preceding computational experiments. While this is similar to what was displayed in part (c) of the preceding four figures (Figures 3–6), the details of the resulting plots differ in each case because only a single run (rather than an average of 50 runs) is being described for each pair of *fp* and *wp* values, and because the input node activities influence the reservoir's activation states. The stars in these four images indicate those points in *fp-wp* space for which training produced a neural network that achieved 100% correct output node activation levels by the time training is complete. As can be seen by comparing the locations of these successful training outcomes with the regions that apparently represent edge-of-chaos dynamics here (and in the corresponding parts b and c of the preceding figures), learning generally but not always is most effective when a network is in the edge-of-chaos regime.

## 4 Discussion

There has been long-standing interest, which continues today, in developing measures that can characterize the dynamical behaviors of complex cellular and network systems. The measures that have been studied, and the systems that they have been applied to, are quite diverse. For example, Wuensche has introduced and studied a parameter *Z* based on the convergence of dynamical flows in cellular automata state space and suggested that it characterizes the mechanism underlying λ_{CA}, the latter being an approximation of *Z* [24, 25]. Measures have also been developed for random Boolean networks [9], including the use of Fisher information [18], to distinguish their ordered, critical, and chaotic regimes. Lyapunov exponents characterizing the expansion rates of perturbations to node activities have been defined for similar purposes [13]. Further examples and discussion can be found in a recent review [4]. Our results in this article add to this growing list of potentially useful measures of complex system dynamics.

In the work reported here, we have examined the effectiveness of a single simple measure λ_{NN} in characterizing the dynamics of recurrent neural networks of linear threshold neurons. As with the original lambda parameter λ_{CA} for cellular automata models [11], we found that λ_{NN} strongly correlates with the Wolfram classes of dynamics, extended to apply to recurrent neural networks. For fully connected and random partially connected networks, λ_{NN} orders these classes in a similar way. This is not particularly surprising in that λ_{CA} measures the fraction of state transitions (rule table entries) producing non-quiescent cell activity, while λ_{NN}, in measuring the relative amounts of excitation and inhibition in a network, also effectively determines how many local activity patterns will be mapped to “on” neurons (non-quiescent neurons). Put otherwise, λ_{NN} can be viewed as predicting the approximate fraction of local states that will lead to non-quiescent neurons in a network. Thus λ_{NN} as described here expands the range of neighborhood structures to which the lambda concept applies, from the purely local ones used in cellular automata to others, such as fully connected or randomly distributed neighborhoods.
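To make the comparison concrete, the following is a minimal sketch of how λ_{CA} can be computed as the fraction of rule table entries that map to a non-quiescent state. The function names `lambda_ca` and `elementary_rule_table` are hypothetical and chosen only for illustration, and the example is restricted, as an assumption, to two-state automata with three-cell neighborhoods under Wolfram's rule numbering. It does not compute λ_{NN}.

```python
from itertools import product


def lambda_ca(rule_table, quiescent=0):
    """Fraction of rule-table entries that map to a non-quiescent state."""
    outputs = list(rule_table.values())
    non_quiescent = sum(1 for s in outputs if s != quiescent)
    return non_quiescent / len(outputs)


def elementary_rule_table(rule_number):
    """Rule table of a two-state, three-cell-neighborhood automaton.

    Assumes Wolfram's numbering: bit i of rule_number is the next state
    for the neighborhood whose cells, read as a binary number, equal i.
    """
    table = {}
    for left, center, right in product((0, 1), repeat=3):
        index = 4 * left + 2 * center + right
        table[(left, center, right)] = (rule_number >> index) & 1
    return table


# Rule 110 maps 5 of its 8 neighborhoods to state 1, so lambda is 5/8.
lam_110 = lambda_ca(elementary_rule_table(110))
# Rule 0 maps every neighborhood to the quiescent state.
lam_0 = lambda_ca(elementary_rule_table(0))
```

With only eight rule table entries, λ_{CA} can take just nine distinct values for such automata, which illustrates how coarse the measure is when both the neighborhood and the number of cell states are small.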

While λ_{NN} orders the regions of differing dynamics in the space of neural networks in a fashion that one might expect, we found that like λ_{CA}, its absolute numerical value is of limited usefulness in predicting a priori the dynamics of a specific individual neural network. For example, given a single λ_{NN} value, two specific neural networks with the same connection architecture but different random excitatory and inhibitory weights and different initial states might end up in different types of attractors over time. If one also changes the connectivity distribution in these two networks, this ambiguity is substantially increased. For example, one distribution might fairly quickly result in a periodic attractor, while another might continue to exhibit chaotic activity over a long time span.
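The kind of network-to-network variability described above can be explored with a small simulation. The sketch below is illustrative only and is not the procedure used in this study: the names `random_weights`, `step`, `classify`, and `frac_excit` are hypothetical, and the uniform weight distribution, zero firing threshold, synchronous updating, network size, and time limit are all assumptions made for the example. It classifies a run by checking whether an activity state repeats within a time limit.

```python
import random


def random_weights(n, frac_excit, rng):
    """n x n weights; each is excitatory with probability frac_excit (assumed scheme)."""
    return [
        [rng.uniform(0.0, 1.0) if rng.random() < frac_excit else -rng.uniform(0.0, 1.0)
         for _ in range(n)]
        for _ in range(n)
    ]


def step(weights, state):
    """Synchronous update: a neuron is on when its summed input is positive."""
    return tuple(
        1 if sum(w * s for w, s in zip(row, state)) > 0 else 0
        for row in weights
    )


def classify(weights, state, t_max):
    """Return (kind, transient_length, period) for one run.

    kind is 'fixed point', 'limit cycle', or 'no repeat' when no state
    recurs within t_max steps (period is None in that case).
    """
    seen = {state: 0}  # activity state -> first time step at which it occurred
    for t in range(1, t_max + 1):
        state = step(weights, state)
        if state in seen:
            period = t - seen[state]
            kind = "fixed point" if period == 1 else "limit cycle"
            return kind, seen[state], period
        seen[state] = t
    return "no repeat", t_max, None


# Two networks with the same excitatory fraction but different random
# weights and initial states; their end states need not agree.
results = []
for seed in (1, 2):
    rng = random.Random(seed)
    w = random_weights(20, 0.5, rng)
    s0 = tuple(rng.randint(0, 1) for _ in range(20))
    results.append(classify(w, s0, t_max=1000))
```

Storing every visited state in a dictionary keeps the repeat check simple, at the cost of memory that grows with the time limit. Running the loop over many seeds at a fixed `frac_excit` gives a rough sense of how much the end state classification varies from one network to another.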

When the range of λ_{NN} values examined in fully or partially connected networks includes larger values of λ_{NN}, one finds that there can be a second edge-of-chaos region and a progression of dynamical regimes for large λ_{NN} values that are ordered in a complementary, mirror-image fashion to those seen with smaller λ_{NN} values, as illustrated in Figure 8. This second edge-of-chaos region is sometimes broader and more evident than the one associated with smaller λ_{NN} values (see parts b and c of Figures 3 and 4, for example). The significance of the second edge-of-chaos region is unclear at present, but we would expect it to exhibit the same complex behaviors of interest in artificial life, and potentially the same computational universality, as the first. To our knowledge, the upper range of λ_{CA} values has not yet been mapped out systematically for cellular automata (e.g., [20]), but a similar second region of complex edge-of-chaos dynamics has been suggested to exist for at least binary-state cellular automata should one do such a mapping [17]. For example, the patterns of widespread non-quiescent cells (suggestive of high λ_{CA} values) seen along with the propagation of particles or signals during the performance of computational tasks by some cellular automata transition functions discovered using genetic algorithms [17] suggest to us that examples of this second edge-of-chaos region may already be known. It is not clear at this time, for neural networks or cellular automata with *k* > 2 possible neuron or cell states, whether a second edge-of-chaos region will be found.

As others have observed with random networks, we found that the number of connections per node significantly affected network dynamics, although we only considered this in a limited way. The effect of the number of connections can be seen by comparing the dynamical regions of *fp-wp* space for fully connected networks (Figure 3) and those for random partially connected networks having 10 connections per node (Figure 4). Both the locations and the sizes of these dynamical regions differed substantially. However, in both cases λ_{NN} still ordered the regions in a similar fashion, from uniform through complex/chaotic and back to uniform.

Not only the number of connections per node, but also their distribution, proved to be important in determining network dynamics. This can be seen by comparing the dynamics of networks having 10 connections per node when those connections are random (Figure 4) with the dynamics when they are localized (Figure 5). As discussed earlier, the latter networks were included in this study because they are reminiscent of local connectivity in the cerebral cortex, where lateral connections are much more likely to occur between physically neighboring cortical elements than between distant ones. Even though the same number of connections was present, when they were localized in a Mexican hat configuration (excitatory connections to closest neighbors, inhibitory to next closest), neither complex nor chaotic dynamics was ever convincingly observed. Interestingly, this suggests by analogy that the relative lack of usefulness of λ_{CA} that has been observed for cellular automata having only two possible states per cell [11] may be due primarily to the locality of cell neighborhoods, rather than to their size or binary cell states as is sometimes assumed. This prediction could be easily tested experimentally. Further, when localized and random connectivity are combined, effectively producing a small-world network that more closely resembles biological cortical connectivity, complex and chaotic dynamics reappear, although somewhat damped in nature.

Among the four network architectures that we considered, the networks with 10 random connections per node are most closely related to the sparse, randomly connected and weighted networks used as reservoirs in reservoir computing models. Interestingly, when we used this type of network as a reservoir for a simple learning task, almost all successful cases of network training occurred when the reservoir network fell into the edge-of-chaos regions of the *fp-wp* space (Figure 7). This includes the edge-of-chaos region associated with larger as well as smaller values of λ_{NN}. This is consistent with evidence provided by others that networks using node activation functions other than linear threshold functions, yet still having edge-of-chaos dynamics, are most effective as reservoirs during learning [1, 2, 12, 21, 22], although there are dissenting views [26].

We conclude that λ_{NN} is a useful measure for qualitatively ordering the space of neural networks. However, its potential effectiveness in practice is limited in that there is no single crisp value of λ_{NN} that can be used to predict, for every specific neural network, the precise dynamics that will be observed. All that λ_{NN} can do is suggest a reasonable starting range of network parameters to explore in a given specific situation. This, together with λ_{NN}'s ability to qualitatively order the space of neural networks based on their dynamics, suggests that further study of λ_{NN} and related measures for other types of networks will prove to be useful. Further, λ_{NN} may have broader applicability than envisioned here in that with some neural network models (e.g., basic Hopfield networks), the desired network behavior is not found within the edge-of-chaos regime. Determining which dynamics is best for a specific situation depends on the relative need for exploration versus exploitation of the activity dynamics space. This is another area where further work may prove useful.

## Notes

We also ran simulations having *k* = 50 connections per node, but these simulations did not show any remarkably different results qualitatively (they were essentially intermediate between those for full connectivity and for *k* = 10 random partial connectivity), so for brevity we just report the full and *k* = 10 connectivity cases here.

This specific value of *t*_{max}, although large, is somewhat arbitrary, as simulations that have not repeated their activity state at this point in time could still possibly do so. It was selected as a tradeoff between running simulations a long time and maintaining computational tractability in the context of the large numbers of simulations that were run. As will be seen later (Figures 3–7), the vast majority of simulations either reached a fixed point or limit cycle within 100 time steps, or did not reach one at all. Of course, in spite of the enormous size of the state space (2^{100} possible activity states), it is finite, and this implies that any simulation must ultimately repeat its states.

## References

## Author notes

Contact author.

Computer and Information Science Department, Levine Hall, University of Pennsylvania, 3330 Walnut Street, Philadelphia, PA 19104. E-mail: seifterj@seas.upenn.edu

Department of Computer Science and UMIACS, A.V. Williams Building, University of Maryland, College Park, MD 20742. E-mail: reggia@cs.umd.edu