## Abstract

The neocortex has a remarkably uniform neuronal organization, suggesting that common principles of processing are employed throughout its extent. In particular, the patterns of connectivity observed in the superficial layers of the visual cortex are consistent with the recurrent excitation and inhibitory feedback required for cooperative-competitive circuits such as the soft winner-take-all (WTA). WTA circuits offer interesting computational properties such as selective amplification, signal restoration, and decision making. But these properties depend on the signal gain derived from positive feedback, and so there is a critical trade-off between providing feedback strong enough to support these sophisticated computations and maintaining overall circuit stability. The issue of stability is all the more intriguing when one considers that WTAs are expected to be densely distributed through the superficial layers and that they are at least partially interconnected. We consider how to reason about stability in very large distributed networks of such circuits. We approach this problem by approximating the regular cortical architecture as many interconnected cooperative-competitive modules. We demonstrate that by properly understanding the behavior of this small computational module, one can reason about the stability and convergence of very large networks composed of these modules. We obtain parameter ranges in which the WTA circuit operates in a high-gain regime, is stable, and can be aggregated arbitrarily to form large, stable networks. We use nonlinear contraction theory to establish conditions for stability in the fully nonlinear case and verify these solutions using numerical simulations. The derived bounds allow modes of operation in which the WTA network is multistable and exhibits state-dependent persistent activity.
Our approach is sufficiently general to reason systematically about the stability of any network, biological or technological, composed of networks of small modules that express competition through shared inhibition.

## 1. Introduction

Large biological and artificial systems often consist of highly interconnected assemblies of components (see Figure 1). The connectivity between these elements is often densely recurrent, resulting in various loops that differ in strength and time constant (Girard, Tabareau, Pham, Berthoz, & Slotine, 2008; Slotine & Lohmiller, 2001; Hopfield, 1982; Amari, 1977; Douglas, Koch, Mahowald, Martin, & Suarez, 1995; Liu, Wang, & Liu, 2006). This organization is characteristic of the neocortex, where the statistics of connectivity between neurons indicate that recurrent connections are a fundamental feature of cortical networks (Douglas & Martin, 2004; Binzegger, Douglas, & Martin, 2004; Douglas et al., 1995). These recurrent connections are able to provide the excitatory and inhibitory feedback necessary for computations such as selective amplification, signal restoration, and decision making. But this recurrence poses a challenge for the stability of a network (Slotine & Lohmiller, 2001; Tegnér, Compte, & Wang, 2002; Cohen & Grossberg, 1983). Connections must be neither too strong (leading to instability) nor too weak (resulting in inactivity) for the network to function properly (Koch & Laurent, 1999). In addition, connections are continually changing as a function of learning or are accumulated semirandomly throughout development or evolution. How, then, do these networks ensure stability? Artificial neural networks can rely on their bounded (e.g., sigmoid) activation functions, but biological neurons do not usually enter saturation. Instead, their stability depends crucially on the balance between inhibition and excitation (Hahnloser, Sarpeshkar, Mahowald, Douglas, & Seung, 2000; McCormick & Contreras, 2001).
In this letter, we explore how the stability of such systems is achieved, not only because we wish to understand the biological case but also because of our interest in building large, neuromorphic electronic systems that emulate their biological counterparts (Indiveri, Chicca, & Douglas, 2009).

Reasoning about the computational ability as well as the stability of neural systems usually proceeds in a top-down fashion by considering the entire system as a single entity able to enter many states (as e.g., in Hopfield networks: Izhikevich, 2007; Hopfield, 1982; Hertz, Krogh, & Palmer, 1991). Unfortunately, the number of states that must be considered grows exponentially with the size of the network, and so this approach quickly becomes intractable. For this reason stability analysis of large-scale simulations of the brain is proving difficult (Izhikevich & Edelman, 2008; Ananthanarayanan, Esser, Simon, & Modha, 2009; Markram, 2006).

We present an alternative approach that uses bottom-up reasoning about the modules that constitute the network. The idea is that the stability of the modules should be conferred on the networks that they compose. Of course, simply combining several modules, each of which is stable in isolation, to form a larger system does not necessarily imply that the new system is stable (Slotine & Lohmiller, 2001; Slotine, 2003). However, we explore the possibility that when the modules employ a certain kind of stability mechanism, then they are also able to confer stability on the supersystem in which they are embedded. We show that modules that achieve their own stability by observing constraints on their inhibitory-excitatory balance can be stable alone as well as in combination.

We have chosen to examine this problem in networks of WTA circuits (Yuille & Geiger, 2003), because these circuits are consistent with the observed neuroanatomical connections of cortex (Douglas & Martin, 2004; Binzegger et al., 2004). Moreover, the WTA is interesting because it can implement useful computational operations such as signal restoration, amplification, max-like winner selection (i.e., decision making), or filtering (Maass, 2000; Hahnloser, Douglas, Mahowald, & Hepp, 1999; Douglas & Martin, 2007; Yuille & Geiger, 2003). And combining multiple WTAs in a systematic manner extends these possibilities by allowing persistent activity and state-dependent operations (Rutishauser & Douglas, 2009; Neftci, Chicca, Indiveri, Slotine, & Douglas, 2008; Neftci, Chicca, Indiveri, Cook, & Douglas, 2010).

Typically, WTA networks operate in a high-gain regime in which their behavior is nonlinear (e.g., selective amplification). While the stability of a WTA can be analyzed by linearizing around the possible steady states, rigorous analysis that takes the nonlinearities into account is difficult using linear analysis tools (Strogatz, 1994; Izhikevich, 2007; Hahnloser, 1998; Hahnloser, Seung, & Slotine, 2003). Instead, we use nonlinear contraction analysis (Lohmiller & Slotine, 1998, 2000; Slotine, 2003) to investigate the stability of WTA networks. The concept of contraction generalizes stability analysis for linear systems, allowing circuits to be analyzed in the fully nonlinear case without linearized approximations (Lohmiller & Slotine, 1998).

A nonlinear time-varying system is said to be contracting if initial conditions or temporary disturbances are forgotten exponentially fast. Thus, any two initial conditions will result in the same system trajectory after exponentially fast transients. Importantly, the properties of contracting systems are preserved when they are combined to form a larger system (Slotine, 2003). Also, contraction allows parameter regimes that are not unduly restrictive. For instance, it can describe strong feedback loops, and ranges of parameters can be found where the system is both contracting and operating in a highly nonlinear regime. In addition, contraction analysis can deal with systems that are multistable (expressing several stable attractors or behaviors), where it guarantees exponentially fast convergence to one of the possible attractors. Such systems are capable of rich state-dependent computations while at the same time being contracting. We have used contraction analysis to reason about the permissible kinds and strengths of connectivity within and between WTA modules embedded in a network. If the individual modules are contracting, then observing our constraints is sufficient to guarantee stability (boundedness) of a system composed of such modules. Thus, contraction analysis permits the derivation of simple bounds on the network parameters that will guarantee exponential convergence to equilibria in the fully nonlinear case. This approach enables the systematic synthesis of large circuits, which are guaranteed to be stable if the set of bounds is observed. While we demonstrate the feasibility of our approach in the case of WTA-type networks, our approach is not restricted to such networks. It can be applied as well to any simple nonlinear circuit that is capable of nonlinear computational operations.

## 2. Results

Our results are organized as follows. First, we introduce the basic organization of the WTA circuit. Second, we apply contraction theory to analyze the stability of networks of WTA circuits. We derive analytically the bounds on the parameters of the network that permit it to operate properly in either a soft or hard WTA configuration. We conclude by performing numerical simulations to confirm that the analytical bounds are valid and not unnecessarily restrictive.

### 2.1. The Winner-Take-All Network.

The WTA network **x** consists of *N* − 1 excitatory units *x*_{1..N−1} and one inhibitory unit *x*_{N} (see Figure 2A). Each excitatory unit receives recurrent input from itself (α_{1}) and its neighbors (α_{2,3,…}). For simplicity, only self-recurrence is considered here (α_{2,3,…} = 0), but similar arguments obtain when recurrence from neighboring units is included (see section 2.6). The inhibitory unit receives input from each excitatory unit with weight β_{2} and projects to each excitatory unit with weight β_{1}. The dynamics of each unit are described by equations 2.1 and 2.2. The firing rate activation function *f*(*x*) is the nonsaturating rectification nonlinearity max(0, *x*). The dynamics of this network, and in particular the boundedness of its trajectories, depend on the balance of excitation and inhibition:

$$\tau \dot{x}_i + G x_i = f\big(I_i(t) + \alpha x_i - \beta_1 x_N - T_i\big), \qquad i = 1, \ldots, N-1, \tag{2.1}$$

$$\tau \dot{x}_N + G x_N = f\Big(\beta_2 \sum_{j=1}^{N-1} x_j - T_N\Big), \tag{2.2}$$

where *I*_{i}(*t*) is the external input to unit *i*. All thresholds *T*_{i} > 0 are constant and equal. *G* > 0 is a constant that represents the load (conductance) and is assumed to be *G* = 1, unless stated otherwise. All parameters are positive: α > 0, β_{1} > 0, β_{2} > 0. We refer to such a system as either a WTA or a recurrent map throughout this letter; map denotes a WTA, not a discrete dynamical system.
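As a concrete illustration (not part of the letter), equations 2.1 and 2.2 can be integrated directly. The sketch below assumes example parameter values (α = 1.3, β_{1} = 3, β_{2} = 0.3, *T* = 1, *G* = 1), chosen to lie in the hard-WTA regime discussed later:

```python
import numpy as np

def simulate_wta(I, alpha=1.3, beta1=3.0, beta2=0.3, T=1.0, G=1.0,
                 tau=0.01, dt=1e-4, steps=20000):
    """Euler-integrate equations 2.1 and 2.2.

    x[:-1] are the N-1 excitatory units, x[-1] is the inhibitory unit,
    and f(x) = max(0, x) is the rectification nonlinearity.
    """
    f = lambda v: np.maximum(v, 0.0)
    x = np.zeros(len(I) + 1)
    for _ in range(steps):
        exc = f(I + alpha * x[:-1] - beta1 * x[-1] - T)  # eq. 2.1
        inh = f(beta2 * np.sum(x[:-1]) - T)              # eq. 2.2
        x += dt * (np.append(exc, inh) - G * x) / tau
    return x

# Two excitatory units with slightly different inputs: the unit with the
# larger input wins and is amplified; the loser is suppressed to zero.
x = simulate_wta(I=np.array([2.0, 2.2]))
print(x)  # settles near (0, 7, 1.1) for these assumed parameters
```

Note the selective amplification: an input of 2.2 is mapped to a winner activity of about 7, while the losing unit is silenced rather than merely attenuated.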

### 2.2. Combining Several WTAs.

A single WTA network can implement some useful computational operations (see section 1). However, more sophisticated computational operations can be achieved by combining several WTAs (Rutishauser & Douglas, 2009) through sparse and selective connections between some of the excitatory units of the various WTAs. We consider two ways of combining WTAs: bidirectional and unidirectional. A bidirectional (and symmetric) connection establishes a recurrent connection between two WTAs. A unidirectional connection provides the activity of one WTA as input to a second WTA (feedforward). The inhibitory units neither receive input from nor project to any other map. Thus, activity between maps is always excitatory (positive). This arrangement is motivated by the long-range connections in cortex, which are predominantly excitatory (Douglas & Martin, 2004; Douglas et al., 1995), although they can contact both excitatory and inhibitory targets. While long-range inhibitory projections exist in cortex as well, we focus exclusively on excitatory long-range connectivity in this letter.

These two kinds of connectivity are motivated by our previous finding that three WTAs connected by a combination of bi- and unidirectional connections are sufficient to implement state-dependent processing in the form of an automaton (Rutishauser & Douglas, 2009). An automaton consists of two components: states and transitions between states. By connecting two maps bidirectionally, the network is able to maintain one region of persistent activity in the absence of external input, and this winning region represents the current state of the automaton (state dependence is a form of memory, and we thus refer to these localized regions of persistent activity as memory states). Transition circuits allow the network to select a new winner, conditioned on the current state as well as an external input. The implementation of these transitions requires a third WTA (to select the most appropriate transition) as well as unidirectional connections between the maps that drive the transition (see below). In this letter, we explore what constraints the presence of these additional connections poses on the stability of this and larger (more than three WTAs) networks.

First, consider two identical WTAs **x** and **y** (see Figure 2C). Each WTA consists of *N* = 3 units (2 excitatory, 1 inhibitory). The only connection between the two networks is γ, which symmetrically (bidirectionally) connects *x*_{2} and *y*_{2}. Thus, this network can represent only one state.
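To illustrate the memory state that such bidirectional coupling supports, the following sketch (our own; all parameter values, including the coupling γ = 0.2, are assumed for illustration) simulates two coupled 3-unit WTAs. A transient input to *x*_{2} drives the pair *x*_{2}, *y*_{2} into a winning state that persists after the input is removed:

```python
# Assumed illustration parameters: single-WTA values as before, plus a weak
# bidirectional coupling gamma between x2 and y2 (hypothetical choice).
alpha, beta1, beta2, gamma, T = 1.3, 3.0, 0.3, 0.2, 1.0
f = lambda v: max(v, 0.0)  # rectification nonlinearity

def step(x, y, Ix, dt=0.01):
    """One Euler step for two coupled WTAs x and y (units: x1, x2, inhibitory x3)."""
    dx = [-x[0] + f(alpha * x[0] - beta1 * x[2] - T),
          -x[1] + f(Ix + alpha * x[1] + gamma * y[1] - beta1 * x[2] - T),
          -x[2] + f(beta2 * (x[0] + x[1]) - T)]
    dy = [-y[0] + f(alpha * y[0] - beta1 * y[2] - T),
          -y[1] + f(alpha * y[1] + gamma * x[1] - beta1 * y[2] - T),
          -y[2] + f(beta2 * (y[0] + y[1]) - T)]
    x = [xi + dt * d for xi, d in zip(x, dx)]
    y = [yi + dt * d for yi, d in zip(y, dy)]
    return x, y

x = [0.0] * 3
y = [0.0] * 3
for t in range(5000):
    x, y = step(x, y, Ix=2.0 if t < 2500 else 0.0)  # input pulse, then silence

# After the input is withdrawn, x2 and y2 remain active: the bidirectional
# coupling sustains a localized region of persistent activity (a memory state).
print(x[1], y[1])
```

With these assumed values, the pair settles into a self-sustained symmetric state, while the zero state remains a second attractor (the system is multistable): a shorter or weaker pulse leaves the network silent.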

Second, we consider unidirectional connections between WTAs. These are feedforward connections between two maps—for example, when units on map **x** provide input to units on map **z**. However, such feedforward connections can result in (indirect) recurrence—for example, when map **z** in turn provides input to **x**. Thus, analysis of unidirectional connections requires that we consider three maps **x**, **y**, and **z** simultaneously. The two maps **x** and **y** are connected bidirectionally as shown above, whereas **z** contains units that receive external input as well as input from **y** and also provide output to **x** (see Figure 2D). In this way, strong enough activation of units on **z** can bias the ongoing competition in the network and thereby induce a switch to a new winner (so changing state).

A given unit on **z** can either receive input from a different unit than it projects to (so providing a transition from one state to another), or it can receive from and project to the same state. In Figure 2D, *z*_{1} is an example of a unit that initiates a transition from state 1 to 2, whereas *z*_{2} receives input from and projects to state 2. Thus, *z*_{2} establishes an additional loop of recurrent feedback and is the more restrictive case when considering stability.

The term *T*_{TNj} is an additional constant threshold for activation of the transition unit, so that in the absence of an external input *I*_{TNj}, the transition unit remains inactive (*z*_{j} = 0). The external input *I*_{TNj} can be used to selectively initiate a transition. An appropriate choice of the threshold *T*_{TNj} ensures that the transition unit *z*_{j} is active only when both the external input *I*_{TNj} > 0 and the input from the projecting map ϕ*y*_{j} > 0 are present. The activation of *z*_{j} is thus state dependent, because it depends on both an external input and the current winner of the map.

Now we explore what constraints the presence of γ > 0 and ϕ > 0 imposes on stability. We use contraction analysis to show that if the single WTAs are contracting, γ and ϕ can be used (with an upper bound) to combine WTAs arbitrarily without compromising the stability of the aggregate system. Since we base our arguments on contraction analysis, we first introduce its basic concepts.

### 2.3. Contraction Analysis.

Essentially, a nonlinear time-varying dynamic system will be called *contracting* if arbitrary initial conditions or temporary disturbances are forgotten exponentially fast, that is, if trajectories of the perturbed system return to their unperturbed behavior with an exponential convergence rate. It turns out that relatively simple algebraic conditions can be given for this stability-like property to be verified and that this property is preserved through basic system combinations and aggregations.

A nonlinear contracting system has the following properties (Lohmiller & Slotine, 1998, 2000; Slotine, 2003; Wang & Slotine, 2005):

- Global exponential convergence and stability are guaranteed.
- Convergence rates can be explicitly computed as eigenvalues of well-defined Hermitian matrices.
- Combinations and aggregations of contracting systems are also contracting.
- Robustness to variations in dynamics can be easily quantified.

Before stating the main contraction theorem, recall first the following. The symmetric part of a matrix **A** is **A**_{s} = ½(**A** + **A**^{T}). A complex square matrix **A** is Hermitian if **A**^{T} = **A***, where ^{T} denotes matrix transposition and * complex conjugation. The Hermitian part **A**_{H} of any complex square matrix **A** is the Hermitian matrix **A**_{H} = ½(**A** + **A***^{T}). All eigenvalues of a Hermitian matrix are real numbers. A Hermitian matrix **A** is said to be positive definite if all its eigenvalues are strictly positive. This condition implies in turn that for any nonzero real or complex vector **x**, **x***^{T}**Ax** > 0. A Hermitian matrix **A** is called negative definite if −**A** is positive definite.
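These definitions translate directly into code. A minimal sketch (ours) of the Hermitian part and of the definiteness test used throughout this section:

```python
import numpy as np

def hermitian_part(A):
    """A_H = (A + A*^T) / 2; for a real matrix this is the symmetric part A_s."""
    return (A + A.conj().T) / 2

def is_negative_definite(A):
    """The Hermitian part is negative definite iff all its (real) eigenvalues < 0."""
    return bool(np.all(np.linalg.eigvalsh(hermitian_part(A)) < 0))

A = np.array([[-2.0, 1.0],
              [0.0, -1.0]])
# The Hermitian (here: symmetric) part is [[-2, 0.5], [0.5, -1]]; its
# eigenvalues are real and strictly negative.
print(is_negative_definite(A))  # True
```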

A Hermitian matrix **A**(**x**, *t*) dependent on state or time will be called uniformly positive definite if there exists a strictly positive constant such that for all states **x** and all *t* ⩾ 0, the eigenvalues of **A**(**x**, *t*) remain larger than that constant. A similar definition holds for uniform negative definiteness.

Consider a general dynamic system $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, t)$, with **f** a smooth nonlinear function. The central result of contraction analysis, derived in Lohmiller and Slotine (1998) in both real and complex forms, can be stated as follows.

Denote by **J** = ∂**f**/∂**x** the Jacobian matrix of **f** with respect to **x**. Assume that there exists a complex square matrix **Θ**(**x**, *t*) such that the Hermitian matrix **Θ**(**x**, *t*)*^{T}**Θ**(**x**, *t*) is uniformly positive definite and the Hermitian part **F**_{H} of the matrix

$$\mathbf{F} = \big(\dot{\mathbf{\Theta}} + \mathbf{\Theta}\,\mathbf{J}\big)\,\mathbf{\Theta}^{-1}$$

is uniformly negative definite. Then all system trajectories converge exponentially to a single trajectory. The system is said to be contracting, **F** is called its generalized Jacobian, and **Θ**(**x**, *t*)*^{T}**Θ**(**x**, *t*) its contraction metric. The contraction rate is the absolute value of the largest eigenvalue of **F**_{H} (the one closest to zero, although still negative): λ = |λ_{max}(**F**_{H})|.

In the linear time-invariant case, a system is globally contracting if and only if it is strictly stable, and **F** can be chosen as a normal Jordan form of the system, with **Θ** a real matrix defining the coordinate transformation to that form (Lohmiller & Slotine, 1998). Alternatively, if the system is diagonalizable, **F** can be chosen as the diagonal form of the system, with **Θ** a complex matrix diagonalizing the system. In that case, **F**_{H} is a diagonal matrix composed of the real parts of the eigenvalues of the original system matrix.

Note that the activation function *f*(*x*) = max(*x*, 0) (see equations 2.1 and 2.2) is not continuously differentiable, but it is continuous in both space and time, so that contraction results can still be directly applied (Lohmiller & Slotine, 2000). Furthermore, the activation function is piecewise linear, with a derivative of either 0 or 1. This simple property is exploited in the following by inserting dummy terms *l*_{j}, which take the value 0 or 1 according to the derivative of *f*(*x*) at the unit's current activation. For a single WTA, there is a total of *N* dummy terms.

### 2.4. Stability of a Single WTA.

We begin the contraction analysis by considering a single WTA. The conditions obtained in this section guarantee that the dynamics of the single map converge exponentially to a single equilibrium point for a given set of inputs. In fact, the WTA has several equilibrium points (corresponding to each possible winner), but contraction analysis shows that for a given input, a particular equilibrium will be reached exponentially fast, while all others are unstable. Thus, as long as the network does not start out exactly at one of the unstable equilibria (which is impossible in practice), it is guaranteed to converge to the unique equilibrium point (the winner) determined by the given set of inputs. Our strategy is twofold. First, we show that the WTA is contracting only if one of the excitatory units is active (the “winner” in a hard-WTA configuration). Second, we show that in the presence of multiple active excitatory units, the dynamics diverge exponentially from the nonwinning states.

For a WTA with two excitatory units, there are two possible winners (*x*_{1} or *x*_{2}), represented by *l*_{1} = 1, *l*_{2} = 0 or *l*_{1} = 0, *l*_{2} = 1, respectively (*l*_{3} = 1 for both). Assuming the second unit is the winner, the Jacobian becomes **J**_{W2} (equation 2.12). Our approach consists in first finding a constant metric transformation **Θ** describing the contraction properties of the simple Jacobian, equation 2.12, for appropriate parameter ranges (a process equivalent to standard linear stability analysis), and then using the same metric transformation to assess the contraction properties of the general nonlinear system.

Let us first find ranges for the parameters α, β_{1}, β_{2} such that **J**_{W2} is contracting. This is the case if τ**ΘJ**_{W2}**Θ**^{−1} < **0**, where **Θ** defines a coordinate transform into a suitable metric. The left-hand side is the generalized Jacobian **F** = **ΘJ**_{W2}**Θ**^{−1} (see section 2.3). Based on the eigendecomposition **J**_{W2} = **QΛQ**^{−1}, where the columns of **Q** correspond to the eigenvectors of **J**_{W2}, define **Θ** = **Q**^{−1}. This transformation represents a change of basis that diagonalizes **F** (Horn, 1985). This choice of a constant invertible **Θ** also implies that **Θ***^{T}**Θ** is positive definite (since **x***^{T}**Θ***^{T}**Θx** = ||**Θx**||^{2}, ∀**x**).

For *G* = 1, the Hermitian part of **F** (see equation 2.10) is negative definite if the conditions in equations 2.13 to 2.15 hold.^{1} Note that these conditions systematically relate α to the inhibitory loop gain β_{1}β_{2} and also permit α > 1 (see below for discussion).
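The construction above can be checked numerically. The sketch below (our own; the parameter values are assumed, and the Jacobian is written out for the winner configuration *l*_{1} = 0, *l*_{2} = *l*_{3} = 1 implied by equations 2.1 and 2.2 with *G* = τ = 1) builds **Θ** = **Q**^{−1} from the eigendecomposition and verifies that the Hermitian part of **F** = **ΘJΘ**^{−1} is negative definite:

```python
import numpy as np

# Assumed parameters (G = tau = 1) within the bounds discussed in the text.
alpha, beta1, beta2, G = 1.3, 3.0, 0.3, 1.0

# Jacobian for the winner configuration l1 = 0, l2 = l3 = 1 of
# equations 2.1 and 2.2 (loser off, winner and inhibition active).
J = np.array([[-G, 0.0, 0.0],
              [0.0, alpha - G, -beta1],
              [beta2, beta2, -G]])

# Theta = Q^{-1} diagonalizes J, so F = Theta J Theta^{-1} is diagonal and
# its Hermitian part carries the real parts of the eigenvalues of J.
eigvals, Q = np.linalg.eig(J)
Theta = np.linalg.inv(Q)
F = Theta @ J @ Q          # inv(Theta) = Q
F_H = (F + F.conj().T) / 2

# A uniformly negative definite Hermitian part means the winner
# configuration is contracting in this metric.
print(np.max(np.linalg.eigvalsh(F_H)))  # -0.35 for these parameters
```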

So far, we have established the conditions under which the system is contracting when the inhibitory unit (*l*_{3} = 1) and one excitatory unit are active (here, *l*_{2} = 1 and *l*_{1} = 0, but the same bounds are valid for *l*_{2} = 0 and *l*_{1} = 1). The next key step is to use the same metric to study arbitrary terms *l*_{2,3} with *l*_{1} = 0, so as to show that the system is contracting for all combinations of *l*_{2,3} except the combinations from which we want the system to be exponentially diverging. In the same metric **Θ**, and using the Jacobian equation 2.11 with *l*_{1} = 0, the Hermitian part of **F** takes the form shown in equation 2.16 (with *i*^{2} = −1). Note that equation 2.16 was simplified assuming the bound given in equation 2.13. We require **F**_{H} < 0. A matrix of the form $\begin{pmatrix}\lambda_1 & r\\ \bar{r} & \lambda_2\end{pmatrix}$ is negative definite if λ_{i} < 0 and |*r*|^{2} < λ_{1}λ_{2} (Wang & Slotine, 2005). For equation 2.16, this results in condition 2.17. The bounds 2.13 to 2.15 on the parameters satisfy this condition whenever *l*_{2} = *l*_{3}, and when *l*_{2} = 0, *l*_{3} = 1. As expected, for the case *l*_{2} = 1, *l*_{3} = 0 (only excitation active), the system is not contracting for α > 1. Rather, we require that in this case the system is exponentially diverging, as we detail below.

Now consider the case where all units are active: *l*_{1,2,3} = 1. For the network to be a hard WTA, we require that this configuration is exponentially diverging. The dynamics of interest are those of the excitatory units, so that, following Pham and Slotine (2007), the system is exponentially diverging away from this state if condition 2.18 holds, where **V** is the projection matrix shown in equation 2.19. The constraint 2.18 ensures that the system diverges from certain invariant subspaces where **Vx** is constant. For **V** as shown in equation 2.19, each row represents one excitatory unit. If condition 2.18 is satisfied, the network is guaranteed to diverge exponentially away from this equilibrium.

The above conditions were derived based on the system of inequalities λ_{min}>0 given by the eigenvalues of the Hermitian part of the left-hand side of equation 2.18. The same calculation using instead *l*_{1,2} = 1 and *l*_{3} = 0 (excitation, but no inhibition) results in the same bounds for exponential divergence from the state of no inhibition.

In summary, the key quantity is the inhibitory loop gain β_{1}β_{2}. The above conditions establish lower and upper bounds on the parameters for global exponential convergence to a unique winner for a given set of inputs.

Under these constraints (in particular, on the excitatory loop strength α), the system is globally convergent yet always selects a winner. The system does not depend on saturation to acquire this stability. Also, the constraints guarantee that the system does not oscillate, apart from transient oscillations during convergence. This has been established by demonstrating that the system is either contracting or exponentially diverging for any subset of the dummy terms *l*_{1,2,3}. Note that the system is contracting in the same metric **Θ** for all contracting subsets. While we defined the metric **Θ** for a particular winner, the same constraints result from defining a similar **Θ** for any of the other possible winners. Similar conditions can be derived for the case where the winner is represented by multiple active units, such as when a bump of activity is introduced by adding excitatory nearest-neighbor connections α_{2} (Rutishauser & Douglas, 2009; Douglas & Martin, 2007; see section 2.6). Numerically, these conditions permit a wide range of parameters. For example, for β_{1} = 3 and β_{2} = 0.3, the permissible range is 1 < α < 1.89. Under these conditions, the system operates in a highly nonlinear regime (where the loop gain can be up to 50).
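The quoted range can be probed numerically. The following sketch (ours; it uses the winner-configuration Jacobian implied by equations 2.1 and 2.2 with *G* = τ = 1) scans α for β_{1} = 3, β_{2} = 0.3 and reports where all eigenvalue real parts are negative. This is only a linearized proxy for the contraction bounds, so the transition it finds (near α ≈ 1.9) is close to, but not identical to, the analytical 1.89; the lower bound α > 1 comes from the exponential-divergence requirement rather than from this eigenvalue test:

```python
import numpy as np

beta1, beta2, G = 3.0, 0.3, 1.0

def max_re_eig(alpha):
    # Winner configuration (l1 = 0, l2 = l3 = 1) with G = tau = 1.
    J = np.array([[-G, 0.0, 0.0],
                  [0.0, alpha - G, -beta1],
                  [beta2, beta2, -G]])
    return np.max(np.linalg.eigvals(J).real)

alphas = np.arange(1.0, 2.2, 0.01)
# Small tolerance guards against numerical noise at the marginal boundary.
stable = [a for a in alphas if max_re_eig(a) < -1e-9]
print(round(min(stable), 2), round(max(stable), 2))
```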

The analysis above focused on the regime where α > *G* (with *G* = 1). In this mode, the system acts as a highly nonlinear WTA, always selecting a binary winner. What if the system operates with α < 1? In this configuration, the winner unit is still contracting (see equation 2.13).

What happens when all units are active (*l*_{1,2,3} = 1) and α < 1? Defining **Θ** based on the Jacobian **J** with all units on and solving **ΘJΘ**^{−1} < **0**, we find that this system is contracting for α < 1. The system where all excitatory units are active is thus contracting under this condition, implying that the system is in a soft WTA configuration. While the system still selects a winning unit, the activity of the losing unit is not completely suppressed. Also note that in this configuration, no persistent activity is possible in the absence of external input. A graphical illustration of both modes of operation is shown in Figure 3.
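The two regimes can be contrasted with a quick eigenvalue check of the all-active Jacobian (a sketch of ours, with assumed parameters β_{1} = 3, β_{2} = 0.3, and *G* = τ = 1):

```python
import numpy as np

beta1, beta2, G = 3.0, 0.3, 1.0

def all_active_jacobian(alpha):
    # All dummy terms l = 1: both excitatory units and the inhibition active.
    return np.array([[alpha - G, 0.0, -beta1],
                     [0.0, alpha - G, -beta1],
                     [beta2, beta2, -G]])

results = {a: np.max(np.linalg.eigvals(all_active_jacobian(a)).real)
           for a in (0.5, 1.5)}
print(results)
# alpha = 0.5: all real parts < 0, so the all-active state is stable
#              (soft WTA: the loser is attenuated but not silenced).
# alpha = 1.5: the difference mode x1 - x2 has eigenvalue alpha - G = 0.5 > 0,
#              so the all-active state is unstable and a single winner emerges.
```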

Finally, note that the time constant τ was assumed to be equal for all units. In this case, the numerical value of τ does not influence the bounds (since τ > 0 multiplies the entire Jacobian; see equation 2.10). Similar conditions can be derived when the time constants are not equal (see appendix C), in which case only the ratio of the time constants is relevant.

### 2.5. Stability of a Single WTA of Arbitrary Size.

Can this analysis be extended to maps of arbitrary size? While the approach in the previous section can be applied to maps of any size, an alternative approach is to first define contraction for a map consisting only of a single excitatory and inhibitory unit and then extend it recursively by one unit at a time, while showing that this extension does not change the contraction properties. This approach is illustrated in Figure 4.

This system (see Figure 4A) is contracting if the conditions shown in equations 2.23 to 2.25 for the parameters α, β_{1}, β_{2} hold. The approach used to derive the bounds is equivalent to the one described above. First, define **Θ** = **Q**^{−1}, where **Q** is based on the eigendecomposition of **A** with *l*_{1,2} = 1. Then define the valid parameter ranges based on equation 2.10. The same permissible parameters result (see equations 2.13–2.14).

The system is then extended by one excitatory and one inhibitory unit (see Figure 4), giving the subsystem Jacobians **A**_{1,2} = **A** after adjusting the *l*_{i} terms appropriately (*l*_{1,2} and *l*_{3,4} for **A**_{1} and **A**_{2}, respectively). Similarly, **G**_{1,2} = **G** for *l*_{2} and *l*_{4}, respectively. Note that combining the two systems in this way adds only two (strictly positive) terms to the equations describing the dynamics of the inhibitory neurons. Thus, inhibition in this new system can only be larger than in the smaller system. Therefore, if the smaller system is contracting (as shown above), the combined system must also be contracting (shown in the next section).

The contraction of the extended system is analyzed using the generalized Jacobian **F** = **ΘJΘ**^{−1}, defined for a system with either *n* units or *n* + 1 units. A system with *n* = 1 units has Jacobian **A**_{1} and is contracting, as shown previously. The condition for it to be stable, **F**_{s} < **0**, requires (for the real part only) the bound in equation 2.30. A system with *n* = 2 units has Jacobian **J** (see equation 2.27) and is stable if equation 2.29 holds, which requires the bound in equation 2.31. Comparing equations 2.30 and 2.31 reveals that adding a unit *n* + 1 to a system of *n* units does not change the conditions for contraction to a single winner. Thus, if the recurrent map consisting of *n* excitatory units is contracting, the system of *n* + 1 units is also contracting. By recursion, this proof can be applied to maps of arbitrary size.
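The recursion argument can be illustrated numerically (our own sketch, with assumed parameters and, as the text permits, the inhibitory units collapsed into one): the largest eigenvalue real part of the single-winner Jacobian is unchanged as excitatory units are added.

```python
import numpy as np

# Assumed parameters within the single-WTA bounds (G = tau = 1).
alpha, beta1, beta2, G = 1.3, 3.0, 0.3, 1.0

def winner_jacobian(n_exc):
    """Jacobian for n_exc excitatory units plus one (collapsed) inhibitory
    unit, with a single active winner (unit 0); all other l terms are 0."""
    J = -G * np.eye(n_exc + 1)
    J[0, 0] = alpha - G   # self-excitation of the winner
    J[0, -1] = -beta1     # inhibition onto the winner
    J[-1, :-1] = beta2    # the inhibitory unit pools all excitatory units
    return J

# The largest real part is identical for every size: adding excitatory units
# does not change the stability condition for the single-winner configuration.
for n in (2, 3, 10):
    print(n, np.max(np.linalg.eigvals(winner_jacobian(n)).real))
```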

What if multiple units on the map are active? The above conditions show that a single winner is contracting on a map of arbitrary size. In a hard WTA configuration, the system should always emerge with a single winner. We have previously shown that our system has this property when α > 1 (see equation 2.18). Here, we extend this argument to maps of arbitrary size. Note that only the *l*_{i} for the excitatory units can be switched inactive; all inhibitory neurons (since they all represent the same signal) always have *l*_{i} = 1.

The smallest system to consider consists of *n* = 2 excitatory units (since a system of *n* = 1 does not have competition). The goal is to find conditions that enforce a single winner for *n* + 1 units. For the *n* = 2 system (**J** with all *l*_{1,2,3,4} = 1), enforcing **VJV**^{T} > **0** (see equation 2.18) with the projection **V** given in equation 2.32 yields conditions for this configuration (both units on, that is, *l*_{1} = *l*_{3} = 1) to be exponentially unstable (thus converging to another subset of the *l*_{i} terms). Similar to equation 2.19, the system diverges from invariant subspaces where **Vx** is constant. For the projection, equation 2.32, α*x*_{1} − β_{1}*x*_{2} − β_{1}*x*_{4} = 0 and α*x*_{3} − β_{1}*x*_{2} − β_{1}*x*_{4} = 0 define the equilibrium. If condition 2.18 is satisfied, the network is guaranteed to diverge exponentially away from this equilibrium.

Extending the system by one excitatory unit and repeating the same argument with a **V** constructed in analogy to equation 2.32 shows that **VJV**^{T} > **0** for this system as well, provided the corresponding bounds hold. Note that a sufficient solution continues to require α > 1. We have thus shown, for α > 1, that a system with *n* = 2 as well as *n* = 3 can have only one active unit. By recursion, the same argument can be used to show that any system of *n* + 1 units cannot have a subset of *i* units (where 1 < *i* ⩽ *n*) active. Any such system thus always converges to a single winner, and it has these properties if the parameters are within the ranges shown in equations 2.20 to 2.22.

For the purposes of this proof, we used additional inhibitory units, one for each excitatory unit. Note that this arrangement is for mathematical convenience only. In an implemented system, these units can be collapsed into a single unit (or several, to implement local inhibition). Collapsing all inhibitory units into one does not change the dynamics of the system, because all inhibitory units have the same activity (and the same derivatives) at all times.

#### 2.5.1. Example.

This example shows how to apply the approach outlined above in order to calculate the permissible range of parameters for a toy recurrent map consisting of one excitatory and one inhibitory unit (see Figure 4A), whose Jacobian is **A** (see equation 2.26) with *l*_{1,2} = 1. Our intention is to illustrate in detail the procedural aspects involved in the calculation.

Due to the choice of the metric Θ, only terms on the diagonal remain. The network is contracting if the Hermitian part (see equation 2.37) is negative definite. A sufficient condition for this to be the case is *Re*(λ_{min}(**F**_{s})) < 0. Solving this system of inequalities results in the conditions shown in equations 2.13 to 2.15.
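The diagonalizing effect of such a coordinate transform can be checked numerically. The sketch below is illustrative only: the text's own Θ is not reproduced in this excerpt, and the parameter values (α = 1.3, β_{1} = 2, β_{2} = 0.25, taken from the later simulations) are our assumption. For these parameters the eigenvalues of **A** are complex, so **A** is a scaled rotation in a suitable basis, and that basis supplies one metric for which only diagonal terms of the Hermitian part remain:

```python
import numpy as np

alpha, beta1, beta2 = 1.3, 2.0, 0.25          # assumed example values
A = np.array([[alpha - 1.0, -beta1],
              [beta2,       -1.0]])

# The eigenvalues of A are complex here, so A acts as a scaled rotation in
# a suitable basis; the real and imaginary parts of an eigenvector provide
# that basis and hence a metric Theta.
lam, vecs = np.linalg.eig(A)
v = vecs[:, 0]
P = np.column_stack([v.real, v.imag])          # A @ P = P @ (rotation form)
Theta = np.linalg.inv(P)

F = Theta @ A @ np.linalg.inv(Theta)           # generalized Jacobian
Fs = 0.5 * (F + F.T)                           # Hermitian part
# Only diagonal terms remain; both equal Re(lam) = (alpha - 2)/2 < 0.
print(np.round(Fs, 6))
```

With this transform, negative definiteness of the Hermitian part reduces to checking the (identical) diagonal entries, which is the convenience the metric provides.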

#### 2.5.2. Comparison with Numerical Simulations.

Do the analytical bounds derived above match the behavior of the system when it is simulated? We simulated a WTA network as described (with two excitatory units) and systematically tested different combinations of the parameters α, β_{1}, β_{2}. For each simulation, we determined whether all units in the network reach steady state after a sufficient amount of time; networks that did were classified as stable, and all others as unstable (see Figure 5). Here, we vary α and β_{1} while keeping β_{2} = 0.25. While the analytically derived solution is slightly more conservative than necessary, it closely matches the results of the simulations (see Figure 5 for details). The crucial parameter is the excitatory strength relative to the inhibitory strength, as can be seen from the general increase of the permissible value of α as a function of β_{1} (see Figure 5). Note, however, that our analytical solution assigns an upper bound to β_{1} as well, which the numerical simulations do not require for stability. Large values of β_{1} do, however, lead the system to oscillate, and keeping the parameter within the analytically derived range prevents this problem.
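A classification of this kind can be reproduced in a few lines. This is a minimal sketch, not the original simulation code: the rectification nonlinearity, Euler step, input amplitudes, and the divergence bound are our assumptions:

```python
import numpy as np

def simulate_wta(alpha, beta1, beta2, I=(2.0, 1.8),
                 dt=0.01, steps=20000, bound=1e4):
    """Euler-integrate a WTA with two excitatory units (x[0], x[1]) and one
    shared inhibitory unit (x[2]); report whether activity stays bounded."""
    f = lambda u: np.maximum(u, 0.0)      # linear threshold nonlinearity
    x = np.zeros(3)
    for _ in range(steps):
        dx = np.array([
            -x[0] + f(alpha * x[0] - beta1 * x[2] + I[0]),
            -x[1] + f(alpha * x[1] - beta1 * x[2] + I[1]),
            -x[2] + f(beta2 * (x[0] + x[1])),
        ])
        x = x + dt * dx
        if np.abs(x).max() > bound:
            return x, False               # runaway: classified unstable
    return x, True                        # settled: classified stable

print(simulate_wta(1.3, 2.0, 0.25))       # stable; winner amplified to 10
print(simulate_wta(3.0, 2.0, 0.25))       # excitation too strong: unstable
```

Sweeping α and β_{1} over a grid of such calls and recording the stable/unstable flag reproduces a Figure 5-style map.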

### 2.6. Stability of Single WTA: Bump of Activity.

The previous analysis considered WTA networks where only α_{1} = α>0 and α_{2} = 0 (see Figure 2A). In this configuration, the winner of the competition is represented by a single active unit. However, cooperation between neighboring units can also be introduced by setting 0 < α_{2} < α_{1}. The winner is now represented by a more distributed hill of activity. Our analysis can also be extended to this case.

Consider such a network with *l*_{1,2,3} = 1. Using the approach outlined previously, this system is stable if *Θ***J***Θ*^{−1} < **0**. After the *Θ* coordinate transform is applied, examining the eigenvalues of the Hermitian part of this system yields a required condition (plus others, not shown). Comparing this condition to the eigenvalues of the system with α_{2} = 0 (see equation 2.31) reveals that α was replaced by α_{1} + α_{2} (plus some other minor modifications). This result confirms the intuition that the crucial factor is the total excitatory input α_{1} + α_{2} to any one unit. A sufficient condition for this system to be contracting follows (compare to equation 2.13). This condition applies as long as α_{1} + α_{2} < 2 and α_{1} − α_{2} < 1. Together, these conditions permit a fairly wide range of parameters, including α_{1} > 1: for example, if β_{1} = 3, β_{2} = 0.3, and α_{2} = 0.5, then α_{1} < 1.5. Note the critical trade-off between the inhibitory gain β_{1}β_{2} and the excitatory gain α_{1} + α_{2} that is expressed in this section.
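A hill of activity can be illustrated with a small simulation. The sketch below uses a ring of five excitatory units with the parameter values quoted above (β_{1} = 3, β_{2} = 0.3, α_{2} = 0.5, and α_{1} = 1.2, inside the permitted α_{1} < 1.5); the slightly asymmetric input profile is an illustrative assumption that makes the outcome deterministic:

```python
import numpy as np

alpha1, alpha2, beta1, beta2 = 1.2, 0.5, 3.0, 0.3
N = 5
W = alpha1 * np.eye(N)                    # self-excitation
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = alpha2   # neighbour cooperation

I = np.array([0.0, 0.8, 1.0, 0.9, 0.0])  # slightly asymmetric input bump
f = lambda u: np.maximum(u, 0.0)

x, xN, dt = np.zeros(N), 0.0, 0.01        # xN: shared inhibitory unit
for _ in range(20000):
    dx = -x + f(W @ x - beta1 * xN + I)
    dxN = -xN + f(beta2 * x.sum())
    x, xN = x + dt * dx, xN + dt * dxN

print(np.round(x, 3))   # winner (unit 2) plus an active neighbour
```

The winner is now represented by the peak unit together with an active neighbour rather than by a single unit, while all activity remains bounded.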

### 2.7. Stability of Two Bidirectionally Coupled WTAs.

We now show that two WTAs **x** and **y** can be coupled stably (by γ connections as shown above). The key idea is first to give sufficient conditions for stable synchronization of the two WTAs. Note that by *synchronization*, we mean here that two variables have the same value (in contrast to other meanings of *synchronization*, as in population coding). This allows the dimensionality of the stability analysis to be reduced. Indeed, synchronization implies that the overall system stability can then be analyzed simply by considering the stability of the individual target dynamics, that is, any one of the subsystems where the external coupling variables have been replaced by the corresponding (endogenous) variables in the subsystem. For instance, in the target dynamics of **x**, equation 2.3 is replaced by its synchronized counterpart. Next, we shall see that in fact, given the form of coupling we assume, stable synchronization of the subsystems comes “for free.” That is, it is automatically satisfied as long as sufficient conditions for the stability of the individual target dynamics are satisfied.

Here, **J** is the Jacobian of the entire system. We define *synchrony* as equal activity on both maps: *x*_{i} = *y*_{i} for all *i*. This condition is embedded in **V** as shown. Note that the system need not start out with *x*_{i} = *y*_{i}; rather, the condition embedded in **V** guarantees that the system will converge toward this solution. Other conditions of synchrony (such as only some neurons synchronizing) can similarly be specified by modifying **V** accordingly.

**V** specifies a metric that is orthogonal to the linear subspace in which the system synchronizes (i.e., a flow-invariant subspace; see theorem 3 in Pham & Slotine, 2007).

**J** has dimension 2*N* and is composed of the two sub-Jacobians **J**^{1} and **J**^{2} (as shown in equation 2.11), which each describe a single WTA, and of the Jacobians of the couplings. Introducing the coupling term results in the Jacobian **J** of the full system, which can be written, using again the dummy terms *l*_{j}, in expanded form. Evaluating **VJV**^{T} with this **J** yields an expression in terms of **J**^{1}_{sync} and **J**^{2}_{sync}; note that α > 0, β_{1} > 0, and β_{2} > 0.

Conditions for **J**^{1}_{sync} (and similarly **J**^{2}_{sync}) to be negative definite automatically imply that **VJV**^{T} is negative definite, since γ ⩾ 0. In other words, the basic requirement that the individual target dynamics are stable (as shown in the previous section) automatically implies stability of the synchronization mechanism itself.

Note the opposite signs of γ in equations 2.48 and 2.47. Intuitively, these express a key trade-off. Indeed, the stronger γ is, the easier and stronger the synchrony of the memory state (see equation 2.47). However, a stronger connection also makes the system less stable. This is expressed by the positive γ in equation 2.48, which imposes stricter constraints on the permissible values of the other weights for **J**^{1}_{sync} to remain negative definite.

For the persistent activity to be maintained after the input is removed, the input provided by the γ connections must exceed the activation threshold *T* (i.e., *x*_{i}γ > *T*), where *x*_{i} is the steady-state amplitude during the application of input (which is a function of the gain).

### 2.8. Stability of Unidirectionally Coupled WTAs.

Next, we extend our analysis to networks consisting of three WTAs **x**, **y**, and **z** of the kind shown in Figure 2D and described in section 2.2. WTAs **x** and **y** are bidirectionally coupled to express the current state and are equivalent to the network considered in the previous sections. A further WTA **z** is added that contains units *z _{i}*, referred to as transition neurons (TNs). In this example, there are two TNs, *z*_{1} and *z*_{2} (see Figure 2D). Activation of the first (*z*_{1}) leads the network to transition from state *x*_{1} to *x*_{2} if the network is currently in *x*_{1}. Activation of the second (*z*_{2}) leaves the network in state *x*_{2} if the network is currently in this state; if it is not, no activity is triggered. The TN *z*_{1} is an example of a transition from one state to another. TN *z*_{2} is an example of a transition that starts and ends in the same state (a loop). This loop is intentionally introduced here because it poses a limit to stability. TNs receive and project input with weight ϕ > 0.

#### 2.8.1. Case 1: Loop.

For purposes of worst-case analysis, assume that the TN *z*_{2} (which receives from and projects to state 2) is permanently active. This is achieved by setting *T _{TN}* = 0. In this case, we require that the network remain synchronized in state 2 despite the constant activity of *z*_{2}.

Note the similarity to equation 2.47. None of the nonlinearity terms of the third WTA (*l*_{7}, *l*_{8}, *l*_{9}), nor ϕ, appears in this equation. Thus, the synchrony of the states is not influenced by the presence of a consistently active loop TN: ϕ does not influence the synchrony between *x* and *y*, which represent the state. However, the combined system also needs to be contracting to be stable (i.e., to reach steady state). Thus, we next derive the limits on ϕ for this to be the case.

Using the insight gained in section 2.7, we replace the *y*_{i} terms by *x*_{i} terms for purposes of stability analysis. Note that the principle of showing synchronization first introduces a hierarchy (or series) of dynamic systems, so that the overall result converges if each step (the synchronization and the simplified system) does, with the overall convergence rate being the slower of the two. In our case, the synchronization step is always the fastest, so the overall convergence rate is that of the reduced system.

The reduced system thus consists of two coupled WTAs (**x** and **z**). Here, only *z*_{2} (the loop TN) is used; *z*_{1} is not connected. Having the corresponding Jacobian **J**^{TN} be negative definite in the metric *Θ* guarantees that the coupled system is stable. Following Wang and Slotine (2005) and Slotine (2003; see also section 3.4), if the uncoupled systems are stable with contraction rates λ_{x} and λ_{z}, then the coupled system is stable if the coupling ϕ is sufficiently weak, with contraction rate λ_{x,z}. Note that λ_{x,z} > 0 is equivalent to condition 2.59. One then has 0 < λ_{x,z} ⩽ λ_{x} ⩽ λ_{z}. Note that if the connection weights are not symmetric, ϕ in the expressions above can be replaced by the geometric mean of the two coupling weights.

The contraction rate of a system is given by the maximal eigenvalue of **F**_{s} (its real part) (Wang & Slotine, 2005). Following equation 2.10, this determines the contraction rate for a WTA (such as **z**) and, similarly, for a symmetrically coupled system with coupling weight γ. These two rates establish the upper bound on the permissible weight ϕ. Since λ_{x} < λ_{z}, a good approximation follows.

#### 2.8.2. Case 2: Transition.

Here, the transition from one pattern of synchrony to another (two states) is investigated. For this purpose, both states *x*_{1} and *x*_{2} exist. Also, the TN *z*_{1} is connected, since activating *z*_{1} leads to a transition from state 1 to state 2 (represented by *x*_{1} and *x*_{2}). In the following, we assume that the network is in *x*_{1} when initialized and that *z*_{1} is active (i.e., *T _{TN}* = 0). We then show that the network will end up in the second synchrony pattern, representing *x*_{2}.

Using **V** as above and **J** as the appropriate Jacobian of the full system yields a result similar to equation 2.56: the terms of the third WTA do not appear. Thus, activation of the TN does not disturb the synchrony between **x** and **y** as such but only determines which particular units synchronize (this is not visible in the above equation).

Whether the system transitions from one pattern of synchrony to another is determined by the threshold behavior of the activation function. As long as the input to *x*_{2}, ϕ*z*_{1} > 0, is present for a sufficient amount of time, the network will switch its pattern of synchrony. If, on the other hand, *z*_{1} is not active for a sufficiently long time (relative to the contraction rate λ_{x}), the system will return to the previous pattern of synchrony. Also note that *z*_{1} switches off automatically as soon as the transition occurs (since *y*_{1} then ceases to be active). Thus, the timing of the external input does not have to be tightly coupled to the network dynamics and can even be permanently present.

### 2.9. Verification by Simulation.

#### 2.9.1. Single WTA.

The properties of contraction during the application of external input can be demonstrated by numerical simulation of a single WTA network consisting of two excitatory units and one inhibitory unit (see Figure 2B for wiring). Figure 6 shows the dynamics of the three units in the network while external input is applied to the two excitatory units *x*_{1} and *x*_{2}. The input *I*(*t*) is a binary pulse of amplitude 2.0 to *x*_{1} and 1.8 to *x*_{2} (a 10% difference). Note how the network amplifies the small difference (*x*_{1} is the winner). Parameters were α = 1.3, β_{1} = 2, β_{2} = 0.25, *T* = 0, *G* = 1; that is, the gain was *g* = 5 (steady-state amplitude *g***I*). These parameters satisfy all conditions for contraction. The properties of contraction are evaluated at every point in time by evaluating the effective Jacobian at that point. The maximal eigenvalue *Re*(λ_{max}(**F**_{s})) of the generalized Jacobian indicates whether the network is currently contracting. Whenever this value is below zero (dashed line), **F**_{s} is negative definite (see Figure 6E).

The minimal eigenvalue λ_{min}(**VJV**^{T}) (see Figure 6F) indicates points in time when the network is not contracting. Whenever this value is above the dashed line, **VJV**^{T} is positive definite. Note the interplay of the dynamics of the maximal eigenvalue of the generalized Jacobian and the minimal eigenvalue of **VJV**^{T}, which together reflect the dynamic state of the system. Whenever the system is contracting, λ_{max}(**F**_{s}) < 0. Shortly after the onset of the external input (*t* = 1000), no winner has been selected, and the system is not contracting. Instead, it is diverging exponentially toward a contracting region of the state space. Note also the other important transition of the system after the offset of the input (*t* = 6000). After a while, only the inhibitory neuron is active, which explains the change around *t* = 7000.

#### 2.9.2. Two and Three WTA Networks.

Next we simulated two networks: one consisting of two (see Figures 7A and 7B) and one consisting of three (see Figures 7C and 7D) recursively coupled WTAs, connected by either bidirectional or unidirectional connections. Parameters were α = 1.3, β_{1} = 2.8, β_{2} = 0.25, γ = 0.15, *T* = 1. For the second simulation, in addition, *T _{TN}* = 5 and ϕ = 0.3. These simulations (see the legend of Figure 7 for details) illustrate the dynamics of the network during persistent state-dependent activity that exists due to the γ connections. They also show how these states can be used to implement state-dependent transitions (see Figure 2D for connectivity).

Notice how, after application of external input to *x*_{1}, activity persists (see Figure 7A). After applying input to *x*_{2}, the winner switches. Increasing inhibition (onset at *t* = 15,000; see Figure 7B) globally resets both maps. Note also how application of external input to *z*_{1} leads to a transition from the first to the second state only if the network is in state *x*_{1} (case 1 versus 2 as indicated in Figure 7D). This illustrates a state-dependent reaction of the network to external input. The third input application (case 3 in Figure 7D) illustrates the stability of the network even when the transition is onto itself. Note how the network reaches a stable level of activity before the external input to *z*_{2} is removed, indicating balanced excitation and inhibition despite multiple feedback loops both within and between maps.
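A reduced version of the two-WTA experiment (persistence followed by a global reset through increased inhibition, as in Figures 7A and 7B) can be sketched as follows. The input amplitudes and timing, and the assumption that the inhibitory units share the threshold *T*, are ours:

```python
import numpy as np

alpha, beta1, beta2, gamma, T = 1.3, 2.8, 0.25, 0.15, 1.0
f = lambda u: np.maximum(u, 0.0)

def step(x, y, Ix, Iinh, dt=0.01):
    """One Euler step for two coupled WTAs; x, y = (unit1, unit2, inhib)."""
    dx = np.array([
        -x[0] + f(alpha*x[0] + gamma*y[0] - beta1*x[2] + Ix[0] - T),
        -x[1] + f(alpha*x[1] + gamma*y[1] - beta1*x[2] + Ix[1] - T),
        -x[2] + f(beta2*(x[0] + x[1]) + Iinh - T),
    ])
    dy = np.array([
        -y[0] + f(alpha*y[0] + gamma*x[0] - beta1*y[2] - T),
        -y[1] + f(alpha*y[1] + gamma*x[1] - beta1*y[2] - T),
        -y[2] + f(beta2*(y[0] + y[1]) + Iinh - T),
    ])
    return x + dt*dx, y + dt*dy

x, y = np.zeros(3), np.zeros(3)
for _ in range(3000):                      # t = 0..30: input pulse to x1
    x, y = step(x, y, Ix=(2.0, 0.0), Iinh=0.0)
for _ in range(3000):                      # t = 30..60: no input
    x, y = step(x, y, Ix=(0.0, 0.0), Iinh=0.0)
persisted = (x[0], y[0])                   # activity persists on both maps
print("persistent state:", np.round(persisted, 2))
for _ in range(1000):                      # boost inhibition: global reset
    x, y = step(x, y, Ix=(0.0, 0.0), Iinh=3.0)
for _ in range(1000):                      # after reset: all activity gone
    x, y = step(x, y, Ix=(0.0, 0.0), Iinh=0.0)
print("after reset:", np.round([x[0], y[0]], 2))
```

The γ connections keep the winner active on both maps after the input is removed, and the extra drive to the inhibitory units erases the state, mirroring the global reset in Figure 7B.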

#### 2.9.3. Large Networks.

Contraction properties are particularly useful because they are preserved if several contracting systems are combined to form a larger system (Slotine, 2003; Lohmiller & Slotine, 1998). Such systems can be combined in several ways, including parallel combinations, hierarchies, and certain types of feedback to form a new system. The resulting composite system is guaranteed to be contracting as well if the underlying subsystems are contracting (Slotine & Lohmiller, 2001; Slotine, 2003). Note that, in themselves, combinations of stable modules have no reason to be stable (Slotine & Lohmiller, 2001). This is guaranteed only if the constituting modules are contracting. To illustrate that contracting WTA networks can be used to construct larger networks, we next simulated a network consisting of six identical WTAs (see Figure 8). The purpose of this simulation is to demonstrate that satisfying contraction properties at the level of the constituting modules (one WTA) is sufficient to guarantee stability of a large connected network with many (potentially unknown) feedback loops.

Each WTA consists of four excitatory units and one inhibitory unit. The first four WTAs represent states, and the remaining two represent state-dependent transitions. Bidirectional γ connections were generated randomly between the excitatory units of these four WTAs. Unidirectional ϕ connections were placed randomly between units on the last two WTAs and units on the first four WTAs. Parameters were α = 1.3, β_{1} = 3.2, β_{2} = 0.25, γ = 0.15, *T* = 1, and ϕ = 0.3. Inputs were generated randomly, first restricted to only the state-carrying WTAs and later only to the transition-inducing units. One instance of this simulation is shown in Figure 8. Note the following features of the result. First, the network is in a nonlinear hard WTA configuration: except for transients, only one excitatory unit in each WTA is active (see Figure 8D). Second, the reaction to the same external input depends on the state of the network. For example, consider the inputs to WTA 5 in Figure 8C. Input to unit 1 (blue) is provided twice, but only once does this induce a state switch (visible as a change of winners on the first WTAs). Third, the levels of activity are strictly bounded, despite external input and a high gain of the network (see Figure 8D). Similar simulations with other weights (even randomized) are also stable as long as the parameters are within the permitted ranges.

Simulations of networks consisting of 100 and of 1000 WTAs, probabilistically connected as described above, were found to be stable as well. We assume the parameters of the WTA itself are static; the critical concerns are the connections between the WTAs (γ and ϕ). As long as the pairwise connection probability between a given WTA and all other WTAs is sufficiently low, stability is guaranteed, because the sum of all γ_{j} that connect to a particular unit needs to observe the constraint shown in equation 2.50.
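A small version of such a connected network can be sketched as follows. The ring-shaped random coupling pattern is our assumption; it is one simple way to keep the summed γ input per unit within the bound of equation 2.50:

```python
import numpy as np

rng = np.random.default_rng(0)
K, n = 6, 4                       # six WTAs with four excitatory units each
alpha, beta1, beta2, gamma, T = 1.3, 3.2, 0.25, 0.15, 1.0
f = lambda u: np.maximum(u, 0.0)

# Couple the WTAs in a randomly ordered ring with 1-to-1 gamma connections,
# so every unit receives at most 2 * gamma of coupling.
G = np.zeros((K * n, K * n))
order = rng.permutation(K)
for a, b in zip(order, np.roll(order, -1)):
    for j in range(n):
        G[a * n + j, b * n + j] = G[b * n + j, a * n + j] = gamma

I = np.zeros(K * n)
I[rng.choice(K * n, size=K, replace=False)] = 2.0   # random external input

x, h, dt = np.zeros(K * n), np.zeros(K), 0.01       # h: inhibitory units
for k in range(30000):
    if k == 15000:
        I[:] = 0.0                # remove all input halfway through
    dx = -x + f(alpha * x + G @ x - beta1 * np.repeat(h, n) + I - T)
    dh = -h + f(beta2 * x.reshape(K, n).sum(axis=1) - T)
    x, h = x + dt * dx, h + dt * dh

active = (x.reshape(K, n) > 1.0).sum(axis=1)
print("bounded:", bool(x.max() < 100), "active units per WTA:", active)
```

Despite many feedback loops across modules, activity stays bounded and each WTA settles into a hard single-winner configuration, as described for Figure 8.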

## 3. Discussion

Biology is replete with examples in which relatively independent subsystems are coupled together by various degrees of time-varying positive and negative feedback; nevertheless, the entire system is functionally stable. The neocortex, with its billion neurons parcellated into millions of interconnected local circuits, is one striking example of such a system. We have chosen to study this cortical case because the functional stability of the vastly interconnected cortical system is intimately associated with the expression of stable intelligent behavior.

Our contribution toward understanding this intriguing phenomenon of collective stability is that we have identified and properly analyzed a small, functional module (here, a WTA circuit) and shown that this knowledge alone allows us to guarantee stability of the larger system. By *instability*, we here mean anything that leads to runaway behavior, which in biological neurons means that the neurons will latch into saturation and, if active long enough, die (Syntichaki & Tavernarakis, 2003). We do not consider this seizure-like pathological state to offer boundedness in any useful sense. In practice, neurons usually operate at a small fraction of their maximum firing rate, and their networks are stable only because of tight inhibitory-excitatory balance, which the brain pays considerable attention to maintaining. For these reasons, the stability of neuronal networks is better cast as stability in a network of (effectively) positively unbounded linear threshold neurons than as stability in a network of, for example, sigmoidal neurons. The simplified model neurons we used make the problem tractable while preserving the features of real neurons that we believe are crucial for understanding neuronal circuits and their stability. We thus expect our observations to be directly applicable to spiking neurons.

The strength of these results is that they cast light on the problem of how global stability can be achieved in the face of the locally strong feedbacks required for active, nonlinear signal processing. We have examined this problem in networks of WTA circuits because the WTA is a rich computational primitive and because neuroanatomical connectivity data suggest that WTA-like circuits are a strong feature of the networks formed by neurons in the superficial layers of cortex.

In essence, we have employed contraction analysis to demonstrate that a WTA network can at the same time be contracting and strongly amplifying. Our key results are the bounds documented in equations 2.23 to 2.25, 2.50, and 2.61 and illustrated in Figure 3. It is important to note that this analysis could not have been performed with standard linear tools (such as eigenvalues of Jacobians at fixed points), which rely on linearizations and do not provide global stability conditions. While in principle asymptotic convergence might have been demonstrated using a Lyapunov function (Slotine & Li, 1991), no such function is currently known for these kinds of networks. In contrast, using contraction analysis, we could demonstrate exponential (as opposed to asymptotic) convergence for arbitrary initial conditions. In addition to systematic analysis, we have also confirmed our results with simulations of random systems composed of WTAs. The WTA network thus constitutes a strong candidate for a canonical circuit, which can serve as a basis for the bottom-up construction of large biological and artificial computational systems. Our approach is similarly applicable to functional modules other than WTAs, such as liquid state machines with feedback (Maass, Joshi, & Sontag, 2007).

To systematically analyze subsets of active neurons that should be either contracting or not (i.e., permitted winners or not), we utilized what we called *l*_{i} terms. The parcellation of the active subsets of the network using such terms can be regarded as a generalization of the approach of permitted and forbidden sets (the “switching matrix”; Hahnloser et al., 2003). However, that previous approach is suitable only for fully symmetric networks, whereas the proper operation of a WTA network requires asymmetry of the inhibitory connections. Our approach is to show exhaustively, for any possible subset of the *l*_{i} terms, that it is either exponentially diverging or contracting. This way, the network as a whole is guaranteed to converge exponentially to one of the fixed points. The concept we developed can be applied at any point in time during the operation of the network; in particular, it can be applied before the winner is known. Our reasoning indeed guarantees that a winner will be selected exponentially fast.
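The exhaustive parcellation over the *l*_{i} terms can be illustrated for a single WTA with two excitatory units. As a simple proxy, the sketch below classifies each subset by the largest real part of the eigenvalues of the effective Jacobian in the identity metric (the proof in the text instead uses the projections **V** and metrics Θ); the parameter values are assumed:

```python
import numpy as np
from itertools import product

alpha, beta1, beta2 = 1.3, 2.0, 0.25      # assumed example values

def effective_J(l1, l2, l3):
    # Effective Jacobian of a two-excitatory-unit WTA for a given subset of
    # active units (l3 gates the shared inhibitory unit).
    return np.array([[l1*alpha - 1, 0,            -l1*beta1],
                     [0,            l2*alpha - 1, -l2*beta1],
                     [l3*beta2,     l3*beta2,     -1.0]])

for l in product([0, 1], repeat=3):
    rate = np.linalg.eigvals(effective_J(*l)).real.max()
    print(l, "diverging" if rate > 0 else "contracting")
```

Configurations with both excitatory units active come out diverging (forbidden), while single-winner configurations come out contracting (permitted), matching the role the *l*_{i} parcellation plays in the proof.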

Winner-take-all networks are representatives of a broad class of networks, where a number of excitatory units share a common inhibitory signal that serves to enforce competition (Dayan & Abbott, 2001; Amari & Arbib, 1977; Yuille & Geiger, 2003; Hertz et al., 1991; Tank & Hopfield, 1986; Rabinovich et al., 2000; Ermentrout, 1992; Schmidhuber, 1989). There are many instances of networks that share this property, including various neural networks but also gene regulatory networks, in vitro DNA circuits (Kim, Hopfield, & Winfree, 2004), and development. Competition enforced by shared inhibition among excitatory units is a principal feature of brain organization (Kurt et al., 2008; Baca, Marin-Burgin, Wagenaar, & Kristan, 2008; Tomioka et al., 2005; Pouille, Marin-Burgin, Adesnik, Atallah, & Scanziani, 2009; Buzsaki, 1984; Mittmann, Koch, & Häusser, 2005; Gruber, Powell, & O'Donnell, 2009; Sasaki, Matsuki, & Ikegaya, 2007; Papadopoulou, Cassenaer, Nowotny, & Laurent, 2010), and our findings are thus directly applicable for reasoning about such circuits. WTA-type behavior has been experimentally demonstrated in a variety of brain structures and species, including the mammalian hippocampus and cortex (Kurt et al., 2008; Baca et al., 2008; Mittmann et al., 2005; Gruber et al., 2009; Sasaki et al., 2007; Busse, Wade, & Carandini, 2009). Also, the existence of functional WTA circuits has been suggested on the basis of strong anatomical evidence in other structures, in particular six-layered cortex (Binzegger et al., 2004; Tomioka et al., 2005).

Combining several WTA networks permits the implementation of computational operations that either cannot be performed by a single WTA or would require an unrealistically large WTA. In this letter, we show the constraints that need to be imposed on combinations of such WTAs such that the new combined system is guaranteed to be stable. As an illustration of the computational ability of a network consisting of several WTAs, we simulated a network of three WTAs coupled using both symmetric and asymmetric connections (see Figure 7). This network demonstrates two crucial properties that combinations of WTAs permit: persistent activity in the absence of input and state-dependent reaction to external input. While originally designed to demonstrate the stability properties, this network is also a novel generalization of our previous state-dependent processing network (Rutishauser & Douglas, 2009), as well as of the pointer-map approach (Hahnloser et al., 1999). Note that in contrast to the previous work, here all three WTAs are fully homogeneous (identical); the only modifications needed are the appropriate connections between the WTAs. In contrast, our previous networks consisted of one or several WTAs plus specialized units. This new and more generic version makes the networks more stable, as excitatory units can exist only as part of a WTA and thus always receive balanced inhibitory input. As in the original (Rutishauser & Douglas, 2009), this enhanced three-WTA network is also capable of implementing any regular language (a state automaton).

While we demonstrated our approach for WTAs, our approach is sufficiently general to reason systematically about the stability of any network, biological or technological, composed of networks of small modules that express competition through shared inhibition. For example, synthetic DNA circuits can perform computations, self-assemble, and provide a natural way to enforce competition through shared inhibition (Kim et al., 2004; Rothemund, Papadakis, & Winfree, 2004; Adleman, 1994). Both natural and synthetic gene regulatory networks depend on networks of stochastic chemical reactions, resulting in a system of many nested feedback loops. Robustness is thus a crucial issue (Soloveichik, 2009) and the notion of contracting modules to describe complex aggregates might provide crucial insights into such systems. These networks can also implement WTA-like computations (Kim et al., 2004). Self-assembly of synthetic DNA circuits and biological tissue in general relies on de novo bottom-up construction, analogous to our notion of first building small, contracting modules and then composing a system consisting of such modules.

We expect that our results will have immediate application in interpreting the behavior of biological neural networks. For example, most synapses in the nervous system are both unreliable and plastic, and so the postsynaptic effect elicited by a given action potential is uncertain and time varying. Contracting systems and their aggregations remain stable under these constraints, provided the parameters remain within certain well-defined and rather broad ranges. Time-varying changes in both the input and the structure of the network are thus permitted within a broad range without endangering the stability of the system (Lohmiller & Slotine, 1998). This is particularly important if the system is modifying itself through processes such as developmental growth, synaptic plasticity, or adult neurogenesis.

A key question in neuroscience is how a large system such as the brain can rapidly modify itself to adapt to new conditions. A possible solution that our results suggest is that some parts of the network (the modules, here WTAs) could be prespecified, whereas the connections between the modules are learned or modified. This would greatly reduce the required amount of learning or developmental growth processes. A key question is how rules of plasticity can be utilized to enable such learning. It is also an open question whether and how a given WTA can be systematically decomposed into a combination of smaller WTAs that still perform the same function. This question is crucial because the realistic size of a given WTA is restricted by the size of the projection field of recurrent inhibition. Our results provide a framework for investigation of these important questions.

## Appendix A: Constraints for Persistent Activity

Persistent activity requires *x*_{i} > 0 for the unit *i* that represents the last winner. At this point, some necessary constraints on the weights can be derived. Since *x*_{i} > 0, unit *i* is fully linear, and its activity is described by its steady-state equation. Thus, *x*_{i}(α − *G*) = β_{1}*x*_{N} + *T*_{i}. It follows that α > *G* for *x*_{i} to be positive. Intuitively, α − *G* represents the effective recurrent input the unit receives after accounting for the load (which causes it to decay exponentially to zero in the absence of input). This effective recurrent input needs to be strong enough to account for the negative inhibitory input, as well as the constant current subtracted by the threshold. Also, in the two-map case, γ can be incorporated in this argument as α + γ > *G* (this follows from the synchrony; see below). Experimental measurements in neocortex indicate that α > *G* (Douglas et al., 1995). The open-loop gain of such systems is >1 (sometimes substantially so). Thus, these networks are stable only because of balanced inhibition. The requirement to operate in the α > *G* regime poses strict requirements for stability.
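The steady-state relation can be checked numerically. The sketch below uses the Figure 7 weights and assumes (our assumption) that the inhibitory unit shares the threshold *T* and is driven only by the winner:

```python
alpha, beta1, beta2, T, G = 1.3, 2.8, 0.25, 1.0, 1.0   # Figure 7 weights

# Substitute x_N = beta2 * x_i - T into x_i * (alpha - G) = beta1 * x_N + T
# and solve for the persistent amplitude x_i.
xi = (beta1 - 1.0) * T / ((1.0 - alpha) + beta1 * beta2)
xN = beta2 * xi - T
print("x_i =", xi, " x_N =", xN)          # both positive since alpha > G
```

With α ⩽ *G*, the same substitution yields a nonpositive *x*_{i}, which is the necessity argument in the text.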

## Appendix B: Notes on Approximation of the Activation Function

To motivate the *l*_{j} terms, consider the smooth nonlinearity *f*(*x*) = log(*a* + exp(*b*(*x* + *c*))) with *a* = 1, *b* = 1, and *c* = 0. Its derivative is the logistic sigmoid: for large *x*, it is approximately equal to 1, and for *x* < 0, it is approximately equal to 0. An approximate solution to the derivative is thus either 0 or 1. Indeed, the sufficient stability conditions provided by the following analysis are unchanged if the nonlinearity is replaced by a general sigmoid with slope in the interval [0, 1], corresponding to the dummy terms taking any time-varying value between 0 and 1. In optimization, a frequently used approximation is the integral of the sigmoid function (Chen & Mangasarian, 1995).

Here, *a* > 0 is a constant. The derivatives of this function are bounded between 0 and 1, and it can thus be used as an approximation. The approach we use in this letter is similarly valid for this and other smooth approximations (as long as the derivative is bounded ⩽ 1).

## Appendix C: Time Constants

Inhibitory neurons have time constants τ_{I}, which are larger than the excitatory time constants τ_{E}. Taking possibly different time constants into account, the bounds in equations 2.23 to 2.25 take a modified form in which the key variable is the ratio of the two time constants. If τ_{E} = τ_{I}, the bounds reduce to equations 2.23 to 2.25.

## Acknowledgments

We acknowledge funding by the California Institute of Technology, the Massachusetts Institute of Technology, and the EU DAISY (FP6-2005-015803) and SECO (FP7-2009-216593) grants. We thank Emre Neftci for discussion and the anonymous reviewers for useful suggestions.

## Notes

^{1}These solutions are derived by considering the eigenvalues of the Hermitian part of equation 2.10, which is diagonal and real, and then solving the system of inequalities λ_{max} < 0.