## Abstract

An implementation of attentional bias is presented for a network model that couples excitatory and inhibitory oscillatory units in a manner that is inspired by the mechanisms that generate cortical gamma oscillations. Attentional biases are implemented as oscillatory coherences between the excitatory units that encode the spatial location or features of the target and the pool of inhibitory units. This form of attentional bias is motivated by neurophysiological findings that relate selective attention to spike field coherence. By also including pattern recognition mechanisms, we demonstrate how this implementation of attentional bias leads to selection of an attentional target while suppressing distracters for cases of spatial and feature-based attention. With respect to neurophysiological observations, we argue that the recently found positive correlation between high firing rates and strong gamma locking with attention (Vinck, Womelsdorf, Buffalo, Desimone, & Fries, 2013) may point to an essential mechanism of the brain’s attentional selection and suppression processes.

## 1 Introduction

Attending to an object in an attentional focus requires the selection of this target and the suppression of distracters. Here, building on a recently proposed oscillatory neural network model (Burwick, 2007, 2008b, 2011), we discuss an implementation of such selection and suppression that is based on oscillatory coherence. As an inspiration for this implementation, we use neurophysiological observations of neural substrates of attentional focusing (Fries, Reynolds, Rorie, & Desimone, 2001; Bichot, Rossi, & Desimone, 2005; Womelsdorf, Fries, Mitra, & Desimone, 2006; see also the review in Womelsdorf & Fries, 2011). In particular, we emphasize the relation of our model to the neurophysiological observations recently reported by Vinck, Womelsdorf, Buffalo, Desimone, and Fries (2013).

The motivation for our approach is twofold. On one hand, we want to take steps toward incorporating recent neurophysiological insights into a network model that allows for information processing, in particular, pattern recognition. This should pave the way for more advanced applications based on brain-inspired mechanisms. On the other hand, the study of the biological mechanisms in the context of information processing may also shed light on the functional relevance of the observed biological processes.

We consider an oscillatory neural network model that consists of coupled excitatory and inhibitory oscillatory units and stores patterns in a Hebbian manner (Burwick, 2011). Attentional selections should result in selecting patterns. The couplings among excitatory units establish a competition of coherence among the patterns (Burwick, 2007, 2008a, 2008b, 2008d). This competition is used as core ingredient to establish the selection and suppression processes. We find that implementing the attentional bias through coherence between excitatory activity in the attentional focus and inhibitory minima of the pool of the inhibitory units leads to a selection of the pattern in this focus and a suppression of the distracting patterns.

In section 2, we refer to earlier approaches of linking attentional phenomena to oscillatory models. In section 3, we present our oscillatory network model with its implementation of the attentional bias. In section 4, we use examples to illustrate the model’s dynamical content with respect to attentional selection of the target and suppression of the distracter. This is done for both spatial and feature-based attention. In section 5, we comment on the relation to the neurophysiological findings that inspired the model. In particular, we discuss an analogy with recent neurophysiological results reported in Vinck et al. (2013). In section 6, we compare our model to some other models. Section 7 contains the summary and an outlook on further issues that deserve to be studied.

## 2 Earlier Oscillatory Implementations of Attention

The network model presented in Burwick (2011) is built on coupled phase-model oscillators with dynamical amplitudes. It thus resembles earlier phase-model-based approaches to attentional modeling like the ones by Borisyuk and Kazanovich (see, e.g., Borisyuk & Kazanovich, 2004). Due to using simple phase-model oscillators, such approaches do not have an explicit representation of spiking neurons. However, by going beyond the earlier approach of Borisyuk and Kazanovich, the model we use here allows defining a notion of coherence that has some analogy with the spike field coherences observed in the neurophysiological context (the meaning of spike field coherence is described in section 3.3). In fact, the model discussed in Burwick (2011) was inspired by the understanding of PING (pyramidal-interneuron-gamma) mechanisms as discussed, for example, in Whittington, Traub, Kopell, Ermentrout, and Buhl (2000) and Börgers & Kopell (2005). Therefore, it is particularly suited to study the functionality of gamma frequency oscillations.

Following the proposal of temporal coding binding mechanisms from a theoretical perspective by von der Malsburg (1981) and the first experimental hints in this direction (Eckhorn et al., 1988; Gray & Singer, 1989), and due to the closeness of binding and attention (for a historical account, see Wolfe & Robertson, 2012), attention was first related to oscillatory models by Niebur and coworkers (Niebur, Koch, & Rosin, 1993; Niebur & Koch, 1994). This early approach, however, could not be built on the understanding of gamma oscillation mechanisms that was reached only in the following years. In particular, the PING (pyramidal-interneuron-gamma) mechanism of gamma rhythms (Whittington et al., 2000) that motivated several other models and ours was not known when this early approach was formulated.

Apart from the work of Borisyuk and Kazanovich, little effort was made to use the relative simplicity of phase-model-based networks to gain a deeper understanding of the oscillatory mechanisms behind the attentional phenomena. Instead, most models were based on coupling spiking neurons in order to keep biological realism close to the neurophysiological setting (Tiesinga, Fellous, Salinas, José, & Sejnowski, 2004; Mishra, Fellous, & Sejnowski, 2006; Buia & Tiesinga, 2006; Tiesinga, Fellous, & Sejnowski, 2008; Zeitler, Fries, & Gielen, 2008; Tiesinga & Sejnowski, 2010; Börgers & Kopell, 2005, 2008; Buehlmann & Deco, 2008, 2010; Ardid, Wang, Gomez-Cabrero, & Compte, 2010; Akam & Kullmann, 2010). (See also Tsotsos, 2011, for a comprehensive view on the various other computational approaches to study attention.)

Most of these earlier network-based works do not relate the attentional selection to recognition processes. In contrast, as we demonstrate with the examples in section 4, the combination of attentional bias and recognition mechanisms is an essential aspect in the following. In fact, it is the simplicity of the phase-model approach that allows approaching oscillatory modeling while at the same time implementing pattern recognition capabilities.

## 3 Oscillatory Neural Network Model with Coherence-Based Implementation of Attentional Bias

### 3.1 Network Model with Excitatory and Inhibitory Oscillatory Units

In this section, we present the model that we use for simulations in the next section. The model coincides with the PING-inspired network model described in Burwick (2011) except for two modifications. The first of the modifications concerns the couplings from excitatory to inhibitory units. While these were local in Burwick (2011), we now allow these to include intercolumnar couplings (this modification shows up in equations 3.4c and 3.4d given below). The second modification concerns the inclusion of attentional bias described in section 3.4.2. Notice that this model is an inhibitory extension of an earlier model that coupled only excitatory units (Burwick, 2007, 2008a, 2008b, 2008c, 2008d). Therefore, the latter references may also be consulted for additional explanations.

The model consists of *N* columns labeled with . Each column *n* is composed of one excitatory and one inhibitory unit, with coordinates given by amplitude *E_{n}* and phase for the excitatory unit and amplitude *I_{n}* and phase for the inhibitory unit. (Evidently a physiological interpretation would require identifying “units” with groups of biological neurons inside the columns; see also the related remarks in section 7.)

Via a function *g*, the amplitudes *E_{n}* and *I_{n}* are related to coordinates and through . This implies and ; () is interpreted as describing an on-state (off-state) of the excitatory unit, while () corresponds to the on-state (off-state) of the inhibitory unit. The dynamics of the network is then given by (Burwick, 2011): . Here, , where and are timescales. The *e_{n}* describe external inputs, the are intrinsic frequencies, and the are shear parameters of the excitatory units. The corresponding parameters for the inhibitory units are , , and .
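Since equations 3.2 are not reproduced in this excerpt, the following Python sketch illustrates only the general structure of one column: an excitatory and an inhibitory unit, each with an amplitude and a phase, with the excitatory unit driving the inhibitory one and receiving inhibitory feedback in return (PING-like). All functional forms and parameter names (`g`, `tau_E`, `kappa`, `zeta`, and so on) are our assumptions, not the model's actual terms.

```python
import numpy as np

def column_step(state, params, dt=1e-3):
    """One Euler step for a single column's excitatory/inhibitory pair.

    This is an illustrative amplitude-phase form, not the paper's
    equations 3.2: each unit has an activation driven through a
    sigmoid g and a phase advancing at an intrinsic frequency; the
    excitatory unit drives the inhibitory one (PING-like), and the
    inhibitory unit feeds back negatively on the excitatory one.
    """
    E, phiE, I, phiI = state
    g = lambda x: 1.0 / (1.0 + np.exp(-x))  # sigmoidal gain, range (0, 1)
    dE = (-E + g(params['e_ext'] - params['kappa'] * I)) / params['tau_E']
    dI = (-I + g(params['zeta'] * E)) / params['tau_I']  # E drives I
    dphiE = params['omega_E']  # intrinsic frequency of excitatory unit
    dphiI = params['omega_I']  # intrinsic frequency of inhibitory unit
    return np.array([E + dt * dE, phiE + dt * dphiE,
                     I + dt * dI, phiI + dt * dphiI])
```

Because the gain stays in (0, 1), the amplitudes remain bounded between the off-state and the on-state, mirroring the role of the on/off interpretation described above.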

The reason for modeling the two units of column *n* with two separate phases ( and ) and amplitudes (*E_{n}* and *I_{n}*) is to allow for different phases and activations of the excitatory and inhibitory units. Consider, as a motivating neurophysiological example, the gamma oscillations in the hippocampus and their modulation through theta rhythms. It is “generally believed that the theta rhythmic activity of hippocampal cells is entrained by rhythmic inhibition of inhibitory interneurons” (Tsodyks, Skaggs, Sejnowski, & McNaughton, 1997). Correspondingly, realistic modeling requires suppressing the activity of the inhibitory units while leaving the excitatory units unsuppressed. Evidently, such a separate activation of excitatory and inhibitory units is more naturally modeled with separate oscillators for the two classes of units, as it may then be realized, for example, through different external inputs *e_{n}* and *i_{n}* to the excitatory and inhibitory unit in equations 3.2a and 3.2c, respectively (see also the additional remarks regarding cross-frequency couplings in section 7).

### 3.2 Coupling Terms

The couplings *h_{nm}* are modulated by attentional bias parameters and external inputs *e_{n}*. This will be made explicit in section 3.4.1. The are coupling strengths. The -dependent terms in equation 3.4a are the classical Cohen-Grossberg terms of continuous neural networks (Cohen & Grossberg, 1983; Hopfield, 1984), while the -dependent terms in equations 3.4a and 3.4b describe the synchronizing and accelerating terms, with strengths and , respectively, that were introduced in Burwick (2007) and discussed in Burwick (2008a–d). More will be said on these terms in section 3.4.2. The terms with parameters and describe the effects of the inhibitory units. See Figure 1 for an illustration of the couplings. In section 3.4.2, we explain how this implementation of attentional bias is related to coherence.

An essential ingredient of biological gamma oscillations is the driving of inhibitory units through excitatory units. It constitutes one of the defining properties of the so-called PING rhythm (see, e.g., the discussion by Börgers & Kopell, 2005). The corresponding terms in the context of the above oscillatory neural network model are the -dependent terms in equations 3.4c and 3.4d (see Burwick, 2011, for further discussion of the relation between equations 3.4 and the other PING-defining properties).

With respect to the -dependent terms, the acceleration phase parameter obeys and is of crucial relevance (Burwick, 2007, 2008b). With a Hebbian storage of patterns (see section 3.4.2), a nonvanishing value of this parameter introduces the competition for coherence among these patterns that is essential for the following.

### 3.3 Inspiration from Neurophysiological Findings

Several neurophysiological findings since the turn of the millennium have identified spike field coherence as a neural substrate of the attentional focus (see the review in Womelsdorf & Fries, 2011). The spike field coherence is a measure of the synchronization between spikes and the oscillations of the local field potential (LFP) as a function of frequency (Fries, Roelfsema, Engel, König, & Singer, 1997). Recordings from neurons in macaque monkey visual area V4 and studies of covert attention with two stimuli, one inside and the other outside the receptive field (RF) of the recorded V4 neurons, indicated that attention to the stimulus in the RF increased the spike field coherence between the recorded spiking and the LFP in the gamma range (30–90 Hz) (Fries et al., 2001; Womelsdorf et al., 2006; Vinck et al., 2013). While earlier observations like the ones reported by Moran and Desimone (1985) did not consider oscillations and, correspondingly, did not find clear evidence for attentional effects with only one stimulus in the RF, it was the landmark next step reported in Fries et al. (2001) to identify such an effect as an increase in spike field coherence. While Fries et al. (2001) observed this effect for spatial attention, the experiments described in Bichot et al. (2005) extended these results to feature-based attention.

Motivated by these neurophysiological findings, we extend the model of equations 3.2 and 3.4 in the following section with an attentional mechanism that is based on oscillatory coherence. The relation of the resulting model to the neurophysiological observations is discussed with more detail in section 5.

### 3.4 Incorporation of Coherence-Based Attentional Bias

#### 3.4.1 Attentional Bias

We introduce attentional bias parameters , . In the following, these are used to define the attentional focus. For simplicity, we set , where if unit *n* is in the attentional focus and otherwise. (In section 4, we will give examples for choices of for two cases of spatial attention and one case of feature-based attention.)

We consider *P* patterns, , with , . These patterns are stored as connections *h_{nm}* between the excitatory units in a Hebbian manner: where . The are pattern weights (the ones used, for example, in Burwick, 2008b, 2008d), the are attention-dependent modulations that will be made explicit in the following, and parameterizes the strength of the modulations.

A pattern *p* that shares more on-state units with the input *e_{n}* at the sites of the attentional focus, specified through , will reach a higher weight and thereby a competitive advantage (in the competition specified in the next paragraph). In the context of our examples in section 4, we use , where and its relation to the inputs *e_{n}* is explained in section 4.2.
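As a rough illustration of this Hebbian storage with attention-dependent pattern weights, the following sketch builds couplings from binary patterns. The precise modulation formula is elided in this excerpt, so weighting each pattern by the fraction of its on-state units that lie inside the attentional focus is an assumption, as are the names `bias` and `gamma`.

```python
import numpy as np

def hebbian_couplings(patterns, bias, gamma=0.5):
    """Hebbian connections among excitatory units with attentional modulation.

    patterns: (P, N) array of binary pattern vectors.
    bias: (N,) attentional bias (1 inside the focus, 0 outside).
    gamma: strength of the attention-dependent modulation (assumption).

    Each pattern's weight grows with the fraction of its on-state units
    inside the attentional focus; the couplings are the weighted sum of
    the patterns' outer products, normalized by the number of units N.
    """
    P, N = patterns.shape
    on_counts = np.maximum(patterns.sum(axis=1), 1)
    focus_overlap = patterns @ bias            # on-state units inside the focus
    w = 1.0 + gamma * focus_overlap / on_counts  # attention-modulated weights
    h = (patterns * w[:, None]).T @ patterns / N
    return h, w
```

A pattern with more on-state units in the focus receives a larger weight, which is exactly the competitive advantage described above.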

#### 3.4.2 Selection Through Competition for Coherence

The synchronization terms on the right-hand side of equation 3.8 introduce a tendency of each pattern *p* with to synchronize the phase with the pattern phase . With respect to the synchronization behavior, it is important to understand the role of the acceleration phase . With , one obtains , and the synchronization terms imply global coherence, that is, all phases approach the same value and, correspondingly, all pattern phases . It should then be obvious that any pattern-discriminating role of the phases is lost in this case as all oscillators synchronize to the same phase and the phases are no longer a marker of different patterns, a situation referred to as “superposition catastrophe” in the context of the binding problem (see the review in Burwick, 2014).

Consider a pattern *q* and assume that it is coherent, , with , . Then , where as . The dynamics arising from the interplay of the synchronizing and accelerating terms (and also the motivation for the labels “frequency splitting” and “pure pattern frequency” in equations 3.8 and 3.10, respectively) may then be understood as follows.

The competition for coherence is among the patterns *p* that introduce the tendency to synchronize each phase with its pattern phase if . This is a consequence of the terms in equation 3.8. Due to the coefficients of these terms in equation 3.8, the competitive advantage is larger for patterns with higher pattern weight and higher activation . Moreover, the competitive advantage increases with the pattern coherence *C_{p}* itself, implying a winner-take-all behavior. It is the nonvanishing acceleration phase and that may prevent global coherence: as may be seen from equation 3.10, the different patterns tend to synchronize at different phase velocities. If a pattern *q* were to win the competition, , and the other patterns were to turn completely decoherent, , then the phase velocities of the units of this pattern would be driven by the term that is indicated as pure pattern frequency in equation 3.10. These frequencies may differ for the different patterns; therefore, the different patterns tend to synchronize at different frequencies, and a competition for coherence may arise instead of global coherence.

Although complete decoherence of the nonwinning patterns is not the general case, the tendency of the different patterns to synchronize with different phase velocities, referred to as frequency splitting in Burwick (2008b) (by using this denomination in equation 3.8, we indicate that it originates from the definition of ), is central also for the nonidealized situation. In the context of our examples in section 4, it will show up in Figure 7; more will be said on this when discussing the examples.
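The interplay of synchronization and acceleration can be caricatured with a toy Kuramoto-style simulation (this is a sketch of the mechanism, not the paper's equations 3.8 and 3.10, and all numerical parameters below are our own choices): two groups of oscillators share a base frequency, each group synchronizes internally, and an acceleration term advances each group's frequency in proportion to its own coherence, with a group-dependent strength.

```python
import numpy as np

def simulate(beta, T=2000, dt=1e-3, seed=0):
    """Toy illustration of frequency splitting through acceleration.

    Two groups of 10 phase oscillators each synchronize toward their
    group's mean phase; an acceleration term advances each group's
    frequency in proportion to its coherence, with strength beta[p].
    Returns the mean frequency (Hz) of one unit per group, measured
    over the second half of the run.
    """
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0, 2 * np.pi, 20)
    group = np.repeat([0, 1], 10)
    freqs = np.zeros(2)
    for t in range(T):
        old = theta.copy()
        for p in (0, 1):
            idx = group == p
            z = np.exp(1j * theta[idx]).mean()
            C, Theta = np.abs(z), np.angle(z)   # coherence and pattern phase
            theta[idx] += dt * (2 * np.pi * 40                 # base frequency
                                + 10.0 * C * np.sin(Theta - theta[idx])  # sync
                                + beta[p] * C)                 # acceleration
        if t >= T // 2:
            freqs += (theta[[0, 10]] - old[[0, 10]]) / dt / (T - T // 2)
    return freqs / (2 * np.pi)
```

With vanishing acceleration both groups settle at the common base frequency; with different acceleration strengths the groups lock at different frequencies, which is the frequency splitting that underlies the competition for coherence.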

In section 3.4.1, we described how the choice of the attentional focus modulates the couplings. The dominance of a pattern in the attentional focus is thereby translated into a competitive advantage of this pattern in the competition for coherence.

#### 3.4.3 Suppression Through Decoherence with Inhibitory Pool

Beyond the selection process that lets a pattern win the competition for coherence given some input and attentional focus, we also need a mechanism that suppresses the nonwinning, that is, the distracting patterns. Such a selective inhibition that suppresses the distracting activity but spares the target of attention is provided by the dynamics of equations 3.2–3.6.

Consider now the situation that a particular attentional focus is specified through the bias parameters . If a pattern is dominating the input at the attentional focus, this may then lead to this pattern winning the competition for coherence. The input *e* and the bias affect the competition as specified in section 3.4.1. As a consequence, the pattern tends to oscillate at a characteristic (pure pattern) frequency (see section 3.4.2). This implies that the pattern in the attentional focus imprints its characteristic rhythm on the inhibitory pool. Through the recurrent effect of the inhibitory pool, as made explicit with equations 3.11, only those patterns survive that can cope with this inhibitory rhythm; that is, the winning pattern shares its rhythm with the inhibitory pool and may therefore be active at times of inhibitory minima. In contrast, the distracters do not share the rhythm, their activity cannot be coherent with the inhibitory minima, and the distracting patterns are therefore subject to the suppressing effect of the inhibitory pool.
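The suppression logic can be caricatured in a few lines. Equations 3.11 and 3.2 to 3.6 are not reproduced in this excerpt, so the sinusoidal pool rhythm, the coupling strength `kappa`, and the duty cycle below are all assumptions: a unit that is active only around the inhibitory minima keeps a positive net drive, while a unit whose activity drifts relative to the pool rhythm averages the full inhibition and is suppressed.

```python
import numpy as np

def effective_drive(locked, e=1.0, kappa=3.0, cycles=40, n=4000, seed=1):
    """Mean net drive to an excitatory unit facing a rhythmic inhibitory pool.

    A caricature of the suppression mechanism: the inhibitory pool
    oscillates sinusoidally; a 'locked' unit is active only in windows
    around the inhibitory minima, while a drifting unit is active at
    random phases with the same duty cycle.
    """
    t = np.linspace(0, 1, n, endpoint=False)
    inh = 0.5 * (1.0 + np.cos(2 * np.pi * cycles * t))  # pool rhythm, minima at 0
    phase = (2 * np.pi * cycles * t) % (2 * np.pi)
    if locked:
        active = np.abs(phase - np.pi) < np.pi / 4      # windows at the minima
    else:
        rng = np.random.default_rng(seed)
        active = rng.random(n) < 0.25                   # same duty cycle, drifting
    return float((e - kappa * inh)[active].mean())
```

The locked unit escapes most of the inhibition; the drifting unit receives the time-averaged inhibition and its net drive turns negative, which is the suppression of the distracter.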

## 4 Examples: Target Selection and Distracter Suppression Through Choice of Attentional Focus

The mechanisms described in section 3.4 are now demonstrated with examples. We consider a network with stored patterns, where the patterns are obtained from images that are described in terms of Gabor wavelet responses (Lades et al., 1993; Lee, 1996). We use an input to the network that is constructed as the superposition of two of the patterns. We then demonstrate that different choices of attentional focus lead to different selections of the winning pattern. Two cases of spatial attention are considered and one case of feature-based attention.

### 4.1 Stored Patterns

The use of equation 3.5 requires the definition of *P* patterns , , . Following the earlier work reported in Burwick (2010, 2011), we use an example with patterns, where the components of each pattern are the thresholded magnitudes of Gabor wavelet responses of one of *P* images (see Figure 2 for the present choice of images). Starting from these gray-value images, Gabor wavelet responses are taken at lattice sites. At each site, the Gabor responses for scales and directions are computed, constituting a set of components at each lattice site referred to as “jet” (an analog of the neurophysiological hypercolumn) (Lades et al., 1993). Each pattern therefore consists of jets, giving a total of units. The thresholding implies . (For more details on how this encoding of the images is achieved see Burwick, 2011, section 6.1.) As an example for a pattern thus obtained, an illustration of the pattern , corresponding to the lyre image, is shown in Figure 3A.

In order to illustrate essential aspects of the dynamics in a more compact way, we use the “jet maps” (Burwick, 2011, section 6.1) to display the distribution of the high-dimensional pattern vectors , input vector *e_{n}*, and activities *E_{n}* (see Figure 3B for the jet map of the pattern displayed in Figure 3A). For each of the pixels, the jet map represents the sum over the jet components and displays the magnitude of these sums through a gray-valued image, where the normalization is such that black encodes the maximal value of the sums.

The number *N* of nonzero components differs among the patterns. For instance, pattern , the bonsai tree, has 2892 nonzero components, whereas pattern , the billiard table, has 1261 nonzero components. Pairs of distinct *p* and *q* give the overlap of nonzero vector components between patterns *p* and *q*. For example, patterns and , the bonsai and the lyre, have 623 overlapping nonzero components. We emphasize these overlaps since the competition for coherence is particularly suited to deal with overlapping patterns (see Burwick, 2008b).
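A jet map in the above sense is a simple reduction of the jet-structured vector. The sketch below assumes a 10 × 10 lattice with 5 scales × 8 orientations per jet; the actual lattice and jet dimensions are elided in this excerpt, so these numbers are placeholders.

```python
import numpy as np

def jet_map(pattern, grid=(10, 10), scales=5, orientations=8):
    """Collapse a jet-structured pattern vector into a gray-value map.

    Assumes the vector is organized as one 'jet' of scales*orientations
    Gabor magnitudes per lattice site (section 4.1); sums the jet
    components at each site and normalizes so that 1.0 corresponds to
    the maximal sum (rendered as black in the figures).
    """
    jets = pattern.reshape(grid[0] * grid[1], scales * orientations)
    sums = jets.sum(axis=1).reshape(grid)
    peak = sums.max()
    return sums / peak if peak > 0 else sums
```

The same reduction applies to the input vector and the activities, which is what makes the jet maps a uniform way of displaying patterns, inputs, and network states.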

Having defined the patterns above, we complete the definition of the Hebbian weights in equation 3.5 by choosing . This choice implies in equation 3.8, so that it is the percentage of activation that determines each pattern’s advantage in the competition for coherence. The other parameter choices in equations 3.4 and 3.6 are given in appendix B.

### 4.2 External Input

Our choice of external input, illustrated in Figure 4, is obtained by combining two of the stored images; one image, the lyre (), partially covers the other image, the postbox (). We chose the lyre for two reasons. First, the gaps between the strings imply a mixing of the two images that makes the separation of the patterns particularly difficult. Second, the strings of the lyre have a well-defined orientation. It is then a natural challenge for the mechanism to select the lyre if the feature-based attentional focus is directed to the orientation of the strings (our example case 3 in section 4.3.3).

As with the stored patterns, the input image is represented as an *N*-dimensional input vector constructed from the (thresholded) magnitudes of Gabor wavelet responses (see Figure 4B for the corresponding jet map of *J*). The actual input that each excitatory unit receives is then given by with global suppression factor .

Because the dynamics of the inhibitory units is taken to be driven by the excitatory units (an essential ingredient of the PING mechanism that describes cortical gamma oscillations; see, e.g., Börgers & Kopell, 2005), the external input to all inhibitory units *n* is set to vanish, . (See also the corresponding remark in section 7 regarding possible implementations of cross-frequency couplings.)

### 4.3 Three Choices of Attentional Focus

We now demonstrate the dynamics of the network of equations 3.2, 3.4, 3.5, and 3.6 and the patterns, parameters, and initial values specified in section 4.1 and appendix B. The input is chosen as described in section 4.2. What remains to be chosen is the attentional focus, that is, the attentional bias parameters . The following three examples differ only in this choice of the attentional focus. We give two examples of spatial attention and one example of feature-based attention. The findings show that the pattern that is dominating in the attentional focus is selected through the resulting dynamics, while the distracting pattern’s activity is suppressed (i.e., the activity of its units that do not participate in the selected pattern). This demonstrates the pattern-selecting and distracter-suppressing coherence-based dynamics described in section 3.4.

#### 4.3.1 Case 1: Spatial Attention with Focus on Postbox

Our first choice of attentional window is the rectangular light area shown in Figure 5A. For the units *n* that make up this region, we choose the attentional bias , while the biases at other sites are set to zero, .

Note that we begin with , that is, without inhibition of excitatory units. The inhibitory effect is then switched on gradually. Between and , the inhibition is increased linearly until it reaches its maximal value specified in appendix B (see Figure 7A for this time course). This allows the system to evolve from the initial values described in appendix B before inhibition steps in.

Inside the attentional window shown in Figure 5A, the postbox pattern is clearly dominating. Correspondingly, we find that the dynamics selects the postbox pattern and suppresses the distracter (see Figure 6A for some snapshots and Figures 7B to 7F for the dynamics). The sites of suppression are compared to the image of the winning pattern in Figures 5B and 5C. In the following, we discuss how the mechanisms described in section 3.4 imply the observed dynamics.

Initially, for all *n* and for all *p*. The initial competitive advantage of each pattern is then given by . For case 1, we have . Due to for , one may expect that the postbox pattern wins the competition for coherence. Indeed, this is found in panels B and C of Figure 7 for .
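The initial advantage can be sketched as follows. Since the exact formula is elided in this excerpt, combining the percentage of activation (section 4.1) with an attention-modulated pattern weight, as well as the names `input_on`, `bias`, and `gamma`, is our assumption.

```python
import numpy as np

def initial_advantage(patterns, input_on, bias, gamma=0.5):
    """Each pattern's starting advantage in the competition for coherence.

    patterns: (P, N) binary pattern vectors.
    input_on: (N,) indicator of units receiving nonvanishing input.
    bias: (N,) attentional bias (1 inside the focus, 0 outside).

    Taken here as the fraction of a pattern's on-state units that
    receive input (its percentage of activation), multiplied by an
    attention-modulated pattern weight.
    """
    N_p = np.maximum(patterns.sum(axis=1), 1)
    activation = (patterns * input_on).sum(axis=1) / N_p
    weight = 1.0 + gamma * (patterns @ bias) / N_p
    return weight * activation
```

A pattern whose on-state units dominate inside the focus starts with the larger advantage, matching the expectation that the postbox pattern wins in case 1.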

For discussing the suppression mechanism, we refer to the overlap of the two patterns and the nonoverlapping parts as three patches: the overlap patch consists of the units *n* with , and one nonoverlap patch consists of units *n* with while the other one is made of the units with (see appendix A for the definition of the corresponding collective quantities). With respect to the suppression of the distracter, the mechanism described in section 3.4.3 implies that the excitatory input to the inhibitory pool is dominated by the sites of the dominating pattern in the attentional focus. Therefore, the rhythm of the inhibitory pool is most similar to the rhythm of this pattern (see the dotted line in Figures 8A–C for the rhythm of the inhibitory pool in comparison with the solid lines for rhythms of the three patches).

Figure 8 illustrates the essence of the proposed selection and suppression mechanism. In case 1, since the selected postbox pattern is coherent (due to winning the competition for coherence) and due to the mentioned driving of the inhibitory pool through the attentional window, the two patches that constitute the postbox pattern are most coherent with the inhibitory pool and may therefore be active at a phase relation with minimal inhibition (see Figure 8A and 8C). (According to equations 3.11, minimal inhibition for an excitatory unit *n* is given by .) Therefore, these patches remain active even after inhibition is turned on (see Figures 7D and 7F). In contrast, the nonoverlapping part of the nonselected lyre pattern is not entirely coherent with the inhibitory pool (see Figure 8B). In consequence, the activity of this patch is suppressed when inhibition steps in (see Figure 7E).

#### 4.3.2 Case 2: Spatial Attention with Focus on Lyre

The discussion of the dynamics parallels the discussion of case 1 except for interchanged roles of the two patterns (see Figures 7A’–F’ and 8A’–C’). Now the lyre pattern remains active through coherence with the inhibitory pool, while the nonoverlapping part of the postbox pattern is not coherent and therefore its activity is suppressed. As with the foregoing case, case 2 also illustrates the workings of the coherence-based selection and suppression mechanisms.

#### 4.3.3 Case 3: Feature-Based Attention with Focus on the Orientation of the Lyre’s Strings

With respect to the biological observation of coherence as a neural substrate of attentional focusing, the experiments by Fries et al. (2001) and Womelsdorf et al. (2006) considered spatial attention, while the experiment of Bichot et al. (2005) studied feature-based attention. Coherence (between spikes and fields) was also found to be the central marker of attentional bias in the case of feature-based attention.

In this case, the bias is nonvanishing if unit *n* has and the orientation and scale represented by this unit belong to the third column in Figure 3A. For other units we set . This leads to with for . (For an illustration of this feature-based focus see Figures 4C and 4D.) The remaining parameters and initial values were chosen to be identical with cases 1 and 2.

As a result, we find a dynamics that selects the lyre in a way that is qualitatively similar to the dynamics shown in Figures 7A’ to 7F’ and 8A’ to 8C’, so we need not display the dynamics here. Accordingly, the snapshots are like the ones in Figure 6A’. It turns out that the same mechanisms that we demonstrated with cases 1 and 2 also succeed in realizing the appropriate selection and suppression in the case that the attentional focus is on the feature given by the chosen orientation.

#### 4.3.4 The Relevance of Frequency Splitting Through Acceleration

The nonvanishing acceleration parameter in equation 3.8 causes a frequency splitting. It implies that different patterns have tendencies to synchronize to different characteristic frequencies (Burwick, 2008b). These show up as different frequencies of the selected patterns in cases 1 and 2; see pattern in Figures 8A and 8C (case 1) and pattern in Figures 8B’ and 8C’ (case 2) with, correspondingly, frequencies 58 Hz and 81 Hz. Due to these differing tendencies, here referred to as frequency splitting, there is competition for the overlap, that is, competition for coherence.

The relevance of this frequency splitting may be illustrated by repeating case 1 with identical parameters, initial values, and attentional focus, except for using vanishing acceleration, . No selection or suppression occurs. Instead, the network settles into a state of global coherence, and all activities synchronize with the inhibitory pool, thereby escaping the inhibitory effect. This is illustrated with the snapshots shown in Figure 6B. The same lack of selection and suppression is observed if cases 2 and 3 are repeated with . This illustrates the relevance of a nonvanishing acceleration parameter and the resulting frequency splitting.

## 5 Analogies with Neurophysiological Observations

### 5.1 Coherence-Based Attentional Mechanism

In section 3.3, we stated that our implementation of the attentional bias in section 3.4 was inspired by neurophysiological observations, most notably by the increase of spike field coherence at the site of the recorded neurons if attention is focused on the stimulus in their receptive field (Fries et al., 2001; Bichot et al., 2005; Womelsdorf et al., 2006; see also the review in Womelsdorf & Fries, 2011). Our oscillatory network model does not have an explicit representation of spikes. It rather has the character of a complex-valued firing rate model (see Schaffer, Ostojic, & Abbott, 2013, and our related remarks in section 7). However, note that the neurophysiological data are also expressed in terms of collective quantities: averaged single-unit activity (SUA), multiunit activity (MUA), and local field potential (LFP). In fact, in the following, we argue that some aspects of the neurophysiological observations have their analogs in the discussed model.

Our model describes the dynamics through interactions between excitatory and inhibitory units. For the comparison with biological observations, the recent article by Vinck et al. (2013) is therefore of particular interest. The studies in Fries et al. (2001), Bichot et al. (2005), and Womelsdorf et al. (2006) and several related works used SUA, MUA, and LFP to describe the observed coherences. Going beyond these earlier studies, Vinck et al. (2013) separated the SUAs into contributions from broad-spiking (BS) and narrow-spiking (NS) cells, which correspond to excitatory and inhibitory neurons, respectively.

The analogy with the workings of the selection mechanism described above may then be established as follows. First, it is important to note that the SUA and MUA coherences in Vinck et al. (2013, Figure 6) refer to coherences with respect to the LFP. The maximal contribution to the LFP originates from pyramidal neurons; see the argument in Mazzoni, Panzeri, Logothetis, and Brunel (2008). Therefore, models built on detailed couplings of spiking neurons, like the ones in Mazzoni et al. (2008) and Lee, Whittington, and Kopell (2013), simulate the LFP by summing up synaptic currents to pyramidal neurons. Given that the pyramidal neurons constitute the excitatory units of the gamma-rhythm PING system (see, e.g., Börgers & Kopell, 2005), it is straightforward to find the analog of the LFP in the input to the excitatory units, that is, the input from the inhibitory pool and the coupled excitatory units as expressed in equations 3.11. Correspondingly, the excitatory units of our model may be seen as the analog of (groups of) the pyramidal neurons.
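As a numerical illustration of this analogy, one can form an LFP-like reference signal dominated by a common gamma rhythm and compute its Welch-averaged magnitude-squared coherence with a locked and a nonlocked unit signal. The signals, frequencies, and noise levels below are synthetic assumptions for the sketch, not outputs of the model.

```python
import numpy as np

def msc(x, y, fs, nperseg=512):
    """Welch-averaged magnitude-squared coherence (numpy-only sketch)."""
    win = np.hanning(nperseg)
    step = nperseg // 2
    sxx = syy = 0.0
    sxy = 0.0 + 0.0j
    for start in range(0, len(x) - nperseg + 1, step):
        fx = np.fft.rfft(win * x[start:start + nperseg])
        fy = np.fft.rfft(win * y[start:start + nperseg])
        sxx = sxx + np.abs(fx) ** 2
        syy = syy + np.abs(fy) ** 2
        sxy = sxy + fx * np.conj(fy)
    freqs = np.fft.rfftfreq(nperseg, 1.0 / fs)
    return freqs, np.abs(sxy) ** 2 / (sxx * syy)

rng = np.random.default_rng(1)
fs = 1000.0
t = np.arange(0, 8, 1 / fs)                        # 8 s at 1 kHz
gamma = np.cos(2 * np.pi * 40 * t)                 # 40 Hz pool rhythm
lfp = gamma + 0.5 * rng.standard_normal(t.size)    # LFP-like proxy signal
locked = np.cos(2 * np.pi * 40 * t - 0.6) + 0.5 * rng.standard_normal(t.size)
drift = 0.1 * np.cumsum(rng.standard_normal(t.size))   # random-walk phase drift
unlocked = np.cos(2 * np.pi * 40 * t + drift) + 0.5 * rng.standard_normal(t.size)

f, c_locked = msc(locked, lfp, fs)
f, c_unlocked = msc(unlocked, lfp, fs)
i40 = np.argmin(np.abs(f - 40.0))   # coherence at the gamma bin
```

At the gamma bin, the phase-locked signal shows high coherence with the LFP-like proxy while the phase-drifting signal does not, the analog of the attention-dependent spike field coherence discussed in the text.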

Given these relations to the neurophysiological context, the inclusion of the attentional bias described in section 3.4 may be understood as a simple way of establishing a coherence between the excitatory units of the attentional focus and the inhibitory pool. Together with the selectively synchronizing couplings among the excitatory units, this implies a coherence between the excitatory units of the attended pattern and our analog of the LFP, mimicking the neurophysiologically observed spike field coherences when attention is on the stimuli in the receptive field of the recorded neurons. This coherence is then reached by driving the inhibitory pool (see section 3.4.3) with the rhythm that dominates among the excitatory units in the attentional focus (section 3.4.2).

With the examples of section 4, we demonstrated that the coherence-based mechanism of section 3 indeed leads to a selection of the dominating pattern in the attentional focus and a suppression of the distracters. We also demonstrated that the inclusion of acceleration was essential for the working of this mechanism (see Figure 6B for the snapshots that illustrate the failure of the mechanism if the acceleration terms are set to vanish, resulting in the loss of frequency splitting). In section 5.2, we argue that recent observations may indicate the functional relevance of a related frequency spread also in the neurophysiological context.

### 5.2 On the Relevance of the Rate-Dependency of Gamma Locking Observed by Vinck et al. (2013)

With respect to the cortical dynamics, as a result of analyzing “the single units separately” it was “found that whereas some of them showed very strong oscillatory synchronization, many showed little or no synchronization” (Fries et al., 2001, supplementary material). More recently, it was observed that the different synchronization behaviors correlate with different levels of activation. Vinck et al. (2013) studied whether “high-rate SUAs might gamma lock disproportionally more with attention and/or strongly gamma locking SUAs might fire disproportionally more with attention” and found evidence for both scenarios.

It is intriguing to find that this property of cortical dynamics has some analog in the context of our model. It is the acceleration effect that implies different synchronization behaviors of the different units (see, e.g., our cases 1 and 2 and the dynamics displayed in Figure 8). This effect is essential for the attentional workings, as demonstrated with our examples. This property may be seen as an analog of the diverging synchronization behavior of the biological neurons, and one could speculate that some analog of the acceleration mechanism is at work also in the neurophysiological context.

As stated above, the cortical neurons with a higher level of activation (higher firing rates) are the neurons with a stronger tendency to lock to the gamma rhythm (note that the firing rate is not identical to the gamma frequency; see Nikolić, 2009). This condition is akin to the property of the model units that reach coherent locking to the inhibitory pool: the stronger synchronization tendency comes with the stronger accelerating drive (see equation 3.8 and the discussion in Burwick, 2008b). Correspondingly, with respect to our example case 1 (2), we find that the units that lock to the inhibitory pool in panels A and C (B’ and C’) of Figure 8 have a higher phase velocity than the units that do not lock in Figure 8B (8A’). The lower level of activation of the nonlocking biological neurons has its analog not only in the lower phase velocity but also in the lower amplitude of the model units, as the nonlocking units are suppressed, as shown in Figure 7E (7D’). The locking units, in contrast, are coherent with the inhibitory minima, therefore escape the inhibition, and remain active, as shown in panels D and F (E’ and F’) of Figure 7.

The following scenario, with dynamics modulated by the choice of attentional focus (e.g., as described in section 3.4), would therefore be compatible with the theoretical and experimental findings. Buffalo, Fries, Landman, Buschman, and Desimone (2011) found that in visual areas V1, V2, and V4, gamma rhythms are largely confined to the superficial layers. The excitatory units of our model may therefore correspond to groups of pyramidal neurons in layers 2 and 3 of minicolumns that encode different aspects (features) of visual inputs. (Of course, the cortical representation of the stimulus is more hierarchically distributed than the monolithic patterns of our examples in section 4.) These groups of neurons respond with different levels of activity (firing rates) to the incoming stimulus. Correspondingly, a competition for coherence between these groups of neurons may arise and could result in a subset of minicolumns taking a coherent state (Burwick, 2007, 2008b). Such a state would then correspond to the “disproportionally more” gamma locking of the high-rate firing neurons (Vinck et al., 2013). In particular, the winning pyramidal neurons would lock to the gamma rhythm (in analogy to Figure 8; see the discussion in the previous paragraph), and recurrent inhibition acting on the pyramidal neurons would suppress the nonlocking neurons’ activity while allowing the locked neurons to fire at minima of inhibition. The suppression of activity may correspond to suppression of distracting influences, while the activity of the locked neurons may encode attended targets.

Additional remarks on the relation to the biological context with an outlook on possible next steps to reach a more founded relation between model and neurophysiology are given in section 7.

## 6 Comparison with Other Models

In contrast to several phase-model-based neural networks (see, e.g., the review in Hoppensteadt & Izhikevich, 1997), our model is not based on assuming weakly connected excitatory units; instead, it uses strong couplings, given by the Hebbian couplings described with equations 3.5 and 3.6 and the coupling strengths in equations 3.4a and 3.4b. The strong couplings are an essential ingredient for establishing attracting states that correspond to stored Hebbian patterns (see Hertz, Krogh, & Palmer, 1991, for an introduction to the corresponding Cohen-Grossberg-Hopfield model; our system may be understood as a complex-valued generalization, as described in Burwick, 2007, 2008b, 2011). Because of these strong couplings, and also due to the incorporation of excitatory and inhibitory units, the model appears to have kinship with the strong-coupling regime of inhibitory stabilized models (Tsodyks et al., 1997; see also the recent discussion in Jadi & Sejnowski, 2014). There are, however, essential differences with respect to the considered parameter regimes. In this section, we comment on relations of our model to this and some other models.

There is an immediately obvious distinction between the discussion of the inhibitory stabilized model in Tsodyks et al. (1997) and our model. While the former assumes all-to-all couplings of equal strength, thereby allowing the treatment of the set of excitatory units as one population, the latter uses couplings of different strengths, specified through the Hebbian couplings of *P* patterns with equation 3.5. The latter therefore models the interaction of different excitatory populations, described with the collective quantities that are defined in appendix A.

One may overcome this distinction by setting *P* = 1 so that both models deal with only one excitatory population. There is, however, another aspect that distinguishes the two models. Using a Wilson-Cowan model for coupling an excitatory with an inhibitory population, Tsodyks et al. (1997) compared cases of weak and strong couplings among the excitatory units and found a “paradoxical” behavior in the case of strong couplings. Increasing the excitatory input to the inhibitory population led to the intuitively expected decrease in activity of the excitatory population for both weak and strong couplings. The counterintuitive (paradoxical) behavior occurred in the case of strong couplings because the activity of the inhibitory population also decreased in spite of its higher external input. In the following, from a heuristic point of view and without going into the details, we describe why the situation discussed in Tsodyks et al. (1997) differs from the one dealt with in our case.
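The paradoxical effect can be reproduced with a minimal threshold-linear Wilson-Cowan pair. The weights and inputs below are generic illustrative choices (assumptions), not parameters from Tsodyks et al. (1997) or from our model.

```python
def steady_state(w_ee, h_i, w_ei=1.0, w_ie=2.0, w_ii=0.5, h_e=1.0,
                 dt=0.01, steps=20000):
    """Threshold-linear Wilson-Cowan E-I pair, integrated to steady state
    with an explicit Euler scheme."""
    E = I = 0.1
    for _ in range(steps):
        dE = -E + max(w_ee * E - w_ei * I + h_e, 0.0)
        dI = -I + max(w_ie * E - w_ii * I + h_i, 0.0)
        E += dt * dE
        I += dt * dI
    return E, I

# Strongly coupled (inhibition-stabilized) regime: w_ee > 1
E1, I1 = steady_state(w_ee=2.0, h_i=0.5)
E2, I2 = steady_state(w_ee=2.0, h_i=1.0)   # more drive to the inhibitory population
# Weakly coupled regime: w_ee < 1
e1, i1 = steady_state(w_ee=0.5, h_i=0.5)
e2, i2 = steady_state(w_ee=0.5, h_i=1.0)
```

In the strongly coupled regime (`w_ee > 1`), raising the drive `h_i` to the inhibitory population lowers the steady-state activity of both populations, which is the paradoxical behavior; in the weakly coupled regime, inhibitory activity rises as intuitively expected.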

Consider equation 3.2a with vanishing couplings and the corresponding attractor. Its stability will not change for weak couplings. The analog situation for the network discussed in Tsodyks et al. (1997) is the dynamics resulting from weak self-couplings of the excitatory population; the fixed point in our case has its analog in the steady-state solution illustrated with Tsodyks et al.’s Figure 4A. For our model, as well as for the model discussed in Tsodyks et al., increasing the excitatory couplings toward strong-coupling values leads to a loss of stability of this fixed point if the inhibitory couplings are kept small enough: “its activity dies off or explodes to saturation, depending on the initial conditions” (Tsodyks et al., 1997, p. 4385). With respect to our discussion, this is what constitutes the essence of establishing off- and on-states, respectively, as attractors that correspond to memorized patterns. The fixed point turns unstable through bifurcations into stable states that correspond to on- and off-states of the memorized patterns (and, possibly, spurious states; Hertz et al., 1991).

In the context of Tsodyks et al. (1997), however, this destabilization was prevented through an accompanying increase of inhibitory weights so that the fixed point of the weak-coupling regime remained stable also for strong couplings among excitatory units (see the discussion with respect to Tsodyks et al.’s Figure 4B). This is the crucial property in which the two approaches differ. The paradoxical behavior described in Tsodyks et al. also required strong inhibitory couplings so that the fixed point of the weak-coupling regime remains stable for strong excitatory couplings. In contrast, the mentioned fixed point in our case has to be destabilized in order to generate the attractors that correspond to patterns. This is realized through the discussed selective inhibition, which implies that the winning pattern is not subject to inhibition due to its coherence with the (minima of the) inhibitory pool.

The inhibitory stabilized model (Tsodyks et al., 1997) has been related to gamma rhythm generation. Alternative models of gamma rhythm generation have been based on synchronization of the inhibitory population. Again, these models do not combine oscillatory dynamics with attractor-based pattern recognition. Nevertheless, with respect to the coupling of excitatory and inhibitory populations, these alternative models are closer in spirit to our model, as we argue next.

Although we are using a comparably simple model for the network’s units, based on phases and amplitudes (and not, for example, Hodgkin-Huxley neurons), the rationale behind the mechanism illustrated, for example, with Tiesinga and Sejnowski (2010, Figure 1), and the related discussions in Börgers and Kopell (2008), Buehlmann and Deco (2010), Akam and Kullmann (2010), Gielen, Krupa, and Zeitler (2010), Eriksson, Vicente, and Schmidt (2011), and Wildie and Shanahan (2011), is analogous to the mechanism of the model presented here. These earlier works discuss the effect of two sites of sending neurons and one site of receiving neurons, with a view on the cases where only one of the sending sites synchronizes (phase-locks) with the receiving neurons. This corresponds to attention being focused on the phase-locked stimulus. Essentially, this effect arises through the winning neurons exciting the receiving excitatory and inhibitory neurons so that the effect of the nonwinning input is blocked and suppressed by the already active inhibition at the receiving site. This is what constitutes “communication through coherence” (CTC; Fries, 2005), also described as “selective attention through selective synchronization” (Womelsdorf & Fries, 2011).
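The CTC gating idea can be sketched in a few lines (all signals here are idealized stand-ins, not model outputs): the receiving site's gain oscillates with its recurrent inhibition, so an input stream arriving at inhibition minima is transmitted while an anti-phase stream is blocked.

```python
import numpy as np

t = np.arange(0, 1, 1 / 1000)                     # 1 s at 1 kHz
# Receiver gain: maximal at inhibition minima of a 40 Hz rhythm.
gate = 0.5 * (1 + np.cos(2 * np.pi * 40 * t))
# Rectified input streams: one coherent with the gate, one in anti-phase.
good = np.maximum(np.cos(2 * np.pi * 40 * t), 0)
bad = np.maximum(np.cos(2 * np.pi * 40 * t + np.pi), 0)
out_good = float(np.mean(gate * good))            # transmitted drive
out_bad = float(np.mean(gate * bad))              # blocked drive
```

The coherent stream delivers several times the drive of the anti-phase stream, even though both have identical power, which is the essence of selective attention through selective synchronization.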

Here, the working of the pattern-selecting mechanism is analogous in the sense that nonselected pattern activity, that is, activity that does not correspond to the dominant pattern in the attentional focus, is blocked through the suppressing effect of the inhibitory pool that is driven by the activity of the selected pattern. As with the CTC mechanism, it is an incompatibility of excitatory and inhibitory phases (here, resulting from incompatible rhythms) that implies the suppression of nonselected activity. This is what we demonstrated with the examples in section 4. Going beyond the foregoing studies, the selection is not between two incoming stimuli but among stored patterns, and it is the result of a self-organized process.

## 7 Summary and Outlook

We presented an oscillatory neural network model that couples excitatory and inhibitory oscillators. The model is an extension of the models described in Burwick (2007, 2008b, 2011). Assuming Hebbian storage of patterns through the couplings between the excitatory units, we implemented an attentional selection mechanism that allows extracting a pattern that is specified through dominance in the attentional focus. This dominance leads to the pattern’s winning a competition for coherence among the stored patterns. The winning pattern then imprints its rhythm on the inhibitory pool and, through the recurrent inhibition of the excitatory units, the distracters’ activity is suppressed. The winning pattern survives the recurrent inhibition through participating in the inhibitory rhythm so that it may be active at inhibitory minima. In contrast, the distracters do not share this rhythm and therefore are suppressed through being exposed to the nonminimal inhibition. We demonstrated the workings of the proposed mechanism through two examples of spatial attention and one case of feature-based attention.

The proposed mechanism was motivated by neurophysiological observations that found spike field coherences as neural substrates of attention (Fries et al., 2001; Bichot et al., 2005; Womelsdorf et al., 2006; see also the review in Womelsdorf & Fries, 2011). As a central statement, using some analogy between the neurophysiological situation and the model dynamics, we pointed to the possibility that the observed correlation between high firing rates and strong gamma locking with attention (Vinck et al., 2013) indicates an essential aspect of the brain’s attentional selection and suppression mechanisms.

We are well aware that work is still necessary before the dynamics of the described model can be linked to the workings of biological neurons. In that respect, an approach like the one discussed in Schaffer et al. (2013) may be of interest. There, complex-valued units are related to firing rate models. Future work may clarify whether an extension of the work of Schaffer et al. toward describing networks of excitatory and inhibitory units may lead to models that incorporate the network properties that have been essential for the discussion presented here. This would then give a firmer basis for the relation to biological observations.

Evidently, another challenge for a more complete attentional model and its relation to neurobiology is to embed the described model in a system that determines the choice of attentional bias parameters and their effect in a dynamical manner. In that respect, our current conjecture has two aspects. First, the attentional modulation of pattern weights (see section 3.4.2) may be the consequence of the top-down stream of a hierarchical recognition process (see Buffalo et al., 2010, for observed backward processing of attentional effects in the biological context). Second, again with respect to biology, the modulation of coherence between excitatory units and the inhibitory pool (see section 3.4.3) may result from modulations through thalamic structures, for example, the pulvinar, a thalamic nucleus discussed for its possible role in modulating the attentional focus (Saalmann, Pinsk, Wang, Li, & Kastner, 2012).

A future extension of the model may also include frequency differences that are induced by properties of the stimulus. Here, we restricted our discussion to the activating versus suppressing role of the external input. In the neurophysiological context, however, stimulus properties such as size (Gieselmann & Thiele, 2008) and contrast (Ray & Maunsell, 2010; Roberts et al., 2013) were seen to have a substantial effect on the induced frequencies of the gamma oscillations. In the context of our discussion, the relevant frequency differences are the result of self-organized processes among the units of the network. So far, we have not included external stimulus-induced frequency differences because their role in attention still needs to be clarified. Nevertheless, future discussions may also consider an extended version of our model that allows studying this aspect.

We also emphasize that this discussion points to a natural extension toward including cross-frequency couplings of the gamma rhythm with a slower rhythm. In that respect, it is important to state that the initialization used for our simulations in section 4 involved a delay in the onset of the inhibitory effects. Therefore, the competition for coherence among the stored patterns was initially undisturbed by inhibition. When inhibition sets in, it may then use the results of this competition to enforce a selective suppression. With the simulations in section 4, we applied a simple implementation of this delayed onset of inhibition by adjusting the inhibitory couplings. As an alternative (not described here), we tested an implementation through an initial suppression of the activity of the inhibitory units based on external inputs. We found essentially the same outcome as with the simulations in section 4. At the end of section 3.1, we mentioned that the latter implementation may correspond to the implementation of theta rhythms in the hippocampus. With respect to the visual cortex, it has been speculated that a corresponding cross-frequency coupling with a theta rhythm might also be important for the attentional selection and suppression mechanism associated with the gamma oscillations (see, e.g., the discussion section in Bosman et al., 2012). It may therefore be of interest to see the proposed model as part of a more complete model that includes a kind of theta rhythm, where the processes demonstrated in section 4 constitute a particular cycle of the slower rhythm. The beginning of the next cycle may then include an update of the input to the excitatory units and a renewed initial suppression of the inhibitory effects.

A final remark should concern the application-oriented perspective of the proposed mechanisms. A recent review by Poggio and Ullman (2013) discussed whether computer vision models of visual object recognition are “catching up with the brain.” The authors describe the integration of segmentation and recognition as a major future challenge for computer vision. They point out that computer vision models traditionally treated segmentation and recognition as two separate, sequential processes. Only more recently have these tasks been treated together. In that respect, we emphasize that our attention-based selection process (corresponding also to the solution of a segmentation task) is intimately interwoven with the recognition process, an aspect that may also be of interest for steps into more advanced domains of applications.

### Appendix A: Collective Activities, Coherences, and Phases

Consider *P* patterns stored over *N* units, used to define couplings that realize Hebbian memory. The dynamics resulting from equations 3.2 and 3.4 to 3.6 may be formulated in terms of global and pattern-related collective quantities. For each pattern *p*, its activity *A*_{p}, coherence *C*_{p}, and phase are defined as in Burwick (2008b), where *N*_{p} is the number of units *n* that belong to pattern *p*. Analogously, we may define the global quantities for the inhibitory units. The amplitudes are related to the activities and coherences through these definitions; in particular, the activities and coherences take values between 0 and 1, and the analog ranges apply to the global inhibitory quantities.

The examples of section 4 involve two stored patterns *p* and *q*, here the postbox and the lyre. The quantities that we use to describe the collective dynamics of the excitatory units of the patches are then the activity *A*_{pq}, coherence *C*_{pq}, and phase of the overlapping part of the two patterns, together with the activity, coherence, and phase of the nonoverlapping part of pattern *p* (the bar on top of *q* indicates that this patch excludes units from pattern *q*). Exchanging *p* and *q* leads to the analog definitions for the nonoverlapping part of pattern *q*. The normalizations are given in terms of the overlap matrix in equation 4.1.
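Since the defining equations are not reproduced here, the following sketch assumes the standard Kuramoto-style form of such collective quantities: a pattern's activity as the mean amplitude of its units, its coherence as the modulus of the mean unit phasor, and its phase as the argument of the amplitude-weighted order parameter. This is a plausible reading of the definitions in Burwick (2008b), not a verbatim transcription.

```python
import numpy as np

def pattern_quantities(r, theta, xi):
    """Activity, coherence, and collective phase of a 0/1 pattern xi
    (assumed Kuramoto-style definitions over the units with xi = 1)."""
    idx = np.flatnonzero(xi)
    activity = np.mean(r[idx])                            # mean amplitude
    coherence = np.abs(np.mean(np.exp(1j * theta[idx])))  # value in [0, 1]
    z = np.mean(r[idx] * np.exp(1j * theta[idx]))         # complex order parameter
    phase = np.angle(z)
    return activity, coherence, phase

rng = np.random.default_rng(0)
r = np.ones(1000)                  # unit amplitudes
xi = np.ones(1000, dtype=int)      # one pattern containing all units
a_sync, c_sync, p_sync = pattern_quantities(r, np.full(1000, 0.7), xi)
a_rand, c_rand, p_rand = pattern_quantities(r, rng.uniform(0, 2 * np.pi, 1000), xi)
```

A fully synchronized pattern yields coherence 1 with the common phase as its collective phase, while uniformly random phases yield a coherence near 0, consistent with the stated ranges.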

### Appendix B: Parameter Choices and Initial Values

The classical, that is, phase-independent coupling strength is set to . For the phase-dependent couplings, synchronization and acceleration strength are set to and , respectively, where . In order to demonstrate the isolated effects of acceleration, we set and . Excitatory units imprint their rhythm with coupling strength onto the inhibitory units. The strength of inhibitory couplings to the excitatory units is chosen to be . The strength of the couplings between inhibitory units is set to . The input parameter is set to .

The pattern weights are set to , where (see equation 4.1 for the values of *N _{p}*). This choice implies that in equation 3.8. In consequence, the competitive advantage of each pattern is proportional to , that is, the percentage of its units that are active and not, for example, to the total number of active units . Pattern weights are modulated, due to the attentional bias of patterns inside the attentional focus, with .

Initial amplitudes for excitatory and for inhibitory units, respectively, are randomly chosen such that . Thus, initial activities are . Initial phases are uniformly distributed so that the network starts in a completely decoherent state.

Equations 3.2a to 3.2d were simulated with an explicit Euler scheme for discrete time steps with time constant (see equation 3.3) and regularization term . The regularization term affects the effective timescales expressed in equation 3.3 through and (Burwick, 2007, section 4.1). The global timescale may be arbitrarily chosen. For example, assuming that all times are given in ms implies that the oscillations in section 4 are in the gamma range.
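Since the parameter values above did not survive extraction, we can only illustrate the integration scheme itself: an explicit Euler step for a generic amplitude-phase (Stuart-Landau) unit, used here as a stand-in for the right-hand sides of equations 3.2. The growth rate, frequency, and step size are arbitrary assumptions.

```python
import numpy as np

def euler_step(z, dt=1e-3, mu=1.0, omega=2 * np.pi):
    """One explicit Euler step of dz/dt = (mu + i*omega - |z|^2) z,
    a generic amplitude-phase oscillator (not the paper's equations)."""
    return z + dt * (mu + 1j * omega - np.abs(z) ** 2) * z

z = 0.1 + 0j
for _ in range(20000):          # integrate for 20 time units
    z = euler_step(z)
# The amplitude settles near sqrt(mu) = 1, up to O(dt) discretization error.
```

The small bias in the equilibrium amplitude is the usual price of the explicit Euler scheme; reducing `dt` reduces it, which is why a regularization of the effective timescales matters in practice.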

## Acknowledgments

It is a pleasure to thank Pascal Fries for valuable discussions of the neurobiological background.