## Abstract

Selective attention is often accompanied by gamma oscillations in local field potentials and spike field coherence in brain areas related to visual, motor, and cognitive information processing. Gamma oscillations are thought to play an important role in, for example, visual tasks including object search, shape perception, and speed detection. However, the mechanism by which gamma oscillations enhance cognitive and behavioral performance of attentive subjects is still elusive. Using feedforward fan-in networks composed of spiking neurons, we examine a possible role for gamma oscillations in selective attention and population rate coding of external stimuli. We implement the concept proposed by Fries (2005) that under dynamic stimuli, neural populations effectively communicate with each other only when there is a good phase relationship among associated gamma oscillations. We show that the downstream neural population selects a specific dynamic stimulus received by an upstream population and represents it by population rate coding. The encoded stimulus is the one for which gamma rhythm in the corresponding upstream population is resonant with the downstream gamma rhythm. The proposed role for gamma oscillations in stimulus selection is to enable top-down control, a neural version of time division multiple access used in communication engineering.

## 1. Introduction

The results of previous experimental studies suggest that the gamma band power of local field potentials and spike field coherence with gamma oscillations increase in area V4 and other areas upon attention. Therefore, the functions of gamma oscillations and synchronous firing in selective attention and working memory have been extensively examined (Gray, König, Engel, & Singer, 1989; Tallon-Baudry & Bertrand, 1999; Fries, Reynolds, Rorie, & Desimone, 2001; Engel, Fries, & Singer, 2001; Niebur, 2002; Bichot, Rossi, & Desimone, 2005; Jensen, Kaiser, & Lachaux, 2007; Womelsdorf & Fries, 2007; Lakatos, Karmos, Mehta, Ulbert, & Schroeder, 2008). Increased gamma oscillations in V4 are known to enhance shape perception of visual stimuli (Taylor, Mandon, Freiwald, & Kreiter, 2005) and rapidity of detecting changes in a stimulus (Womelsdorf, Fries, Mitra, & Desimone, 2006). Synchronous firing, both oscillatory (Lakatos et al., 2008) and nonoscillatory (Steinmetz et al., 2000), is also observed when subjects perform multimodal tasks that require attention.

Mechanisms for generating these gamma rhythms have been thoroughly explored. It was found that delayed recurrent connections among interneurons or excitatory-inhibitory connections are the main mechanism for the generation of gamma oscillations (Börgers, Epstein, & Kopell, 2005). How attention-driven gamma oscillations enhance information processing such as stimulus selection, gain control, and signal discriminability has not been sufficiently addressed, however. An earlier seminal work showed that gamma modulation of neurons subjected to an attended stimulus enables the selection of this stimulus among competing stimuli (Niebur, Koch, & Rosin, 1993). This result and those from more recent network models comprising spiking neurons aim to explain the existence of links between gamma rhythms and neural computation (Börgers et al., 2005; Börgers & Kopell, 2008; Buia & Tiesinga, 2006, 2008; Mishra, Fellous, & Sejnowski, 2006; Masuda & Doiron, 2007). Under bottom-up gamma oscillations, a strong (Börgers et al., 2005) or coherent (Börgers & Kopell, 2008) external stimulus is readily selected by neural populations receiving multiple stimuli. The attention signal that generates gamma oscillations can be also applied in a top-down manner. When the top-down signal is applied as a bias input to interneurons (Buia & Tiesinga, 2006, 2008), gamma oscillations (Niebur et al., 1993), synchronous spike volleys (Niebur & Koch, 1994), or the balance between excitation and inhibition along a feedforward pathway (Mishra et al., 2006) biases competition so that attended input is selected.

In most of the previous theoretical studies, gamma oscillations were assumed to be generated in one neural population or applied externally, and downstream neural populations were entrained into a single gamma rhythm. However, the mechanism of interaction of different gamma rhythms has not been sufficiently investigated (important exceptions are the studies conducted by Mishra et al., 2006, and Börgers & Kopell, 2008; we will compare their results to the results in section 4). In fact, there may exist multiple interacting neural populations that show gamma rhythms with different phases. It was proposed (Fries, 2005) and experimentally validated (Womelsdorf et al., 2007) that two cell assemblies subjected to gamma oscillations effectively communicate with each other only when there is a good phase relationship between the two gamma oscillations. To explain the concept of good phase relationship, we consider two populations that are unidirectionally coupled by excitatory synapses. Suppose that the relative timing of the two gamma oscillations is adjusted such that the peak time of the gamma rhythm applied to the presynaptic population shifted by a synaptic delay is close to the peak time of the gamma rhythm applied to the postsynaptic population. With this good phase relationship, spikes tend to excite the neurons that are relatively easy to fire. Otherwise excitatory synapses are not very effective in inducing postsynaptic firing.

In this study, we explore the possibility of a computational role for gamma oscillations in enabling the type of communication between neural populations as explained above. The emphasis of this study is on population rate coding of dynamic stimuli; examples of such dynamic stimuli include integration of many synaptic inputs and natural stimuli, such as visual scenes. The objective of this study is to investigate top-down gamma rhythms, which are considered to be important in selective attention (Tallon-Baudry & Bertrand, 1999; Buia & Tiesinga, 2006, 2008; Mishra et al., 2006), and not bottom-up gamma rhythms, which may be relevant in feature binding (Gray et al., 1989; Engel, Fries, & Singer, 2001). We consider multiple cell assemblies, each of which receives different external input and projects onto a single downstream neural population in a feedforward manner. We show that top-down gamma modulation enables the selection of one of the competing stimuli. The downstream population selects that stimulus whose associated gamma modulation is in a good phase relationship with the gamma modulation applied to the downstream neural population. As a special case, our neural network performs a type of gain modulation for static inputs.

## 2. Methods

### 2.1. Architecture of Neural Network.

In this study, we use a two-layer neural network. For expository purposes, consider the case of three neural populations, as shown in Figure 1. Each population consists of *N* = 100 neurons. Neurons in a population are uncoupled, except while carrying out supplementary numerical simulations shown in appendix B. One neural population, which we term population *d*, is downstream to the other two populations, which we term population *i* (*i* = 1, 2).

### 2.2. External and Noise Inputs.

Neurons in upstream population *i* are subjected to a dynamic stimulus *s _{i}*(*t*) that obeys the Ornstein-Uhlenbeck process represented by equation 2.1, where ξ_{i}(*t*) is a white gaussian process. In fact, we discretize equation 2.1 with the time step of 0.625 ms and realize ξ_{i}(*t*) as a normal random variable with standard deviation 0.025 applied every time step. The stimulus size is small compared to the amplitude of gamma modulation, which will be introduced later. The external stimuli have a decaying time constant of τ_{OU} = 40 ms. We assume that *s*_{1}(*t*) and *s*_{2}(*t*) are independent of each other, that is, ξ_{1} and ξ_{2} are independent of each other.
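The discretization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the update rule is a plain Euler-type step, and the way the noise term is scaled (added directly with standard deviation 0.025 per step) is a guess, since the exact form of equation 2.1 is not reproduced here. The function name `simulate_ou_stimulus` and its arguments are hypothetical.

```python
import numpy as np

def simulate_ou_stimulus(n_steps, dt_ms=0.625, tau_ou_ms=40.0, noise_sd=0.025, seed=0):
    """Discretized Ornstein-Uhlenbeck stimulus (assumed Euler-type update)."""
    rng = np.random.default_rng(seed)
    s = np.zeros(n_steps)
    for k in range(1, n_steps):
        # Relax toward zero with time constant tau_ou, then add one gaussian
        # sample per time step. The noise scaling is an assumption.
        decay = (dt_ms / tau_ou_ms) * s[k - 1]
        s[k] = s[k - 1] - decay + noise_sd * rng.standard_normal()
    return s

# 4000 steps of 0.625 ms cover the 2.5 s measurement window.
# Different seeds give independent noise, hence independent stimuli.
s1 = simulate_ou_stimulus(4000, seed=1)
s2 = simulate_ou_stimulus(4000, seed=2)
```

Because the noise scaling is assumed, the absolute amplitude of `s1` and `s2` should not be read as the stimulus size used in the simulations; only the time constant and the independence of the two traces follow the text.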

Every neuron in each population, including population *d*, receives an oscillatory input of gamma band, which models the local field potential, or a bundle of synaptic inputs whose strength varies over time. We regard gamma modulation as a top-down modulatory signal and do not determine its source. The oscillation frequency is set to *f*_{γ} = 40 Hz. The amplitude of gamma modulation applied to population *i* (*i* = 1, 2, *d*) is denoted by *A _{i}*. We set *A*_{1} = *A*_{2} = 0.1 and vary *A _{d}*. We assume that gamma oscillatory inputs applied to different populations are not necessarily in-phase synchronized (Gray et al., 1989; Fries, 2005; Womelsdorf et al., 2007). Therefore, the spikes obtained from population 1 and those obtained from population 2, which presumably represent *s*_{1}(*t*) and *s*_{2}(*t*), respectively, are generally correlated because of a fixed phase difference between the two gamma oscillations. Note that *s*_{1}(*t*) and *s*_{2}(*t*) are nevertheless independent of each other.

In addition to the dynamic stimulus and the gamma modulation, neurons are subjected to dynamical noise. Because noises applied to different neurons are partially correlated, we model noise in terms of a weighted sum of two factors. One noise source is the white noise process applied commonly to all the neurons in a population, which we denote by ξ_{i,c}(*t*) (*i* = 1, 2, *d*). The other noise source is independent (uncommon) for different neurons, which we denote by ξ_{i,j}(*t*) for the *j*th neuron in population *i*. We set the fraction and total strength of the common noise to *c* = 0.12 (Zohary, Shadlen, & Newsome, 1994; Masuda & Doiron, 2007) and σ = 0.1, respectively. This results in random-looking correlated spike statistics in the absence of gamma modulation.
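One common way to realize partially correlated noise of this kind is to mix a shared and a private gaussian source with weights √*c* and √(1 − *c*), which yields a pairwise correlation of *c* between neurons. The sketch below assumes this weighting; the text specifies only the fraction *c* = 0.12 and the total strength σ = 0.1, so the exact mixing rule, like the function name `mixed_noise`, is an assumption.

```python
import numpy as np

def mixed_noise(n_neurons, n_steps, c=0.12, sigma=0.1, seed=0):
    """Per-neuron noise built from a common and an independent source."""
    rng = np.random.default_rng(seed)
    common = rng.standard_normal(n_steps)                # shared by all neurons
    private = rng.standard_normal((n_neurons, n_steps))  # one row per neuron
    # Assumed weighting: variance fraction c comes from the common source.
    return sigma * (np.sqrt(c) * common + np.sqrt(1.0 - c) * private)

noise = mixed_noise(n_neurons=100, n_steps=20000)
# Sample correlation between two neurons; it should lie near c.
pairwise = np.corrcoef(noise[0], noise[1])[0, 1]
```

With this choice, each neuron's noise has standard deviation σ regardless of *c*, so the correlation level and the total noise strength can be varied independently.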

### 2.3. Dynamics of Neurons and Synapses.

Each neuron is a leaky integrate-and-fire neuron characterized by, among other parameters, the reversal potential *v*_{E,i} (*i* = 1, 2, *d*) and an absolute refractory period of 2 ms. Dynamics of the subthreshold membrane potential of the *j*th neuron in the upstream population *i* (*i* = 1, 2), which is denoted by *v*_{i,j}, are represented by equation 2.2. We set capacitance density and conductance density to *C* = 1 μF/cm^{2} and *g* = 0.1 mS/cm^{2}, respectively, so that the membrane time constant τ_{m} = *C*/*g* = 10 ms. We set *v*_{E,1} = *v*_{E,2} = 0.97 so that neurons fire more or less randomly at a rate of approximately 18 Hz (the same for *v*_{E,d}, explained later). The last three terms in equation 2.2 represent the density of external currents. The phase of gamma modulation is denoted by θ_{i}. Equation 2.2 implies that the dynamical noise is regarded as the voltage noise.

Upstream populations 1 and 2 send spikes to population *d*. These spikes convey information about *s*_{1}(*t*) or *s*_{2}(*t*) by population rate coding, and *s*_{1}(*t*) and *s*_{2}(*t*) compete in population *d* for being represented. As in populations 1 and 2, neurons in population *d* are assumed to be leaky integrate-and-fire neurons that are subjected to two types of noise and gamma modulation, and not to external input. Dynamics of the membrane potential of the *j*th neuron in population *d*, denoted by *v*_{d,j}, are represented by equation 2.3. We set *v*_{E,d} = 0.95.

The synaptic input applied to the *j*th neuron in population *d* is defined by equation 2.4 as a weighted sum of delta functions δ, each placed at the *k*th spike time of the *j*′th neuron in population *i*. The weight of each term is the weight of the synapse from neuron *j*′ in population *i* (*i* = 1, 2) to neuron *j* in population *d*. All synapses are assumed to be excitatory. An excitatory synapse (a positive weight) exists independently with probability 0.5 for different *i*, *j*′, and *j*. Otherwise, the weight is zero. Consequently, on average, each neuron *j*′ in an upstream population (*i* = 1 or 2) is presynaptic to *N*/2 neurons in the downstream population *d*. In addition, on average, each neuron *j* in population *d* is postsynaptic to *N*/2 neurons in population 1 and *N*/2 neurons in population 2. Except for the feedforward structure of the neural network, the synaptic connectivity is spatially unstructured. Given a synapse, its weight is selected independently from the uniform density on [0, 0.4/*N*], unless otherwise stated. Spikes are propagated instantaneously from populations 1 and 2 to population *d*, with an infinitesimally rapid synaptic time course represented by the delta function used in equation 2.4. Since the network is purely feedforward, this instantaneous propagation of spikes is justified. A homogeneous synaptic delay can be modeled by temporally translating the response of population *d* by the length of synaptic delay.
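The connectivity rule described above (each synapse present with probability 0.5, weights uniform on [0, 0.4/*N*]) can be sketched as follows. The array layout and the name `feedforward_weights` are illustrative choices, not part of the original model description.

```python
import numpy as np

def feedforward_weights(n=100, n_upstream=2, p_connect=0.5, w_scale=0.4, seed=0):
    """Random excitatory weights from upstream populations to population d.

    Returned array has shape (upstream population i, presynaptic j', postsynaptic j).
    """
    rng = np.random.default_rng(seed)
    shape = (n_upstream, n, n)
    exists = rng.random(shape) < p_connect                # synapse present or not
    weights = rng.uniform(0.0, w_scale / n, size=shape)   # uniform on [0, 0.4/N]
    return np.where(exists, weights, 0.0)                 # absent synapse: weight 0

w = feedforward_weights()
# Mean number of downstream targets per upstream neuron; close to N/2.
mean_out_degree = (w > 0).sum(axis=2).mean()
```

Drawing existence and weight independently keeps the connectivity spatially unstructured, which is the only structural constraint stated besides the feedforward direction.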

### 2.4. Measurement of Neural Activity.

We examine the possibility of population rate coding in population *d* on a timescale of gamma rhythm. To quantify the effectiveness of population rate coding, we use the cross-correlation between the external input and dynamic population spike count. The bin width for defining the population firing rate is defined as *T _{bin}* ≡ 1/*f*_{γ} = 25 ms, and the bin is aligned such that the center of the bin corresponds to the peak of gamma modulation. The underlying logic of aligning the center of the bin to such a position is that when neurons are gamma modulated, they tend to fire around peaks of gamma oscillations. The binned population firing rate of population *i*′ (*i*′ = 1, 2, *d*) is denoted by ν_{i′}(*t*).

The correlation coefficient between ν_{i′}(*t*) and *s _{i}*(*t*) (*i* = 1, 2) is denoted by 〈*s _{i}*(*t*), ν_{i′}(*t*)〉 and is defined by equation 2.5, where *n _{bin}* is the total number of bins. We set *n _{bin}* = 100 so that the total measurement time is equal to *T _{bin}n_{bin}* = 2.5 s. Equation 2.5 uses the temporal averages of *s _{i}*(*t*) and ν_{i′}(*t*) over the measurement time. Note that only *s _{i}*(*t*) averaged over each bin is required for calculating 〈*s _{i}*(*t*), ν_{i′}(*t*)〉, because ν_{i′}(*t*) is constant within a bin.
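The measurement procedure can be illustrated with a short Python sketch. It assumes that equation 2.5 is the ordinary Pearson correlation coefficient between the bin-averaged stimulus and the binned population spike count; this matches the description in terms of temporal averages, but the equation itself is not reproduced here. All function names are hypothetical.

```python
import numpy as np

def binned_population_count(spike_times_ms, n_bins=100, t_bin_ms=25.0, first_peak_ms=0.0):
    """Count pooled population spikes in bins centered on gamma peaks."""
    # Shift the first edge back by half a bin so each bin is centered on a peak.
    start = first_peak_ms - t_bin_ms / 2.0
    edges = start + t_bin_ms * np.arange(n_bins + 1)
    counts, _ = np.histogram(spike_times_ms, bins=edges)
    return counts

def bin_average(signal, samples_per_bin):
    """Average a sampled stimulus over consecutive bins."""
    n_bins = len(signal) // samples_per_bin
    trimmed = np.asarray(signal[: n_bins * samples_per_bin], dtype=float)
    return trimmed.reshape(n_bins, samples_per_bin).mean(axis=1)

def correlation_coefficient(s_binned, nu_binned):
    """Pearson correlation (assumed form of equation 2.5)."""
    return np.corrcoef(s_binned, nu_binned)[0, 1]

# Tiny example: five pooled spike times (ms) and a stimulus sampled every 0.625 ms.
counts = binned_population_count(np.array([0.0, 1.0, 24.0, 26.0, 50.0]), n_bins=3)
s_avg = bin_average(np.arange(120.0), samples_per_bin=40)  # 40 samples = 25 ms
rho = correlation_coefficient(s_avg, counts)
```

With a time step of 0.625 ms, one 25 ms bin holds 40 stimulus samples, which is why `samples_per_bin=40` is used in the example.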

## 3. Results

### 3.1. Selective Population Rate Coding by Gamma Modulation.

We start with a simplified situation, where we consider two upstream populations and the gamma modulation of population 1 and that of population 2 are in antiphase such that θ_{2} = θ_{1} + π. Without loss of generality, we set θ_{1} = 0. Neurons in population 1 tend to fire near the peaks of gamma modulation of population 1. Those in population 2 tend to fire near the peaks of gamma modulation of population 2, which is equivalent to troughs of gamma modulation of population 1. Therefore, the total synaptic input applied to population *d* represented by equation 2.4 presumably consists of alternation of spike packets obtained from population 1 and ones obtained from population 2. An important implicit assumption is that in populations 1 and 2, stimulus-driven spikes rarely occur during the troughs of gamma modulation (this point is discussed in section 4).

Population firing rates ν_{1}(*t*) and ν_{2}(*t*), which are defined with resolution *T _{bin}* = 25 ms, approximate *s*_{1}(*t*) and *s*_{2}(*t*), respectively. We obtain 〈*s*_{1}(*t*), ν_{1}(*t*)〉, 〈*s*_{2}(*t*), ν_{2}(*t*)〉 ≈ 0.88, and 〈*s*_{1}(*t*), ν_{2}(*t*)〉, 〈*s*_{2}(*t*), ν_{1}(*t*)〉 ≈ 0. We examine how gamma modulation of the downstream population *d* modifies ν_{d}(*t*). A rastergram obtained without the gamma modulation of population *d* (*A _{d}* = 0) is shown in Figure 2a. Neurons in population 1 (index 0 through 99) and those in population 2 (index 100 through 199) fire stochastically, but with an apparent alternating pattern. Neurons in population *d* receive spikes from populations 1 and 2 with the same average strength. As shown in Figure 2a, the rastergram of the neurons in population *d* (index 200 through 299) shows a periodic pattern with period *T _{bin}*/2 = 12.5 ms due to rather regular spike packets arriving from the upstream populations. In particular, large instantaneous firing rates of population 1 (population 2), which reflect large *s*_{1}(*t*) (*s*_{2}(*t*)), tend to elicit large instantaneous firing rates of population *d*. Therefore, ν_{d}(*t*) represents both stimuli *s*_{1}(*t*) and *s*_{2}(*t*) in a degraded manner. We obtain 〈*s*_{1}(*t*), ν_{d}(*t*)〉, 〈*s*_{2}(*t*), ν_{d}(*t*)〉 ≈ 0.38.

When gamma oscillations modulate population *d*, the situation noted changes drastically. For illustration, we set θ_{d} = θ_{1} = θ_{2} − π. Then, populations 1 and *d* effectively interact, whereas populations 2 and *d* do not, as suggested by Fries (2005).

A rastergram when the magnitude of gamma modulation is equal to *A _{d}* = 0.25 is shown in Figure 2b. Neurons in population *d* are likely to fire near the peaks of the associated gamma rhythm, which roughly coincide with the arrival time of spike packets from population 1. However, spikes from population 2 arrive at population *d* at around the troughs of the gamma rhythm associated with population *d*. After half a gamma cycle, neurons in population *d* get highly excited because of the gamma modulation and are likely to fire. However, at this moment, the impacts of spikes transmitted from population 2 on the postsynaptic potentials of neurons in population *d* are significantly reduced. In other words, neurons in population 2 do not fire when the synaptic input applied to population *d* is most effective. We obtain 〈*s*_{1}(*t*), ν_{d}(*t*)〉 ≈ 0.54 and 〈*s*_{2}(*t*), ν_{d}(*t*)〉 ≈ 0.28. We call this regime selective population rate coding of *s*_{1}(*t*). There exists a positive correlation between *s*_{2}(*t*) and ν_{d}(*t*), because the assumed membrane time constant τ_{m} = 10 ms allows some of the information about *s*_{2}(*t*) carried by ν_{2}(*t*) to persist in population *d* after half a gamma cycle (= 12.5 ms). When τ_{m} is smaller, which is characteristic of a high-conductance state (Destexhe, Rudolph, & Paré, 2003), 〈*s*_{2}(*t*), ν_{d}(*t*)〉 further decreases, as shown in section A.2.

A rastergram obtained under strong gamma modulation (*A _{d}* = 0.5) is shown in Figure 2c. Due to saturation, each neuron in population *d* fires once per gamma cycle with a probability of approximately unity, regardless of the applied synaptic inputs. Consequently, ν_{d}(*t*) does not represent either *s*_{1}(*t*) or *s*_{2}(*t*), and 〈*s*_{1}(*t*), ν_{d}(*t*)〉, 〈*s*_{2}(*t*), ν_{d}(*t*)〉 ≈ 0. Each neuron in population *d* does not fire more than once in one gamma cycle because of the combination of refractory period and short duration of the peaks of the gamma oscillation (Masuda & Doiron, 2007).

For fixed *s*_{1}(*t*), *s*_{2}(*t*), and *A _{d}*, we carried out the Monte Carlo simulation 100 times to assess the reliability of ν_{d}(*t*). The time course of *s*_{1}(*t*), which is supposedly encoded by ν_{d}(*t*), is shown in Figure 3a. In Figures 3b to 3d, binned spike count ν_{d}(*t*) averaged over 100 trials (thick lines), together with the standard deviation (thin lines, added as an error bar to the average spike count), is shown for the three values of *A _{d}*, which correspond to those shown in Figure 2. ν_{d}(*t*) when *A _{d}* = 0.25 (see Figure 3c) is more correlated to *s*_{1}(*t*) (see Figure 3a) than ν_{d}(*t*) when *A _{d}* = 0 (see Figure 3b) and *A _{d}* = 0.5 (see Figure 3d). Even though the difference between the cases *A _{d}* = 0 (see Figure 3b) and *A _{d}* = 0.25 (see Figure 3c) is not visually remarkable, we quantify it in section 3.2.

### 3.2. Dependence of Selective Population Rate Coding on the Amplitude and Phase of Downstream Gamma Rhythm.

We expect that *s*_{2}(*t*) instead of *s*_{1}(*t*) is represented by ν_{d}(*t*), when θ_{d} is close to θ_{2}, and not to θ_{1}. To systematically clarify whether this is true and quantify the effects of *A _{d}*, we show in Figure 4 the dependence of correlation coefficients defined in equation 2.5 on *A _{d}* and θ_{d}. In this figure, each data point corresponds to the correlation coefficient averaged over 50 trials. As expected, ν_{d}(*t*) is more correlated with *s*_{1}(*t*) than with *s*_{2}(*t*) when θ_{d} is closer to θ_{1} than to θ_{2}, and vice versa when θ_{d} is closer to θ_{2} than to θ_{1}. The value of θ_{d} that realizes the largest 〈*s*_{1}(*t*), ν_{d}(*t*)〉 (〈*s*_{2}(*t*), ν_{d}(*t*)〉) is slightly larger than θ_{1} (θ_{2}). The precession of the optimal downstream gamma rhythm relative to the upstream gamma rhythm does not violate causality, because neurons in upstream populations 1 and 2 tend to fire during the rising phase of the associated gamma modulation.

Around the optimal θ_{d}, the correlation coefficient first increases with *A _{d}* and then decreases for excessively large *A _{d}*. The correlation coefficient is the largest at around *A _{d}* = 0.3. At this value of *A _{d}*, the firing rate of the neurons saturates near *f*_{γ} = 40 Hz, and the spike count variance, which is the main determinant of the reliability of ν_{d}(*t*), decreases significantly; these results are consistent with the experimental results (Murthy & Fetz, 1996, Figure 9; Mitchell, Sundberg, & Reynolds, 2007). Such reliable spiking stems from binomial-like spike count statistics; it appears when a neuron fires only once per gamma cycle with high probability (DeWeese, Wehr, & Zador, 2003; Masuda & Doiron, 2007). This saturation effect is an advantage, though not mandatory, for selective population rate coding; as shown in Figure 4c, 〈*s*_{1}(*t*), ν_{d}(*t*)〉 − 〈*s*_{2}(*t*), ν_{d}(*t*)〉 for a suitable θ_{d} remains positive for small values of *A _{d}*, such as *A _{d}* ≈ 0.1.

### 3.3. Robustness of Selective Population Rate Coding Against Parameter Variation.

We test the robustness of the obtained results against variation in a few key parameters. The robustness results for the phase of gamma modulation and membrane time constant are shown in Figures 11 and 12 in appendix A, respectively. We have also assessed the robustness of the obtained results against variation in the reversal potential *v*_{E,d} and noise intensity σ (not shown).

#### 3.3.1. Uneven Feedforward Input.

An upstream population that encodes a preferred stimulus may project onto the downstream population with a larger number of excitatory synapses than one that does not encode a preferred stimulus (Mishra et al., 2006). We examine the effect of biased projection by letting the average synaptic weight from population 2 to population *d* be 40% smaller than the average synaptic weight from population 1 to population *d*. The average synaptic weight from population 1 to population *d* remains the same.

Correlation coefficients for different values of *A _{d}* and θ_{d} are shown in Figure 5. As expected, the weak stimulus (*s*_{2}(*t*)) is represented by ν_{d}(*t*) under more restrictive conditions than when the projection is not biased (see Figure 4). However, population *d* selects *s*_{2}(*t*) when θ_{d} is around θ_{2} and *A _{d}* is moderately large. Similar results are obtained when the amplitude of *s*_{2}(*t*) is assumed to be smaller than that of *s*_{1}(*t*) and the average synaptic weight from population 1 to population *d* is equal to that from population 2 to population *d* (not shown). These results suggest that a weak stimulus can be selected by subjecting the downstream neurons to an appropriate top-down gamma modulation.

#### 3.3.2. More Than Two Stimuli.

More than two stimuli may compete in the downstream population. In the case of three upstream populations whose gamma oscillations have equal phase distances (θ_{1} = 0, θ_{2} = 2π/3, and θ_{3} = 4π/3), the correlation coefficients between ν_{d}(*t*) and *s _{i}*(*t*) (*i* = 1, 2, 3) are shown in separate panels of Figure 6. Population *d* selects *s _{i}*(*t*) (*i* = 1, 2, 3) when θ_{d} ≈ θ_{i}.

The amount of stochasticity in spiking, membrane time constant, timescale of the gamma rhythm, and other factors collectively determine the number of stimuli that can be distinguished by population *d*. For 2 ⩽ *N _{g}* ⩽ 6 and θ_{i} = 2π(*i* − 1)/*N _{g}* (1 ⩽ *i* ⩽ *N _{g}*), we quantify the ability of population *d* to select a suitable stimulus. In Figure 7, the maximal value of 〈*s*_{1}(*t*), ν_{d}(*t*)〉 in the *A _{d}*-θ_{d} parameter space (circles) and the value of 〈*s*_{2}(*t*), ν_{d}(*t*)〉 for the pair (*A _{d}*, θ_{d}) that gives the maximal 〈*s*_{1}(*t*), ν_{d}(*t*)〉 (squares) are shown together with their standard deviations. If these two values are close to each other, it is difficult for population *d* to distinguish the most represented stimulus, which is *s*_{1}(*t*), from the second most represented stimulus, which is presumably *s*_{2}(*t*). Figure 7 shows that our neural network is capable of selecting the most preferred stimuli up to about *N _{g}* = 4.

#### 3.3.3. Conserved Firing Rates.

Due to the threshold nonlinearity of neuronal dynamics, gamma modulation (Niebur et al., 1993; Tiesinga, Fellous, Salinas, José, & Sejnowski, 2004; Masuda & Doiron, 2007) or synchrony in general (Niebur & Koch, 1994; Salinas & Sejnowski, 2000) increases the firing rate of a neuron in an excitable regime. Average firing rates of a neuron in population *d* for different values of *v*_{E,d} and *A _{d}* are shown in Figure 8, where *v*_{E,d} is selected as a control parameter that modulates the excitability of the neuron. In our model, gamma oscillation additively modulates the input-output relationship, which is consistent with some evidence (Reynolds, Pasternak, & Desimone, 2000); however, this modulation differs from the multiplicative gain modulation observed in experiments (McAdams & Maunsell, 1999; Treue & Martínez-Trujillo, 1999). In other words, as *A _{d}* increases, the dependence of the firing rate on the level of constant current input, which is equivalent to *v*_{E,d}, is shifted upward without a drastic change in the shape of the curve. This result is obtained in general, except when the firing rates are close to 40 Hz. A plateau firing rate, which is equal to the frequency of gamma oscillation *f*_{γ} = 40 Hz, indicates the saturation regime, where a neuron fires once per gamma cycle. For example, a very large bias or noise is required to achieve a firing rate greater than 40 Hz.

Selective attention often enhances gamma-band activities with a conserved firing rate (Fries et al., 2001; Womelsdorf et al., 2006). To show that our results are not obtained due to increased firing rates, we vary *A _{d}* while maintaining the firing rate of population *d* constant at approximately 18 Hz. We achieve this by gradually decreasing *v*_{E,d} with a gradual increase in *A _{d}*. The numerical results shown in Figure 9 support the viability of selective population rate coding when *A _{d}* ⩾ 0.05.

#### 3.3.4. Noncoherent Gamma Oscillations.

We have assumed that gamma oscillations applied to different upstream populations are highly coherent with a fixed phase relationship. A more realistic assumption may be that different upstream populations receive noncoherent gamma oscillations in a frequency band of some width, for example, 40 to 70 Hz (Gray et al., 1989; Engel et al., 2001). In addition, the amplitude of gamma oscillations generally fluctuates. We investigate whether stimulus selection proposed in the preceding sections is effective under noisy and noncoherent gamma oscillations.

For population *i*, the amplitude *A _{i}*(*t*) and the frequency *f _{i}*(*t*) of the gamma oscillation are set to fluctuate in time, driven by two white gaussian processes with standard deviations of 0.04 and 0.02, respectively. Two sample traces shown in Figure 10a indicate that different gamma oscillations are generally noncoherent. Note that the time averages of *A _{i}*(*t*) and *f _{i}*(*t*) are equal to those of coherent gamma oscillations used in the previous sections.

We assume that population *d* is subjected to the same gamma modulation as the one applied to the upstream population 1, except that the baseline amplitude 0.1 is replaced by *A _{d}*.

In the previous sections, we calculated the correlation coefficient between the stimulus and the spike train with a bin width of 1/*f*_{γ} = 25 ms. However, this method of computation is inappropriate for noisy gamma modulation because in this case, peaks of spike counts occur at irregular intervals. Therefore, we calculate the so-called coding fraction, denoted by γ. The coding fraction is large when the best linear stimulus estimator applied to an observed spike train yields a small estimation error. The coding fraction is normalized between 0 and 1, where γ = 1 represents an ideal stimulus estimation and γ = 0 indicates the absence of correlation between *s _{i}*(*t*) and the spike train. Details of the coding fraction are found elsewhere (Gabbiani, Metzner, Wessel, & Koch, 1996; Wessel, Koch, & Gabbiani, 1996; Gabbiani & Koch, 1998). In appendix C, we briefly summarize the definition of γ and show the parameter values used in this study. To calculate γ, we use the Matlab code available at Fabrizio Gabbiani's Web site.
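As a rough illustration of the normalization, the coding fraction is commonly written as γ = 1 − ε/σ_s, where ε is the root-mean-square error of the stimulus estimate and σ_s is the standard deviation of the stimulus. The Python sketch below assumes this form; it is not the Matlab code mentioned above, it does not construct the best linear estimator, and the function name `coding_fraction` is hypothetical.

```python
import numpy as np

def coding_fraction(stimulus, estimate):
    """Coding fraction of a given stimulus estimate (assumed form 1 - rms_error / sd)."""
    stimulus = np.asarray(stimulus, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    rms_error = np.sqrt(np.mean((stimulus - estimate) ** 2))
    gamma = 1.0 - rms_error / np.std(stimulus)
    # An arbitrary estimate can do worse than the stimulus mean; clip at zero
    # so the value stays in [0, 1] as described in the text.
    return max(gamma, 0.0)

rng = np.random.default_rng(0)
s = rng.standard_normal(1000)
perfect = coding_fraction(s, s)                             # exact reconstruction
chance = coding_fraction(s, np.full_like(s, s.mean()))      # estimate is the mean only
```

An exact reconstruction gives the upper end of the range, and an estimate that carries no information beyond the stimulus mean gives the lower end, which matches the interpretation of γ = 1 and γ = 0 given above.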

We assume two upstream populations (*N _{g}* = 2), θ_{1} = θ_{d} = 0, and θ_{2} = π. The coding fraction for *s*_{1}(*t*) and *s*_{2}(*t*) for different values of *A _{d}* is shown in Figure 10b. The error bar shows the standard deviation obtained by conducting 50 trials. For reference, the corresponding results obtained for coherent gamma modulation, described in section 3.2, are shown in Figure 10c. The value of *A _{d}* that provides the best estimation of *s*_{1}(*t*) is smaller in the case of noisy gamma modulation (see Figure 10b) than in the case of coherent gamma modulation (see Figure 10c; also see Figure 4a). However, the results obtained from these two cases are qualitatively similar; the stimulus selection is enhanced for intermediate *A _{d}*. Therefore, our main results hold true under noisy and noncoherent gamma modulation to some extent.

### 3.4. Theory.

In this section, we explain our main finding by analyzing a phenomenological mean field model with stochastic firing rates. We consider a neural network composed of three populations as shown in Figure 1 and assume that θ_{2} = θ_{1} + π and θ_{d} = θ_{1}, corresponding to the case examined in section 3.1. For this simple case, we determine the effect of the amplitude of the downstream gamma rhythm on the fidelity of population rate coding of each competing stimulus.

For simplicity, we assume that time is discrete and that one gamma cycle consists of two time units. Therefore, the instantaneous magnitude of gamma modulation takes either of the two levels, that is, the peak or the trough. This assumption can be justified by the facts that external stimuli are sufficiently weak such that neurons spike only near the peaks of gamma oscillations and that we neglect the phase codes. We denote the peak and trough levels by *A _{i}* and −*A _{i}* (*A _{i}* ⩾ 0, *i* = 1, 2, *d*), respectively. Gamma oscillations applied to populations 1 and *d* are synchronized. When these oscillations attain peak values (*A*_{1} and *A _{d}*), the gamma oscillation applied to population 2 takes the trough value (−*A*_{2}), and vice versa.

The probability that neurons fire in one gamma cycle is not significantly affected by the probability of firing in the preceding gamma cycles, when the membrane time constant of a neuron is not very large. For simplicity, we let firing events within a single gamma cycle be independent of those in the previous gamma cycles. Furthermore, we assume that the stimuli take the constant values *s*_{1} and *s*_{2} in a gamma cycle, because *s*_{1}(*t*) and *s*_{2}(*t*) change slowly with respect to the gamma rhythm.

… *f*(*x*). The average firing rates of population *i* (*i* = 1, 2) in phase 0 relative to gamma oscillation applied to population 1 and phase π are denoted by … and …, respectively. When *A* ≡ *A*_{1} = *A*_{2}, as set in the numerical simulations, we obtain … The total inputs applied to population *d* for phases 0 and π are given by …, respectively. The average firing rate of population *d* per gamma cycle is estimated by … In the case of spiking neuron models, equation 3.12 does not strictly hold owing to the refractory effect of a preceding response that reduces the subsequent response. However, we use this equation as an approximate expression because the bin width of *T*_{bin}/2 = 12.5 ms used in the analysis is not very small compared to the membrane time constant (τ_{m} = 10 ms) plus refractory period (2 ms).
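As a rough illustration of the two-level model described above, the following Python sketch computes the firing rates at phases 0 and π and the resulting spike count of population *d* per gamma cycle. The logistic form of `f`, its parameters `beta` and `x0`, the feedforward weight `w`, and all function names are assumptions made for this example; the text states only that the transfer function is sigmoid, and the exact expressions used in the analysis may differ.

```python
import math


def f(x, beta=10.0, x0=0.5):
    """Assumed logistic transfer function: firing probability in one half cycle."""
    return 1.0 / (1.0 + math.exp(-beta * (x - x0)))


def upstream_rates(s1, s2, A):
    """Return ((nu1 at phase 0, nu1 at phase pi), (nu2 at phase 0, nu2 at phase pi)).

    Assumes A = A_1 = A_2 and that population 2 oscillates in antiphase
    with population 1 (theta_2 = theta_1 + pi).
    """
    nu1 = (f(s1 + A), f(s1 - A))  # population 1 peaks at phase 0
    nu2 = (f(s2 - A), f(s2 + A))  # population 2 peaks at phase pi
    return nu1, nu2


def downstream_rate(s1, s2, A, A_d, w=0.5):
    """Spike count of population d per gamma cycle (sum over the two half cycles).

    Population d is assumed to be synchronized with population 1
    (theta_d = theta_1), so it receives +A_d at phase 0 and -A_d at phase pi.
    """
    nu1, nu2 = upstream_rates(s1, s2, A)
    input_0 = w * (nu1[0] + nu2[0]) + A_d
    input_pi = w * (nu1[1] + nu2[1]) - A_d
    return f(input_0) + f(input_pi)


if __name__ == "__main__":
    for A_d in (0.0, 0.1, 0.2):
        print(A_d, downstream_rate(0.3, 0.3, A=0.3, A_d=A_d))
```

With `A_d = 0` the two stimuli enter symmetrically in this sketch, so swapping `s1` and `s2` leaves the output unchanged; a positive `A_d` breaks that symmetry in favor of the population that shares the downstream phase.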

The correlation coefficient 〈*s*_{i}, ν_{d}〉 (*i* = 1, 2) measured in the numerical simulations is large when ν_{d} is sensitive to *s*_{i} and is reliable over trials for a fixed *s*_{i}. Sensitivity can be quantified by the amount of the change in the average firing rate with *s*_{i}. To assess the reliability of ν_{d}, we neglect stochasticity of the synaptic input arriving at population *d*, because it affects the following results only slightly. Instead, we focus on the following source of stochasticity. A neuron in population *d* fires at most once in half a gamma cycle. Then the firing statistics over a bin of width 12.5 ms are binomial-like, such that the average firing rates in phases 0 and π are accompanied by variances equal to … and …, respectively. Based on the assumption that spike count statistics in different bins are independent of each other, the spike count variance in a gamma cycle is equal to ….

We use the *d*′-discriminability, which is valid for discriminability of two gaussian distributed signals (Green & Swets, 1966; Johnson et al., 2001), adapted to our purpose. The conventional *d*′-discriminability for two gaussian distributions with means μ_{1} and μ_{2} and a common variance σ^{2} is defined by *d*′ = |μ_{1} − μ_{2}|/σ. In our case, the effectiveness of population rate coding of *s*_{i} (*i* = 1, 2) by population *d* is quantified by the discriminability of two spike count distributions for population *d* resulting from two test values of *s*_{i}. For each value of *s*_{i}, we model |μ_{1} − μ_{2}| by …, the infinitesimal distance between the averages of the two spike count distributions for infinitesimally close *s*_{1} and *s*_{2}. The standard deviation is almost the same for the two infinitesimally close distributions, and it is equal to …. We define …
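The discriminability described above can be sketched numerically as the ratio of a sensitivity (the derivative of the mean spike count with respect to one stimulus) to the standard deviation of the spike count. The sketch below is a minimal illustration under stated assumptions, not the expression used in the analysis: the logistic transfer function `f`, the weight `w`, the binomial variance *p*(1 − *p*) per half cycle, and the finite-difference step `eps` are all choices made for this example.

```python
import math


def f(x, beta=10.0, x0=0.5):
    """Assumed logistic transfer function: firing probability in one half cycle."""
    return 1.0 / (1.0 + math.exp(-beta * (x - x0)))


def phase_probabilities(s1, s2, A, A_d, w=0.5):
    """Firing probabilities of a population-d neuron at phases 0 and pi."""
    input_0 = w * (f(s1 + A) + f(s2 - A)) + A_d
    input_pi = w * (f(s1 - A) + f(s2 + A)) - A_d
    return f(input_0), f(input_pi)


def mean_count(s1, s2, A, A_d):
    """Mean spike count per gamma cycle."""
    p0, ppi = phase_probabilities(s1, s2, A, A_d)
    return p0 + ppi


def count_variance(s1, s2, A, A_d):
    """Binomial-like variance; the two half-cycle bins are treated as independent."""
    p0, ppi = phase_probabilities(s1, s2, A, A_d)
    return p0 * (1.0 - p0) + ppi * (1.0 - ppi)


def d_prime(i, s1, s2, A, A_d, eps=1e-6):
    """Sensitivity to stimulus i (central difference) divided by the count s.d."""
    if i == 1:
        hi, lo = mean_count(s1 + eps, s2, A, A_d), mean_count(s1 - eps, s2, A, A_d)
    else:
        hi, lo = mean_count(s1, s2 + eps, A, A_d), mean_count(s1, s2 - eps, A, A_d)
    sensitivity = abs(hi - lo) / (2.0 * eps)
    return sensitivity / math.sqrt(count_variance(s1, s2, A, A_d))


if __name__ == "__main__":
    for A_d in (0.0, 0.05, 0.1, 0.2, 0.4):
        print(A_d, d_prime(1, 0.3, 0.3, 0.3, A_d), d_prime(2, 0.3, 0.3, 0.3, A_d))
```

Sweeping `A_d` in this way gives a quick feel for how the two discriminabilities separate as the downstream modulation grows; the specific amplitudes at which they peak depend on the assumed parameters and should not be read as predictions.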

Figure 11 shows the discriminability *d*′_{1} for *s*_{1} and *d*′_{2} for *s*_{2}, which are plotted as a thick solid line and a thick dashed line, respectively. When *A*_{d} > 0, the difference between *d*′_{1} and *d*′_{2} is always positive. This difference is the largest at an intermediate amplitude of gamma oscillation, which is qualitatively consistent with the numerical results shown in the previous sections. Because our approach is phenomenological, the optimal value of *A*_{d} derived from the theory does not agree quantitatively with that derived from the numerical simulations.

For comparison, the sensitivities … and … are plotted by a thin solid line and a thin dashed line, respectively, in Figure 11. The sensitivity shown by the thin solid line is maximized at a spike count of 0.5, that is, when the input applied to the sigmoid transfer function is equal to …. In contrast, *d*′_{1} (thick solid line) is maximized at a larger firing rate, that is, when the spike count variance decreases drastically near the saturation regime. Accordingly, the peak of this sensitivity is attained with a smaller value of *A*_{d} than is the peak of *d*′_{1}.

### 4. Discussion

#### 4.1. Summary of the Results.

On the basis of the concept that neural populations effectively interact with each other when there exists a compatible phase relationship between their associated gamma oscillations (Fries, 2005; Womelsdorf et al., 2007), we have proposed a feedforward network that enables stimulus selection. We have shown that among competing stimuli, a dynamic stimulus whose associated gamma oscillation is in a good phase relationship with the gamma oscillation applied to the downstream population is selected by a downstream neural population by population rate coding. By adjusting the phase of the downstream gamma rhythm, which we interpret as a top-down control signal generated by selective attention, the time course of the attended dynamic stimulus is recovered in the downstream population. An emphasis of this study is the use of dynamic stimuli; many natural stimuli such as visual stimuli and summed synaptic input are dynamic. Our results suggest that gamma-band synchrony facilitates selective routing of dynamic information about the input to downstream information processing areas.

The stimulus selection occurs even when the attended stimulus is an unpreferred one, as shown in Tiesinga et al. (2004). The selected stimulus does not have to be stronger (Börgers et al., 2005) or more coherent (Börgers & Kopell, 2008) than the other stimuli. The proposed mechanism is robust against parameter variation. Although we have exploited the fact that neurons tend to fire near peaks of gamma oscillations, precise locking of spikes to oscillatory peaks is not mandatory.

We assumed that neurons in the same upstream population, which supposedly encode a common stimulus, are in-phase synchronized and that different upstream populations have different oscillatory phases, as assumed in other models (Mishra et al., 2006). We also showed that our results can be extended to the case in which different upstream populations are subjected to noncoherent and noisy gamma modulations, as observed in experiments (Gray et al., 1989). Evidence in sensory areas such as V1 (Gray et al., 1989) and computational modeling studies (von der Malsburg & Schneider, 1986; Terman & Wang, 1995) support the view that a combination of different oscillations may be bottom-up emergent behavior due to input segmentation. Alternatively, as observed in olfactory neurons (Schaefer, Angelo, Spors, & Margrie, 2006), even a single population may emit oscillatory spike packets with different phases depending on stimulus dynamics. In this study, we did not model the genesis of gamma oscillations.

We have also assumed that the threshold nonlinearity of neurons prohibits the occurrence of stimulus-related spikes during troughs of gamma modulation. Therefore, the information about *s*_{1}(*t*) and that about *s*_{2}(*t*) do not interfere with each other in population *d*, in a manner similar to time division multiple access in communication theory. Although this may be a strong assumption, it was imposed in previous models of selective attention (Mishra et al., 2006) and various phase coding schemes (Hopfield, 1995; Fries, Nikolić, & Singer, 2007).
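The analogy with time division multiple access can be made concrete with a toy example. The sketch below is not the spiking network of this study; it only illustrates the slot-based idea under the assumption that each gamma cycle offers two non-overlapping slots (phase 0 and phase π), that each upstream stream occupies one slot, and that the receiver reads only the slot matching its own phase. The function names are hypothetical.

```python
def interleave(stream_1, stream_2):
    """Sender side: stream 1 uses the phase-0 slot, stream 2 the phase-pi slot."""
    channel = []
    for a, b in zip(stream_1, stream_2):
        channel.extend([a, b])  # one gamma cycle = two consecutive slots
    return channel


def receive(channel, selected_slot):
    """Receiver side: keep only the slots aligned with the receiver's phase.

    selected_slot = 0 corresponds to theta_d = theta_1,
    selected_slot = 1 corresponds to theta_d = theta_2 = theta_1 + pi.
    """
    return channel[selected_slot::2]


s1 = [0.1, 0.4, 0.9, 0.3]  # slowly varying stimulus 1, one value per cycle
s2 = [0.8, 0.2, 0.5, 0.7]  # slowly varying stimulus 2, one value per cycle

channel = interleave(s1, s2)
attended_1 = receive(channel, 0)  # recovers s1
attended_2 = receive(channel, 1)  # recovers s2
```

Changing only the receiver's slot switches which stream is recovered, which mirrors the role assigned to the phase of the downstream gamma rhythm.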

#### 4.2. Possible Roles for Inhibitory Coupling.

In this study, we neglected inhibitory interneurons. There are several possible roles for inhibition in selective attention.

First, inhibitory neurons in the downstream population promote competition among various stimuli by winner-take-all mechanisms (von der Malsburg & Schneider, 1986; Terman & Wang, 1995; Niebur et al., 1993; Niebur & Koch, 1994; Reynolds et al., 1999) or by disinhibition of the attended pools (Buia & Tiesinga, 2008). For example, with recurrent inhibition, firing is suppressed for some time after an attended stimulus induces spikes. Then it is difficult for an unattended stimulus to induce spikes (Börgers et al., 2005; Börgers & Kopell, 2008). Such stimulus separation will be manifested if the synaptic delay is equal to a fraction of a gamma cycle.

With recurrent inhibition, stimulus selection can be carried out without the assumed top-down gamma rhythm, although in a modified form. We assumed that a top-down gamma rhythm is applied to the downstream population because stimulus selection is likely to occur as a result of cognitive top-down control (Engel et al., 2001; Niebur, 2002; Buia & Tiesinga, 2006, 2008; Mishra et al., 2006; Womelsdorf & Fries, 2007). However, such an ideal top-down gamma modulation would not be applied externally in real neural networks. On the other hand, local circuits composed of interneurons are theorized to generate gamma oscillations for selective attention, which facilitate recurrent interactions between input and output populations (Tiesinga et al., 2004; Börgers et al., 2005; Buia & Tiesinga, 2006, 2008). When there are internally generated gamma rhythms, the stimulus with the largest magnitude is represented by the downstream population with a high probability; this result is similar to the previous results (Börgers et al., 2005). When we apply dynamic stimuli in the neural network with recurrent inhibition, the stimulus to be represented by the downstream population varies over time, as shown by a sample rastergram in Figure 14.

Second, synaptic inhibition in conductance-based neuron models enhances selectivity of coherent dynamic stimuli (Börgers & Kopell, 2008). If we replace our current-based neuron model with a conductance-based neuron model, synaptic inhibition decreases the effective membrane time constant, particularly when excitation and inhibition are balanced, as in a high-conductance state (Destexhe et al., 2003). Then downstream neurons will be able to readily identify spike packets resulting from different stimuli due to the fast decay of stimulus information stored in the downstream neurons. This will enhance signal separation, as shown in Figure 13.

Third, recurrent inhibition constrains the firing rate. As observed in the rate-controlled scenario (see Figure 9), constraining the firing rate by recurrent inhibition is unlikely to change our main results.

Recurrent inhibition may cause inhibitory gamma modulation. We assumed sinusoidal gamma oscillations. Alternatively, gamma modulation may be inhibitory for half a cycle and not excitatory for the other half-cycle. Compared to sinusoidal gamma modulation, such inhibitory gamma modulation is advantageous for carrying out stimulus selection under strong gamma modulation. This is because strong inhibitory gamma modulation blocks the out-of-phase stimulus but does not interfere with the representation of the in-phase stimulus.

#### 4.3. Rate Coding and Gain Modulation.

For static stimuli, we showed previously that enhanced discriminability occurs with a rather strong gamma modulation, which shifts the dynamics near the saturation regime (Masuda & Doiron, 2007). Then, neurons fire at a rate slightly less than once per cycle, which results in a small spike count variance (DeWeese et al., 2003; Masuda & Doiron, 2007). We extended these results to the case of dynamic and competing stimuli. Note that strong gamma modulation and the saturation regime are not required for the proposed mechanism to be effective. From our numerical results, stimulus selection is achieved even with a gamma rhythm that is weaker than one that realizes the saturation regime.

For successful population rate coding by the downstream neural population, the dynamic stimuli cannot change much within a gamma cycle. Fast oscillatory drives (*f*_{γ} = 100 Hz) will not yield a high upper cutoff frequency of the stimulus. This is because with fast oscillations, even a neuron with a small membrane time constant (such as τ_{m} = 5 ms) acts as a relatively strong low-pass filter, and the downstream population would not be able to identify the information about different stimuli. In contrast, slow oscillations such as theta rhythm would allow only encoding of very slow stimuli. Furthermore, enhanced coding near the saturation regime (Masuda & Doiron, 2007) cannot be carried out with slow oscillations because a neuron then spikes multiple times per cycle. In conclusion, an oscillation frequency in approximately the gamma band is beneficial to the proposed mechanism.
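The low-pass argument can be illustrated with the standard amplitude response of a first-order leaky integrator, 1/√(1 + (2π*f*τ)²). Treating the membrane as such a passive filter is an assumption made for this sketch (the spiking model in the study also has a threshold and a refractory period), and the function name is hypothetical.

```python
import math


def lowpass_gain(freq_hz, tau_ms):
    """Amplitude gain of a first-order low-pass filter with time constant tau_ms."""
    omega_tau = 2.0 * math.pi * freq_hz * tau_ms * 1e-3  # dimensionless
    return 1.0 / math.sqrt(1.0 + omega_tau ** 2)


if __name__ == "__main__":
    for freq_hz in (40.0, 100.0):
        for tau_ms in (5.0, 10.0):
            print(freq_hz, tau_ms, round(lowpass_gain(freq_hz, tau_ms), 3))
```

Under this assumption, a 100 Hz drive is attenuated more strongly than a 40 Hz drive for the same time constant, so alternating spike packets from different upstream populations are smeared together more at the higher frequency.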

We have considered a rate code whose temporal resolution is about one gamma cycle and ignored the timing or phase codes, which may be relevant in a fine timescale. For example, a large input bias elicits a reliable spike in an earlier phase of an oscillation (Hopfield, 1995, 2004; Fries et al., 2007). In our neural network, this phase code can coexist with the population rate code. In fact, when the attended dynamic stimulus is large, neurons tend to fire in an early phase (not shown). Therefore, the information about the intensity of external stimulus may be coded onto the phase as well as the spike count.

In the case of static inputs, we can also interpret our results in terms of gain control. Experimental results show that selective attention increases the gain of the selected stimulus and decreases the gain of the unselected stimulus (McAdams & Maunsell, 1999; Reynolds et al., 1999; Treue & Martínez-Trujillo, 1999; Reynolds, Pasternak, & Desimone, 2000), and this gain modulation may be mediated by gamma modulation. The correlation coefficient that we measured represents the sensitivity of the population firing rate to stimulus changes, which can be interpreted as the gain. The correlation coefficient increases (decreases) for the stimulus whose gamma modulation is in a good (bad) phase relationship with the downstream gamma rhythm; this result is consistent with the experimental results.

#### 4.4. Relation to Previous Literature.

Previous theoretical studies were conducted on interaction between different gamma rhythms (Mishra et al., 2006; Börgers & Kopell, 2008). The results obtained by Mishra et al. suggest that if top-down gamma modulation is applied with a suitable phase to the neurons encoding the preferred (unpreferred) static stimulus, the firing rate resembles the one when only the preferred (unpreferred) stimulus is present. In the case of static stimuli, our results are consistent with those obtained by Mishra et al. Börgers and Kopell (2008) propose that oscillatory inputs with different phases compete in a single population such that a more coherent input wins via a bottom-up competition mechanism. In this study, we did not assume that either input is more coherent or stronger than the other, and we examined the role for top-down gamma modulation in stimulus competition.

Our network can separate up to about four stimuli (see Figure 7). This limit is determined by various factors such as oscillation frequency, membrane time constant of a single neuron, noise intensity, and degree of heterogeneity. With regard to the number of gamma oscillations accommodated by a neural network, neurons encoding different features are considered to belong to different groups, each of which collectively oscillates with distinct phases (Gray et al., 1989; Engel et al., 2001). In theory, up to several clusters of neurons oscillating with different phases can coexist (von der Malsburg & Schneider, 1986; Terman & Wang, 1995); this result is similar to our results. The number of stimuli that our network can separate might indicate the number of items that can be held in working memory (Cowan, 2000; Miller, 1956).

## Appendix A: Robustness Tests

To assess the robustness of the main results, we carry out additional numerical simulations.

### A.1. General Relative Phases of Gamma Oscillations.

When the antiphase relationship between the two upstream gamma oscillations is expressed as θ_{2} = θ_{1} + π, the spike packets from population 1 and those from population 2 are most distinguished in population *d*. Therefore, the information about *s*_{1}(*t*) and that about *s*_{2}(*t*) arrive at population *d* with the least overlap, so that selective rate coding is relatively easy. Here we relax this assumption.

When θ_{2} = θ_{1} + 2π/3, the effectiveness of rate coding by population *d* is shown in Figure 12. When θ_{d} ≈ θ_{1}, 〈*s*_{1}(*t*), ν_{d}(*t*)〉 is the largest (see Figure 12a), and when θ_{d} ≈ θ_{2} = 2π/3, 〈*s*_{2}(*t*), ν_{d}(*t*)〉 is the largest (see Figure 12b). It is more difficult to select *s*_{1}(*t*) or *s*_{2}(*t*) when θ_{2} = θ_{1} + 2π/3 than it is when θ_{2} = θ_{1} + π, because spike packets transmitted from populations 1 and 2 partially overlap in population *d*. However, stimulus selection is still viable; ν_{d}(*t*) represents *s*_{1}(*t*) (*s*_{2}(*t*)) more accurately than *s*_{2}(*t*) (*s*_{1}(*t*)) when θ_{d} ≈ θ_{1} (θ_{d} ≈ θ_{2}).
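A crude geometric picture of why a 2π/3 offset is harder than an antiphase offset: suppose, purely as an assumption for this sketch, that each upstream population can fire only during the half cycle centered on the peak of its own gamma rhythm. The fraction of that window shared by two populations then depends linearly on their phase difference. The function name is hypothetical, and the real spike packets are narrower and noisier than this idealized window.

```python
import math


def window_overlap_fraction(delta):
    """Fraction of a half-cycle firing window shared by two rhythms offset by delta."""
    # Fold the phase difference into [0, pi].
    d = abs(delta) % (2.0 * math.pi)
    if d > math.pi:
        d = 2.0 * math.pi - d
    # Two windows of width pi whose centers are d apart overlap over a width pi - d.
    return 1.0 - d / math.pi


if __name__ == "__main__":
    print(window_overlap_fraction(math.pi))            # antiphase case
    print(window_overlap_fraction(2.0 * math.pi / 3))  # case examined in Figure 12
```

In this idealization the antiphase windows do not overlap at all, whereas windows offset by 2π/3 share one-third of their width, which is consistent with the partial overlap of spike packets described above.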

### A.2. Small Membrane Time Constant.

The membrane time constant τ_{m} controls the duration for which the information about stimuli is stored in the neurons. With τ_{m} = 10 ms, which we assumed in the main text, 〈*s*_{2}(*t*), ν_{d}(*t*)〉 is rather large (e.g., 〈*s*_{2}(*t*), ν_{d}(*t*)〉 = 0.3) even when population *d* presumably selects *s*_{1}(*t*) (e.g., when θ_{d} = θ_{1}). A good stimulus separation is expected with a small τ_{m}.

To examine this point, we set τ_{m} of the neurons in population *d* to τ_{m} = 5 ms by setting *g* = 200 mS/cm^{2}, which is twice the original value, in equation 2.2. Figure 13 shows that stimulus separation under gamma modulation is somewhat better when τ_{m} = 5 ms than when τ_{m} = 10 ms, as indicated by a sharper contrast in Figure 13c than in Figure 4c. When θ_{d} ≈ θ_{1}, a decrease in 〈*s*_{2}(*t*), ν_{d}(*t*)〉 with an increase in *A*_{d} is greater when τ_{m} = 5 ms (from 0.47 when *A*_{d} = 0 to 0.28 when *A*_{d} ≈ 0.15) than when τ_{m} = 10 ms (from 0.38 when *A*_{d} = 0 to 0.28 when *A*_{d} ≈ 0.25), although 〈*s*_{2}(*t*), ν_{d}(*t*)〉 does not decrease to zero even in the regime of selective population rate coding.

A sufficiently large τ_{m}, such as τ_{m} = 20 ms, does not accommodate selective population rate coding (not shown). Then, population *d* cannot dissociate information from different upstream populations.
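One way to see the trend across τ_{m} = 5, 10, and 20 ms is to ask how much of a subthreshold trace left by one spike packet survives until the next packet arrives half a gamma cycle (12.5 ms) later. The sketch below assumes purely passive exponential decay with time constant τ_{m}, which ignores spiking, reset, and refractoriness; the function name is hypothetical.

```python
import math


def residual_fraction(tau_m_ms, interval_ms=12.5):
    """Fraction of a passive membrane deflection remaining after interval_ms."""
    return math.exp(-interval_ms / tau_m_ms)


if __name__ == "__main__":
    for tau_m_ms in (5.0, 10.0, 20.0):
        print(tau_m_ms, round(residual_fraction(tau_m_ms), 3))
```

Under this assumption the carry-over from one half cycle to the next grows steadily with τ_{m}, in line with the observation that a small τ_{m} sharpens stimulus separation and a large τ_{m} prevents it.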

## Appendix B: Rastergram with Recurrent Inhibition

To illustrate the role of inhibition, we generate downstream gamma rhythm by recurrent inhibition instead of applying a top-down external input. Accordingly, we set *A*_{3} = 0. An arbitrary pair of neurons in population *d* is connected by an inhibitory synapse with probability 0.5 independently for different pairs. Each synaptic weight is chosen independently from the uniform density on [0, 1.2/*n*]. The synaptic delay is equal to 8 ms. The time course of inhibitory synapses is expressed using the delta function. To avoid an excessive decrease in firing rates due to recurrent inhibition, we increase the average weight of the synapse from an upstream neuron to a neuron in population *d* to twice that used in the other numerical simulations.
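The connectivity rule described above can be sketched as follows. Several details are assumptions made for this example rather than statements from the text: connections are treated as directed (each ordered pair is drawn independently), self-connections are excluded, *n* is taken to be the number of neurons in population *d*, and the function and constant names are hypothetical.

```python
import random

SYNAPTIC_DELAY_MS = 8.0  # delay applied to every inhibitory synapse


def build_inhibitory_weights(n, p_connect=0.5, w_max_scaled=1.2, seed=0):
    """Return an n-by-n matrix of inhibitory weight magnitudes.

    weights[post][pre] is 0.0 when there is no synapse; otherwise it is
    drawn from the uniform density on [0, w_max_scaled / n].
    """
    rng = random.Random(seed)
    w_max = w_max_scaled / n
    weights = [[0.0] * n for _ in range(n)]
    for post in range(n):
        for pre in range(n):
            if pre != post and rng.random() < p_connect:
                weights[post][pre] = rng.uniform(0.0, w_max)
    return weights


if __name__ == "__main__":
    w = build_inhibitory_weights(100)
    n_synapses = sum(v > 0.0 for row in w for v in row)
    print(n_synapses, "inhibitory synapses among", 100 * 99, "ordered pairs")
```

Scaling the upper bound by 1/*n* keeps the expected total inhibition received by a neuron independent of the population size.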

Figure 14 shows a sample rastergram for *N*_{g} = 2. The stimulus that is encoded by population *d* changes dynamically, and the stimulus with the larger instantaneous magnitude tends to be selected.

## Appendix C: Calculations of the Coding Fraction

Here, we briefly explain the method and define the parameter values for calculating the coding fraction, introduced in section 3.3.4. See Gabbiani et al. (1996), Wessel et al. (1996), and Gabbiani and Koch (1998) for details of the methods.

The duration of each simulation is 5000 gamma cycles, that is, 125 s. We merge the spike trains obtained from *N* = 100 neurons in population *d*. The superposed spike train is a time series of the spike count with a bin width of Δ*t* = 0.02 ms, which is the discretized unit time of the Monte Carlo simulations. In total, there are 125 × 10^{3}/Δ*t* = 6.25 × 10^{6} bins. The superposed spike train whose mean is adjusted to zero is denoted by *x*(*t*). Given the stimulus *s*_{i}(*t*), the best linear estimator is given by (*h* ∗ *x*)(*t*), where ∗ denotes convolution, and *h*(*t*) is the linear filter that minimizes the mean square error of the stimulus estimation, denoted by ϵ^{2}. The best linear filter is given in the frequency domain by *h*(*f*) = *S*_{sx}(−*f*)/*S*_{xx}(*f*), where *S*_{sx}(*f*) is the Fourier transform of the cross-correlation between *x*(*t*) and *s*_{i}(*t*), *S*_{xx} is the power spectrum of *x*(*t*), and *f* is the frequency.

To reliably compute the Fourier transform, we adopt the following procedure. We consider a sequence of consecutive *N*_{f} = 32768 bins in *x*(*t*), preprocess this sequence using a Bartlett window to suppress the boundary effect, perform the Fourier transform, square the magnitude of the result, and normalize it to obtain an empirical periodogram. We calculate periodograms for different sequences of *N*_{f} bins, so that two adjacent sequences overlap by *N*_{f}/2 bins. Consequently, we obtain approximately 6.25 × 10^{6}/(*N*_{f}/2) ≈ 380 periodograms. By averaging them, we obtain the estimate of *S*_{xx}(*f*). Similarly, we calculate *S*_{sx}(*f*).
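The averaging procedure can be sketched in a few lines of dependency-free Python. This is a small-scale illustration, not the code used in the study: it uses a naive discrete Fourier transform on short segments, normalizes by the window energy, and adopts the convention that the cross-spectrum is the average of *S*(*f*) times the complex conjugate of *X*(*f*), so that multiplying *X*(*f*) by the filter gives the stimulus estimate. The sign convention of the frequency argument in the text may differ, and all function names are hypothetical.

```python
import cmath
import math
import random


def bartlett(n):
    """Triangular (Bartlett) window of length n, zero at both ends."""
    return [1.0 - abs(2.0 * k / (n - 1) - 1.0) for k in range(n)]


def dft(values):
    """Naive O(n^2) discrete Fourier transform (adequate for short segments)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * f * k / n) for k, v in enumerate(values))
            for f in range(n)]


def n_segments(n_bins, n_f):
    """Number of length-n_f segments with 50% overlap that fit in n_bins."""
    return (n_bins - n_f) // (n_f // 2) + 1


def averaged_spectra(s, x, n_f):
    """Average windowed periodograms over half-overlapping segments."""
    window = bartlett(n_f)
    norm = sum(w * w for w in window)  # window energy used for normalization
    count = n_segments(len(x), n_f)
    s_xx = [0.0] * n_f
    s_sx = [0j] * n_f
    for m in range(count):
        start = m * (n_f // 2)
        xf = dft([w * v for w, v in zip(window, x[start:start + n_f])])
        sf = dft([w * v for w, v in zip(window, s[start:start + n_f])])
        for f in range(n_f):
            s_xx[f] += abs(xf[f]) ** 2 / (norm * count)
            s_sx[f] += sf[f] * xf[f].conjugate() / (norm * count)
    return s_sx, s_xx


def linear_filter(s_sx, s_xx, floor=1e-12):
    """Frequency-domain linear filter; floor guards against division by zero."""
    return [sx / max(xx, floor) for sx, xx in zip(s_sx, s_xx)]


if __name__ == "__main__":
    rng = random.Random(0)
    stimulus = [rng.gauss(0.0, 1.0) for _ in range(64)]
    response = [2.0 * v for v in stimulus]  # noiseless toy response
    h = linear_filter(*averaged_spectra(stimulus, response, 16))
    print(n_segments(6_250_000, 32768), abs(h[3]))
```

For the values quoted in the text, the segment count formula gives 380 half-overlapping segments, in agreement with the approximate figure stated above. In the noiseless toy case where the response is twice the stimulus, the estimated filter is close to 0.5 at every frequency.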

## Acknowledgments

We thank Ernst Niebur, Brent Doiron, and Hiroyuki Nakahara for critically reading the manuscript and Hideaki Shimazaki for his valuable discussions. This study is supported by the Grant-in-Aid for Scientific Research on Priority Areas: Integrative Brain Research (No. 20019012) from MEXT, Japan.

## References
