Selective attention is often accompanied by gamma oscillations in local field potentials and spike field coherence in brain areas related to visual, motor, and cognitive information processing. Gamma oscillations have been implicated in, for example, visual tasks including object search, shape perception, and speed detection. However, the mechanism by which gamma oscillations enhance cognitive and behavioral performance of attentive subjects is still elusive. Using feedforward fan-in networks composed of spiking neurons, we examine a possible role for gamma oscillations in selective attention and population rate coding of external stimuli. We implement the concept proposed by Fries (2005) that under dynamic stimuli, neural populations effectively communicate with each other only when there is a good phase relationship among associated gamma oscillations. We show that the downstream neural population selects a specific dynamic stimulus received by an upstream population and represents it by population rate coding. The encoded stimulus is the one for which gamma rhythm in the corresponding upstream population is resonant with the downstream gamma rhythm. The proposed role for gamma oscillations in stimulus selection is to enable top-down control, a neural version of time division multiple access used in communication engineering.
The results of previous experimental studies suggest that the gamma band power of local field potentials and spike field coherence with gamma oscillations increase in area V4 and other areas upon attention. Therefore, the functions of gamma oscillations and synchronous firing in selective attention and working memory have been extensively examined (Gray, König, Engel, & Singer, 1989; Tallon-Baudry & Bertrand, 1999; Fries, Reynolds, Rorie, & Desimone, 2001; Engel, Fries, & Singer, 2001; Niebur, 2002; Bichot, Rossi, & Desimone, 2005; Jensen, Kaiser, & Lachaux, 2007; Womelsdorf & Fries, 2007; Lakatos, Karmos, Mehta, Ulbert, & Schroeder, 2008). Increased gamma oscillations in V4 are known to enhance shape perception of visual stimuli (Taylor, Mandon, Freiwald, & Kreiter, 2005) and rapidity of detecting changes in a stimulus (Womelsdorf, Fries, Mitra, & Desimone, 2006). Synchronous firing, both oscillatory (Lakatos et al., 2008) and nonoscillatory (Steinmetz et al., 2000), is also observed when subjects perform multimodal tasks that require attention.
Mechanisms for generating these gamma rhythms have been thoroughly explored. It was found that delayed recurrent connections among interneurons or excitatory-inhibitory connections are the main mechanism for the generation of gamma oscillations (Börgers, Epstein, & Kopell, 2005). How attention-driven gamma oscillations enhance information processing such as stimulus selection, gain control, and signal discriminability has not been sufficiently addressed, however. An earlier seminal work showed that gamma modulation of neurons subjected to an attended stimulus enables the selection of this stimulus among competing stimuli (Niebur, Koch, & Rosin, 1993). This result and those from more recent network models comprising spiking neurons aim to explain the existence of links between gamma rhythms and neural computation (Börgers et al., 2005; Börgers & Kopell, 2008; Buia & Tiesinga, 2006, 2008; Mishra, Fellous, & Sejnowski, 2006; Masuda & Doiron, 2007). Under bottom-up gamma oscillations, a strong (Börgers et al., 2005) or coherent (Börgers & Kopell, 2008) external stimulus is readily selected by neural populations receiving multiple stimuli. The attention signal that generates gamma oscillations can also be applied in a top-down manner. When the top-down signal is applied as a bias input to interneurons (Buia & Tiesinga, 2006, 2008), gamma oscillations (Niebur et al., 1993), synchronous spike volleys (Niebur & Koch, 1994), or the balance between excitation and inhibition along a feedforward pathway (Mishra et al., 2006) biases competition so that attended input is selected.
In most of the previous theoretical studies, gamma oscillations were assumed to be generated in one neural population or applied externally, and downstream neural populations were entrained into a single gamma rhythm. However, the mechanism of interaction of different gamma rhythms has not been sufficiently investigated (important exceptions are the studies conducted by Mishra et al., 2006, and Börgers & Kopell, 2008; we will compare their results to the results in section 4). In fact, there may exist multiple interacting neural populations that show gamma rhythms with different phases. It was proposed (Fries, 2005) and experimentally validated (Womelsdorf et al., 2007) that two cell assemblies subjected to gamma oscillations effectively communicate with each other only when there is a good phase relationship between the two gamma oscillations. To explain the concept of good phase relationship, we consider two populations that are unidirectionally coupled by excitatory synapses. Suppose that the relative timing of the two gamma oscillations is adjusted such that the peak time of the gamma rhythm applied to the presynaptic population shifted by a synaptic delay is close to the peak time of the gamma rhythm applied to the postsynaptic population. With this good phase relationship, spikes tend to excite the neurons that are relatively easy to fire. Otherwise excitatory synapses are not very effective in inducing postsynaptic firing.
In this study, we explore the possibility of a computational role for gamma oscillations in enabling the type of communication between neural populations as explained above. The emphasis of this study is on population rate coding of dynamic stimuli; examples of such dynamic stimuli include integration of many synaptic inputs and natural stimuli, such as visual scenes. The objective of this study is to investigate top-down gamma rhythms, which are considered to be important in selective attention (Tallon-Baudry & Bertrand, 1999; Buia & Tiesinga, 2006, 2008; Mishra et al., 2006), and not bottom-up gamma rhythms, which may be relevant in feature binding (Gray et al., 1989; Engel, Fries, & Singer, 2001). We consider multiple cell assemblies, each of which receives different external input and projects onto a single downstream neural population in a feedforward manner. We show that top-down gamma modulation enables the selection of one of the competing stimuli. The downstream population selects that stimulus whose associated gamma modulation is in a good phase relationship with the gamma modulation applied to the downstream neural population. As a special case, our neural network performs a type of gain modulation for static inputs.
2.1. Architecture of Neural Network.
In this study, we use a two-layer neural network. For expository purposes, consider the case of three neural populations, as shown in Figure 1. Each population consists of N = 100 neurons. Neurons in a population are uncoupled, except in the supplementary numerical simulations shown in appendix B. One neural population, which we term population d, is downstream of the other two populations, which we term population i (i = 1, 2).
2.2. External and Noise Inputs.
Every neuron in each population, including population d, receives an oscillatory input of gamma band, which models the local field potential, or a bundle of synaptic inputs whose strength varies over time. We regard gamma modulation as a top-down modulatory signal and do not determine its source. The oscillation frequency is set to fγ = 40 Hz. The amplitude of gamma modulation applied to population i (i = 1, 2, d) is denoted by Ai. We set A1 = A2 = 0.1 and vary Ad. We assume that gamma oscillatory inputs applied to different populations are not necessarily in-phase synchronized (Gray et al., 1989; Fries, 2005; Womelsdorf et al., 2007). Therefore, the spikes obtained from population 1 and those obtained from population 2, which presumably represent s1(t) and s2(t), respectively, are generally correlated because of a fixed phase difference between the two gamma oscillations. Note that s1(t) and s2(t) are nevertheless independent of each other.
In addition to the dynamic stimulus and the gamma modulation, neurons are subjected to dynamical noise. Because noises applied to different neurons are partially correlated, we model noise in terms of a weighted sum of two factors. One noise source is the white noise process applied commonly to all the neurons in a population, which we denote by ξi,c(t) (i = 1, 2, d). The other noise source is independent (uncommon) for different neurons, which we denote by ξi,j(t) for the jth neuron in population i. We set the fraction and total strength of the common noise to c = 0.12 (Zohary, Shadlen, & Newsome, 1994; Masuda & Doiron, 2007) and σ = 0.1, respectively. This results in random-looking correlated spike statistics in the absence of gamma modulation.
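The input components described above can be sketched in a few lines of Python. This is a minimal illustration, not the model equations of this study: the cosine form of the gamma modulation, the square-root mixing rule for common and independent noise, and the function names `gamma_input` and `mixed_noise` are all assumptions introduced here. Only fγ = 40 Hz, c = 0.12, and σ = 0.1 are taken from the text.

```python
import numpy as np

def gamma_input(t, amplitude, phase, f_gamma=40.0):
    """Sinusoidal gamma modulation (assumed form): A * cos(2*pi*f*t - phase)."""
    return amplitude * np.cos(2.0 * np.pi * f_gamma * t - phase)

def mixed_noise(n_neurons, n_steps, c=0.12, sigma=0.1, rng=None):
    """Gaussian noise samples with a shared fraction c and total strength sigma.

    Assumed mixing rule: sqrt(c) * common + sqrt(1 - c) * independent, so that
    the pairwise correlation between two neurons of a population equals c.
    Scaling with the integration time step is left to the integrator.
    """
    rng = np.random.default_rng() if rng is None else rng
    common = rng.standard_normal(n_steps)                 # shared by all neurons
    private = rng.standard_normal((n_neurons, n_steps))   # one row per neuron
    return sigma * (np.sqrt(c) * common + np.sqrt(1.0 - c) * private)
```

With this mixing rule, each neuron receives noise of standard deviation σ, and any two neurons in the same population share a fraction c of the noise variance.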
2.3. Dynamics of Neurons and Synapses.
2.4. Measurement of Neural Activity.
We examine the possibility of population rate coding in population d on a timescale of gamma rhythm. To quantify the effectiveness of population rate coding, we use the cross-correlation between the external input and dynamic population spike count. The bin width for defining the population firing rate is defined as Tbin ≡ 1/fγ = 25 ms, and the bin is aligned such that the center of the bin corresponds to the peak of gamma modulation. The underlying logic of aligning the center of the bin to such a position is that when neurons are gamma modulated, they tend to fire around peaks of gamma oscillations. The binned population firing rate of population i′ (i′ = 1, 2, d) is denoted by νi′(t).
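The binning and correlation measure can be illustrated as follows. This is a hedged sketch: the convention that gamma peaks occur at t = (θ/2π + k)Tbin, the sampling of the stimulus at bin centers, and the helper names `binned_population_count` and `stimulus_count_correlation` are assumptions made for illustration and are not taken from the equations of this study.

```python
import numpy as np

F_GAMMA = 40.0           # Hz, from the text
T_BIN = 1.0 / F_GAMMA    # 25 ms bin width

def binned_population_count(spike_times, theta, t_end):
    """Count the spikes of a whole population in bins centred on gamma peaks.

    Assumes gamma peaks occur at t = (theta / (2*pi) + k) * T_BIN, k integer.
    spike_times holds the pooled spike times (in seconds) of all neurons.
    """
    first_peak = (theta / (2.0 * np.pi)) % 1.0 * T_BIN
    first_edge = first_peak - 0.5 * T_BIN          # may be slightly negative
    n_bins = int(np.floor((t_end - first_edge) / T_BIN))
    edges = first_edge + T_BIN * np.arange(n_bins + 1)
    counts, _ = np.histogram(spike_times, bins=edges)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, counts

def stimulus_count_correlation(stimulus_fn, centres, counts):
    """Pearson correlation between the stimulus at bin centres and the counts."""
    s = stimulus_fn(centres)
    return np.corrcoef(s, counts)[0, 1]
```

Passing θd as `theta` aligns the bins with the downstream gamma rhythm, so that each bin collects the spikes emitted around one gamma peak.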
3.1. Selective Population Rate Coding by Gamma Modulation.
We start with a simplified situation, where we consider two upstream populations and the gamma modulation of population 1 and that of population 2 are in antiphase such that θ2 = θ1 + π. Without loss of generality, we set θ1 = 0. Neurons in population 1 tend to fire near the peaks of gamma modulation of population 1. Those in population 2 tend to fire near the peaks of gamma modulation of population 2, which is equivalent to troughs of gamma modulation of population 1. Therefore, the total synaptic input applied to population d represented by equation 2.4 presumably consists of alternation of spike packets obtained from population 1 and ones obtained from population 2. An important implicit assumption is that in populations 1 and 2, stimulus-driven spikes rarely occur during the troughs of gamma modulation (this point is discussed in section 4).
Population firing rates ν1(t) and ν2(t), which are defined with resolution Tbin = 25 ms, approximate s1(t) and s2(t), respectively. We obtain 〈s1(t), ν1(t)〉, 〈s2(t), ν2(t)〉 ≈0.88, and 〈s1(t), ν2(t)〉, 〈s2(t), ν1(t)〉≈0. We examine how gamma modulation of the downstream population d modifies νd(t). A rastergram obtained without the gamma modulation of population d (Ad = 0) is shown in Figure 2a. Neurons in population 1 (index 0 through 99) and those in population 2 (index 100 through 199) fire stochastically, but with an apparent alternating pattern. Neurons in population d receive spikes from populations 1 and 2 with the same average strength. As shown in Figure 2a, the rastergram of the neurons in population d (index 200 through 299) shows a periodic pattern with period Tbin/2 = 12.5 ms due to rather regular spike packets arriving from the upstream populations. In particular, large instantaneous firing rates of population 1 (population 2), which reflect large s1(t) (s2(t)), tend to elicit large instantaneous firing rates of population d. Therefore, νd(t) represents both stimuli s1(t) and s2(t) in a degraded manner. We obtain 〈s1(t), νd(t)〉, 〈s2(t), νd(t)〉≈0.38.
When gamma oscillations modulate population d, the situation changes drastically. For illustration, we set θd = θ1 = θ2 − π. Then, populations 1 and d effectively interact, whereas populations 2 and d do not, as suggested by Fries (2005).
A rastergram when the magnitude of gamma modulation is equal to Ad = 0.25 is shown in Figure 2b. Neurons in population d are likely to fire near the peaks of the associated gamma rhythm, which roughly coincides with the arrival time of spike packets from population 1. However, spikes from population 2 arrive at population d at around the troughs of the gamma rhythm associated with population d. After half a gamma cycle, neurons in population d get highly excited because of the gamma modulation and are likely to fire. However, at this moment, the impacts of spikes transmitted from population 2 on the postsynaptic potentials of neurons in population d are significantly reduced. In other words, neurons in population 2 do not fire when the synaptic input applied to population d is most effective. We obtain 〈s1(t), νd(t)〉 ≈ 0.54 and 〈s2(t), νd(t)〉 ≈ 0.28. We call this regime selective population rate coding of s1(t). There exists a positive correlation between s2(t) and νd(t), because the assumed membrane time constant τm = 10 ms allows some of the information about s2(t) carried by ν2(t) to persist in population d after half a gamma cycle (=12.5 ms). When τm is smaller, which is characteristic of a high-conductance state (Destexhe, Rudolph, & Paré, 2003), 〈s2(t), νd(t)〉 further decreases, as shown in section A.2.
A rastergram obtained under strong gamma modulation (Ad = 0.5) is shown in Figure 2c. Due to saturation, each neuron in population d fires once per gamma cycle with a probability of approximately unity, regardless of the applied synaptic inputs. Consequently νd(t) does not represent either s1(t) or s2(t), and 〈s1(t), νd(t)〉, 〈s2(t), νd(t)〉 ≈0. Each neuron in population d does not fire more than once in one gamma cycle because of the combination of refractory period and short duration of the peaks of the gamma oscillation (Masuda & Doiron, 2007).
For fixed s1(t), s2(t), and Ad, we carried out the Monte Carlo simulation 100 times to assess the reliability of νd(t). The time course of s1(t), which is supposedly encoded by νd(t), is shown in Figure 3a. In Figures 3b to 3d, binned spike count νd(t) averaged over 100 trials (thick lines), together with the standard deviation (thin lines, added as an error bar to the average spike count), is shown for the three values of Ad, which correspond to those shown in Figure 2. νd(t) when Ad = 0.25 (see Figure 3c) is more correlated to s1(t) (see Figure 3a) than νd(t) when Ad = 0 (see Figure 3b) and Ad = 0.5 (see Figure 3d). Even though the difference between the cases Ad = 0 (see Figure 3b) and Ad = 0.25 (see Figure 3c) is not visually remarkable, we quantify it in section 3.2.
3.2. Dependence of Selective Population Rate Coding on the Amplitude and Phase of Downstream Gamma Rhythm.
We expect that s2(t) instead of s1(t) is represented by νd(t), when θd is close to θ2, and not to θ1. To systematically clarify whether this is true and quantify the effects of Ad, we show in Figure 4 the dependence of correlation coefficients defined in equation 2.5 on Ad and θd. In this figure, each data point corresponds to the correlation coefficient averaged over 50 trials. As expected, νd(t) is more correlated with s1(t) than with s2(t) when θd is closer to θ1 than to θ2, and vice versa when θd is closer to θ2 than to θ1. The value of θd that realizes the largest 〈s1(t), νd(t)〉 (〈s2(t), νd(t)〉) is slightly larger than θ1 (θ2). The precession of the optimal downstream gamma rhythm relative to the upstream gamma rhythm does not violate causality, because neurons in upstream populations 1 and 2 tend to fire during the rising phase of the associated gamma modulation.
Around the optimal θd, the correlation coefficient first increases with Ad and then decreases for excessively large Ad. The correlation coefficient is the largest at around Ad = 0.3. At this value of Ad, the firing rate of the neurons saturates near fγ = 40 Hz, and the spike count variance, which is the main determinant of the reliability of νd(t), decreases significantly; these results are consistent with the experimental results (Murthy & Fetz, 1996, Figure 9; Mitchell, Sundberg, & Reynolds, 2007). Such reliable spiking stems from binomial-like spike count statistics; it appears when a neuron fires only once per gamma cycle with high probability (DeWeese, Wehr, & Zador, 2003; Masuda & Doiron, 2007). This saturation effect is an advantage, though not mandatory, for selective population rate coding; as shown in Figure 4c, 〈s1(t), νd(t)〉 − 〈s2(t), νd(t)〉 for a suitable θd remains positive for small values of Ad, such as Ad ≈ 0.1.
3.3. Robustness of Selective Population Rate Coding Against Parameter Variation.
We test the robustness of the obtained results against variation in a few key parameters. The robustness results for the phase of gamma modulation and membrane time constant are shown in Figures 12 and 13 in appendix A, respectively. We have also assessed the robustness of the obtained results against variation in the reversal potential vE,d and noise intensity σ (not shown).
3.3.1. Uneven Feedforward Input.
An upstream population that encodes a preferred stimulus may project onto the downstream population with a larger number of excitatory synapses than one that does not encode a preferred stimulus (Mishra et al., 2006). We examine the effect of biased projection by letting the average synaptic weight from population 2 to population d be 40% smaller than the average synaptic weight from population 1 to population d. The average synaptic weight from population 1 to population d remains the same.
Correlation coefficients for different values of Ad and θd are shown in Figure 5. As expected, the weak stimulus (s2(t)) is represented by νd(t) under more restrictive conditions than when the projection is not biased (see Figure 4). However, population d selects s2(t) when θd is around θ2 and Ad is moderately large. Similar results are obtained when the amplitude of s2(t) is assumed to be smaller than that of s1(t) and the average synaptic weight from population 1 to population d is equal to that from population 2 to population d (not shown). These results suggest that a weak stimulus can be selected by subjecting the downstream neurons to an appropriate top-down gamma modulation.
3.3.2. More Than Two Stimuli.
More than two stimuli may compete in the downstream population. In the case of three upstream populations whose gamma oscillations have equal phase distances—θ1 = 0, θ2 = 2π/3, and θ3 = 4π/3—the correlation coefficients between νd(t) and si(t) (i = 1, 2, 3) are shown in separate panels of Figure 6. Population d selects si(t) (i = 1, 2, 3) when θd ≈ θi.
The amount of stochasticity in spiking, membrane time constant, timescale of the gamma rhythm, and other factors collectively determine the number of stimuli that can be distinguished by population d. For 2 ⩽ Ng ⩽ 6 and θi = 2π(i − 1)/Ng (1 ⩽ i ⩽ Ng), we quantify the ability of population d to select a suitable stimulus. In Figure 7, the maximal value of 〈s1(t), νd(t)〉 in the Ad-θd parameter space (circles) and the value of 〈s2(t), νd(t)〉 for the pair (Ad, θd) that gives the maximal 〈s1(t), νd(t)〉 (squares) are shown together with their standard deviations. If these two values are close to each other, it is difficult for population d to distinguish the most represented stimulus, which is s1(t), from the second most represented stimulus, which is presumably s2(t). Figure 7 shows that our neural network is capable of selecting the most preferred stimuli up to about Ng = 4.
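To make the phase assignment θi = 2π(i − 1)/Ng concrete, the following sketch lists the upstream phases and the width of the time slot that each stimulus occupies within one gamma cycle. The helper names are hypothetical; only fγ = 40 Hz and the phase formula come from the text.

```python
import math

def upstream_phases(n_groups):
    """Equally spaced gamma phases theta_i = 2*pi*(i - 1)/Ng, i = 1..Ng."""
    return [2.0 * math.pi * (i - 1) / n_groups for i in range(1, n_groups + 1)]

def phase_to_delay_ms(theta, f_gamma=40.0):
    """Time offset (ms) of a gamma peak with phase theta within one cycle."""
    return theta / (2.0 * math.pi) * 1000.0 / f_gamma

def slot_width_ms(n_groups, f_gamma=40.0):
    """Spacing (ms) between consecutive upstream gamma peaks."""
    return 1000.0 / f_gamma / n_groups

for n_groups in range(2, 7):
    delays = [round(phase_to_delay_ms(th), 2) for th in upstream_phases(n_groups)]
    print(n_groups, delays, round(slot_width_ms(n_groups), 2))
```

For Ng = 2 the slots are 12.5 ms apart, whereas for Ng = 6 they are only about 4.2 ms apart, which is shorter than the membrane time constant τm = 10 ms quoted in section 3.1. This comparison is offered only as an intuition for why separating stimuli may become harder as Ng grows.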
3.3.3. Conserved Firing Rates.
Due to the threshold nonlinearity of neuronal dynamics, gamma modulation (Niebur et al., 1993; Tiesinga, Fellous, Salinas, José, & Sejnowski, 2004; Masuda & Doiron, 2007) or synchrony in general (Niebur & Koch, 1994; Salinas & Sejnowski, 2000) increases the firing rate of a neuron in an excitable regime. Average firing rates of a neuron in population d for different values of vE,d and Ad are shown in Figure 8, where vE,d is selected as a control parameter that modulates the excitability of the neuron. In our model, gamma oscillation additively modulates the input-output relationship, which is consistent with some evidence (Reynolds, Pasternak, & Desimone, 2000); however, this modulation differs from the multiplicative gain modulation observed in experiments (McAdams & Maunsell, 1999; Treue & Martínez-Trujillo, 1999). In other words, as Ad increases, the dependence of the firing rate on the level of constant current input, which is equivalent to vE,d, is shifted upward without a drastic change in the shape of the curve. This result is obtained in general, except when the firing rates are close to 40 Hz. A plateau firing rate, which is equal to the frequency of gamma oscillation fγ = 40 Hz, indicates the saturation regime, where a neuron fires once per gamma cycle. For example, a very large bias or noise is required to achieve a firing rate greater than 40 Hz.
Selective attention often enhances gamma-band activities with a conserved firing rate (Fries et al., 2001; Womelsdorf et al., 2006). To show that our results are not obtained due to increased firing rates, we vary Ad while maintaining the firing rate of population d constant at approximately 18 Hz. We achieve this by gradually decreasing vE,d with a gradual increase in Ad. The numerical results shown in Figure 9 support the viability of selective population rate coding when Ad ⩾ 0.05.
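The rate-matching procedure can be sketched as a one-dimensional root search: for each Ad, the excitability parameter is lowered until the firing rate returns to the target. The rate curve `toy_rate` below is a hypothetical stand-in that saturates at fγ and shifts additively with Ad; in practice it would be replaced by a firing rate estimated from simulations, and `v_e` here is only a dimensionless placeholder for vE,d.

```python
import math

def toy_rate(v_e, a_d, f_gamma=40.0, beta=8.0):
    """Hypothetical rate curve (Hz): saturates at f_gamma, rises with v_e and a_d."""
    return f_gamma / (1.0 + math.exp(-beta * (v_e + a_d - 1.0)))

def match_rate(a_d, target_hz=18.0, lo=-5.0, hi=5.0, tol=1e-6):
    """Bisection on v_e so that toy_rate(v_e, a_d) is close to target_hz.

    Relies on toy_rate increasing monotonically with v_e on [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if toy_rate(mid, a_d) < target_hz:
            lo = mid      # rate too low: raise excitability
        else:
            hi = mid      # rate too high: lower excitability
    return 0.5 * (lo + hi)

for a_d in (0.0, 0.1, 0.25):
    v_e = match_rate(a_d)
    print(a_d, round(v_e, 4), round(toy_rate(v_e, a_d), 3))
```

In this toy setting, the matched excitability decreases as Ad increases, which mirrors the gradual decrease of vE,d described above.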
3.3.4. Noncoherent Gamma Oscillations.
We have assumed that gamma oscillations applied to different upstream populations are highly coherent with a fixed phase relationship. A more realistic assumption may be that different upstream populations receive noncoherent gamma oscillations in a frequency band of some width, for example, 40 to 70 Hz (Gray et al., 1989; Engel et al., 2001). In addition, the amplitude of gamma oscillations generally fluctuates. We investigate whether stimulus selection proposed in the preceding sections is effective under noisy and noncoherent gamma oscillations.
We assume that population d is subjected to the same gamma modulation as the one applied to the upstream population 1, except that the baseline amplitude 0.1 is replaced by Ad.
In the previous sections, we calculated the correlation coefficient between the stimulus and the spike train with a bin width of 1/fγ = 25 ms. However, this method of computation is inappropriate for noisy gamma modulation because in this case, peaks of spike counts occur at irregular intervals. Therefore, we calculate the so-called coding fraction, denoted by γ. The coding fraction is large when the best linear stimulus estimator applied to an observed spike train yields a small estimation error. The coding fraction is normalized between 0 and 1, where γ = 1 represents an ideal stimulus estimation and γ = 0 indicates the absence of correlation between si(t) and the spike train. Details of the coding fraction are found elsewhere (Gabbiani, Metzner, Wessel, & Koch, 1996; Wessel, Koch, & Gabbiani, 1996; Gabbiani & Koch, 1998). In appendix C, we briefly summarize the definition of γ and show the parameter values used in this study. To calculate γ, we use the Matlab code available at Fabrizio Gabbiani's Web site.
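As a rough illustration of the idea behind the coding fraction, the sketch below fits a short linear filter by least squares in the time domain and reports one minus the ratio of the root mean square estimation error to the stimulus standard deviation. This is a simplified stand-in and not the Matlab code used in this study: the finite filter length, the in-sample fit without cross-validation, the clipping at zero, and the function name `coding_fraction` are all assumptions made here.

```python
import numpy as np

def coding_fraction(stimulus, response, half_width=10):
    """Coding-fraction-like score from a least-squares linear filter.

    Returns 1 - rms_error / std(stimulus), clipped below at 0. The filter is
    acausal and spans 2 * half_width + 1 samples of the binned response.
    """
    s = np.asarray(stimulus, dtype=float) - np.mean(stimulus)
    r = np.asarray(response, dtype=float) - np.mean(response)
    n = len(s)
    lags = range(-half_width, half_width + 1)
    # Design matrix: one column per lag of the response.
    X = np.column_stack([np.roll(r, lag) for lag in lags])
    # Drop the edges where np.roll wrapped samples around.
    keep = slice(half_width, n - half_width)
    X, s = X[keep], s[keep]
    weights, *_ = np.linalg.lstsq(X, s, rcond=None)
    rms_error = np.sqrt(np.mean((s - X @ weights) ** 2))
    return max(0.0, 1.0 - rms_error / np.std(s))
```

A response that is a scaled copy of the stimulus gives a score close to 1, and a response unrelated to the stimulus gives a score close to 0, in line with the normalization described above.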
We assume two upstream populations (Ng = 2), θ1 = θd = 0, and θ2 = π. The coding fraction for s1(t) and s2(t) for different values of Ad is shown in Figure 10b. The error bar shows the standard deviation obtained by conducting 50 trials. For reference, the corresponding results obtained for coherent gamma modulation, described in section 3.2, are shown in Figure 10c. The value of Ad that provides the best estimation of s1(t) is smaller in the case of noisy gamma modulation (see Figure 10b) than in the case of coherent gamma modulation (see Figure 10c; also see Figure 4a). However, the results obtained from these two cases are qualitatively similar; the stimulus selection is enhanced for intermediate Ad. Therefore, our main results hold true under noisy and noncoherent gamma modulation to some extent.
In this section, we explain our main finding by analyzing a phenomenological mean field model with stochastic firing rates. We consider a neural network composed of three populations as shown in Figure 1 and assume that θ2 = θ1 + π and θd = θ1, corresponding to the case examined in section 3.1. For this simple case, we determine the effect of the amplitude of the downstream gamma rhythm on the fidelity of population rate coding of each competing stimulus.
For simplicity, we assume that time is discrete and that one gamma cycle consists of two time units. Therefore, the instantaneous magnitude of gamma modulation takes either of the two levels, that is, the peak or the trough. This assumption can be justified by the facts that external stimuli are sufficiently weak such that neurons spike only near the peaks of gamma oscillations and that we neglect the phase codes. We denote the peak and trough levels by Ai and −Ai (Ai ⩾ 0, i = 1, 2, d), respectively. Gamma oscillations applied to populations 1 and d are synchronized. When these oscillations attain peak values (A1 and Ad), the gamma oscillation applied to population 2 takes the trough value (−A2), and vice versa.
The probability that neurons fire in one gamma cycle is not significantly affected by the probability of firing in the preceding gamma cycles, when the membrane time constant of a neuron is not very large. For simplicity, we let firing events within a single gamma cycle be independent of those in the previous gamma cycles. Furthermore, we assume that the stimuli take the constant values s1 and s2 in a gamma cycle, because s1(t) and s2(t) change slowly with respect to the gamma rhythm.
The correlation coefficient 〈si, νd〉 (i = 1, 2) measured in the numerical simulations is large when νd is sensitive to si and is reliable over trials for a fixed si. Sensitivity can be quantified by the amount of the change in the average firing rate with si. To assess the reliability of νd, we neglect stochasticity of the synaptic input arriving at population d, because it affects the following results only slightly. Instead, we focus on the following source of stochasticity. A neuron in population d fires at most once in half a gamma cycle. Then the firing statistics over a bin of width 12.5 ms are binomial-like, such that the average firing rate in each of phases 0 and π, measured in spikes per bin, is accompanied by a variance equal to that rate multiplied by one minus that rate. Based on the assumption that spike count statistics in different bins are independent of each other, the spike count variance in a gamma cycle is equal to the sum of these two variances.
Figure 11 shows the discriminability d′1 for s1 and d′2 for s2, which are plotted as a thick solid line and a thick dashed line, respectively. When Ad>0, the difference between d′1 and d′2 is always positive. This difference is the largest at an intermediate amplitude of gamma oscillation, which is qualitatively consistent with the numerical results shown in the previous sections. Due to our phenomenological approach, there does not exist a quantitative agreement between the optimal value of Ad derived from numerical simulations and that derived from the theory.
For comparison, the sensitivities to s1 and s2 are plotted by a thin solid line and a thin dashed line, respectively, in Figure 11. The sensitivity to s1 (thin solid line) is maximized at a spike count of 0.5, that is, when the input applied to the sigmoid transfer function is at the midpoint of the sigmoid. In contrast, d′1 (thick solid line) is maximized at a larger firing rate, that is, when the spike count variance decreases drastically near the saturation regime. Accordingly, the peak of the sensitivity to s1 is attained with a smaller value of Ad than is the peak of d′1.
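The reasoning of this section can be reproduced qualitatively with a toy calculation. The sketch below is not the mean field model of this study: it assumes that the downstream firing probability in each half-cycle is a logistic function of the stimulus plus or minus Ad, skips the upstream populations altogether, and uses hypothetical parameter values (`beta`, `w`, `bias`). Discriminability is taken as sensitivity divided by the standard deviation of the per-cycle spike count, with the binomial-like variance described above.

```python
import numpy as np

def sigmoid(x, beta=10.0):
    """Logistic transfer function (assumed form) with hypothetical gain beta."""
    return 1.0 / (1.0 + np.exp(-beta * x))

def discriminability(a_d, s1=0.5, s2=0.5, w=1.0, bias=0.6, eps=1e-4):
    """Return (d1, d2): sensitivity of the per-cycle spike count to s1 and s2,
    each divided by the binomial-like standard deviation of that count."""
    def mean_and_var(s1, s2):
        p_peak = sigmoid(w * s1 + a_d - bias)     # phase 0: input from population 1
        p_trough = sigmoid(w * s2 - a_d - bias)   # phase pi: input from population 2
        mean = p_peak + p_trough
        var = p_peak * (1 - p_peak) + p_trough * (1 - p_trough)
        return mean, var
    mean, var = mean_and_var(s1, s2)
    # Forward finite differences approximate the sensitivities.
    d_mean_s1 = (mean_and_var(s1 + eps, s2)[0] - mean) / eps
    d_mean_s2 = (mean_and_var(s1, s2 + eps)[0] - mean) / eps
    return d_mean_s1 / np.sqrt(var), d_mean_s2 / np.sqrt(var)

for a_d in (0.0, 0.1, 0.2, 0.3, 0.6):
    d1, d2 = discriminability(a_d)
    print(a_d, round(float(d1), 3), round(float(d2), 3))
```

Under these assumptions, the two discriminabilities coincide at Ad = 0, and their difference is positive for Ad > 0 and largest at an intermediate Ad. The numerical values depend entirely on the hypothetical parameters and should not be compared with Figure 11.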
4.1. Summary of the Results.
On the basis of the concept that neural populations effectively interact with each other when there exists a compatible phase relationship between their associated gamma oscillations (Fries, 2005; Womelsdorf et al., 2007), we have proposed a feedforward network that enables stimulus selection. We have shown that among competing stimuli, a dynamic stimulus whose associated gamma oscillation is in a good phase relationship with the gamma oscillation applied to the downstream population is selected by a downstream neural population by population rate coding. By adjusting the phase of the downstream gamma rhythm, which we interpret as a top-down control signal generated by selective attention, the time course of the attended dynamic stimulus is recovered in the downstream population. An emphasis of this study is the use of dynamic stimuli; many natural stimuli such as visual stimuli and summed synaptic input are dynamic. Our results suggest that gamma-band synchrony facilitates selective routing of dynamic information about the input to downstream information processing areas.
The stimulus selection occurs even when the attended stimulus is an unpreferred one, as shown in Tiesinga et al. (2004). The selected stimulus does not have to be stronger (Börgers et al., 2005) or more coherent (Börgers & Kopell, 2008) than the other stimuli. The proposed mechanism is robust against parameter variation. Although we have exploited the fact that neurons tend to fire near peaks of gamma oscillations, precise locking of spikes to oscillatory peaks is not mandatory.
We assumed that neurons in the same upstream population, which supposedly encode a common stimulus, are in-phase synchronized and that different upstream populations have different oscillatory phases, as assumed in other models (Mishra et al., 2006). We also showed that our results can be extended to the case in which different upstream populations are subjected to noncoherent and noisy gamma modulations, as observed in experiments (Gray et al., 1989). Evidence in sensory areas such as V1 (Gray et al., 1989) and computational modeling studies (von der Malsburg & Schneider, 1986; Terman & Wang, 1995) support the view that a combination of different oscillations may be bottom-up emergent behavior due to input segmentation. Alternatively, as observed in olfactory neurons (Schaefer, Angelo, Spors, & Margrie, 2006), even a single population may emit oscillatory spike packets with different phases depending on stimulus dynamics. In this study, we did not model the genesis of gamma oscillations.
We have also assumed that the threshold nonlinearity of neurons prohibits the occurrence of stimulus-related spikes during troughs of gamma modulation. Therefore, the information about s1(t) and that about s2(t) do not interfere with each other in population d, in a manner similar to time division multiple access in communication theory. Although this may be a strong assumption, it was imposed in previous models of selective attention (Mishra et al., 2006) and various phase coding schemes (Hopfield, 1995; Fries, Nikolić, & Singer, 2007).
4.2. Possible Roles for Inhibitory Coupling.
In this study, we neglected inhibitory interneurons. There are several possible roles for inhibition in selective attention.
First, inhibitory neurons in the downstream population promote competition among various stimuli by winner-take-all mechanisms (von der Malsburg & Schneider, 1986; Terman & Wang, 1995; Niebur et al., 1993; Niebur & Koch, 1994; Reynolds et al., 1999) or by disinhibition of the attended pools (Buia & Tiesinga, 2008). For example, with recurrent inhibition, firing is suppressed for some time after an attended stimulus induces spikes. Then it is difficult for an unattended stimulus to induce spikes (Börgers et al., 2005; Börgers & Kopell, 2008). Such stimulus separation will be manifested if the synaptic delay is equal to a fraction of a gamma cycle.
With recurrent inhibition, stimulus selection can be carried out without the assumed top-down gamma rhythm; however, it is carried out with some modifications. We assumed that a top-down gamma rhythm is applied to the downstream population because stimulus selection is likely to occur as a result of cognitive top-down control (Engel et al., 2001; Niebur, 2002; Buia & Tiesinga, 2006, 2008; Mishra et al., 2006; Womelsdorf & Fries, 2007). However, such an ideal top-down gamma modulation would not be applied externally in real neural networks. On the other hand, local circuits composed of interneurons are theorized to generate gamma oscillations for selective attention, which facilitate recurrent interactions between input and output populations (Tiesinga et al., 2004; Börgers et al., 2005; Buia & Tiesinga, 2006, 2008). When there are internally generated gamma rhythms, the stimulus with the largest magnitude is represented by the downstream population with a high probability; this result is similar to the previous results (Börgers et al., 2005). When we apply dynamic stimuli in the neural network with recurrent inhibition, the stimulus to be represented by the downstream population varies over time, as shown by a sample rastergram in Figure 14.
Second, synaptic inhibition in conductance-based neuron models enhances selectivity of coherent dynamic stimuli (Börgers & Kopell, 2008). If we replace our current-based neuron model with a conductance-based neuron model, synaptic inhibition decreases the effective membrane time constant, particularly when excitation and inhibition are balanced, as in a high-conductance state (Destexhe et al., 2003). Then downstream neurons will be able to readily identify spike packets resulting from different stimuli due to the fast decay of stimulus information stored in the downstream neurons. This will enhance signal separation, as shown in Figure 13.
Third, recurrent inhibition constrains the firing rate. As observed in the rate-controlled scenario (see Figure 9), constraining the firing rate by recurrent inhibition is unlikely to change our main results.
Recurrent inhibition may cause inhibitory gamma modulation. We assumed sinusoidal gamma oscillations. Alternatively, gamma modulation may be inhibitory for half a cycle and not excitatory for the other half-cycle. Compared to sinusoidal gamma modulation, such inhibitory gamma modulation is advantageous for carrying out stimulus selection under strong gamma modulation. This is because strong inhibitory gamma modulation blocks the out-of-phase stimulus but does not interfere with the representation of the in-phase stimulus.
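The difference between the two modulation waveforms can be sketched as follows. This is only an illustration: it assumes that inhibitory gamma modulation is the negative half of a sinusoid (zero during the other half-cycle) and that the gamma frequency is 40 Hz. The function names and the amplitude argument are hypothetical and are not taken from the model equations.

```python
import math

def sinusoidal_modulation(t_ms, amplitude, freq_hz=40.0, phase=0.0):
    """Sinusoidal gamma drive: excitatory for half a cycle, inhibitory for the other half."""
    return amplitude * math.sin(2.0 * math.pi * freq_hz * t_ms * 1e-3 + phase)

def inhibitory_modulation(t_ms, amplitude, freq_hz=40.0, phase=0.0):
    """Assumed inhibitory-only drive: keep the negative half-cycle, zero otherwise."""
    return min(0.0, sinusoidal_modulation(t_ms, amplitude, freq_hz, phase))

# One 25 ms gamma cycle sampled every 6.25 ms (quarter-cycle steps)
for t in (0.0, 6.25, 12.5, 18.75):
    print(t, round(sinusoidal_modulation(t, 1.0), 3), round(inhibitory_modulation(t, 1.0), 3))
```

Under this assumption, increasing the amplitude deepens the suppression in the inhibitory half-cycle while leaving the other half-cycle unchanged.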
4.3. Rate Coding and Gain Modulation.
For static stimuli, we showed previously that enhanced discriminability occurs with a rather strong gamma modulation, which shifts the dynamics near the saturation regime (Masuda & Doiron, 2007). Then, neurons fire at a rate slightly less than once per cycle, which results in a small spike count variance (DeWeese et al., 2003; Masuda & Doiron, 2007). We extended these results to the case of dynamic and competing stimuli. Note that strong gamma modulation and the saturation regime are not required for the proposed mechanism to be effective. In our numerical results, stimulus selection is achieved even with a gamma rhythm that is weaker than one that realizes the saturation regime.
For successful population rate coding by the downstream neural population, the dynamic stimuli cannot change much within a gamma cycle. Fast oscillatory drives (e.g., fγ = 100 Hz) will not yield a high upper cutoff frequency of the stimulus. This is because with fast oscillations, even a neuron with a small membrane time constant (such as τm = 5 ms) acts as a relatively strong low-pass filter, and the downstream population would not be able to separate the information about different stimuli. In contrast, slow oscillations such as theta rhythm would allow only encoding of very slow stimuli. Furthermore, enhanced coding near the saturation regime (Masuda & Doiron, 2007) cannot be carried out with slow oscillations because a neuron then spikes multiple times per cycle. In conclusion, an oscillation frequency approximately in the gamma band is beneficial to the proposed mechanism.
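The low-pass argument can be made concrete with a rough calculation. The sketch below treats the passive membrane as a first-order low-pass filter, an idealization that ignores spiking and reset, so that the amplitude gain at frequency f is 1/sqrt(1 + (2π f τm)²). The function names and the frequencies compared are illustrative choices.

```python
import math

def lowpass_gain(freq_hz, tau_m_ms):
    """Amplitude gain of a first-order low-pass filter with time constant tau_m."""
    omega_tau = 2.0 * math.pi * freq_hz * tau_m_ms * 1e-3
    return 1.0 / math.sqrt(1.0 + omega_tau ** 2)

def cutoff_hz(tau_m_ms):
    """-3 dB cutoff frequency, f_c = 1 / (2 pi tau_m)."""
    return 1.0 / (2.0 * math.pi * tau_m_ms * 1e-3)

for tau in (5.0, 10.0):
    print(f"tau_m = {tau} ms: cutoff = {cutoff_hz(tau):.1f} Hz, "
          f"gain at 40 Hz = {lowpass_gain(40.0, tau):.2f}, "
          f"gain at 100 Hz = {lowpass_gain(100.0, tau):.2f}")
```

In this idealization, a 100 Hz drive retains only about 0.3 of its amplitude even with τm = 5 ms, whereas a 40 Hz drive retains about 0.6.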
We have considered a rate code whose temporal resolution is about one gamma cycle and ignored the timing or phase codes, which may be relevant in a fine timescale. For example, a large input bias elicits a reliable spike in an earlier phase of an oscillation (Hopfield, 1995, 2004; Fries et al., 2007). In our neural network, this phase code can coexist with the population rate code. In fact, when the attended dynamic stimulus is large, neurons tend to fire in an early phase (not shown). Therefore, the information about the intensity of external stimulus may be coded onto the phase as well as the spike count.
In the case of static inputs, we can also interpret our results in terms of gain control. Experimental results show that selective attention increases the gain of the selected stimulus and decreases the gain of the unselected stimulus (McAdams & Maunsell, 1999; Reynolds et al., 1999; Treue & Martínez-Trujillo, 1999; Reynolds, Pasternak, & Desimone, 2000), and this gain modulation may be mediated by gamma modulation. The correlation coefficient that we measured represents the sensitivity of the population firing rate to stimulus changes, which can be interpreted as the gain. The correlation coefficient increases (decreases) for the stimulus whose gamma modulation is in a good (bad) phase relationship with the downstream gamma rhythm; this result is consistent with the experimental results.
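As a minimal illustration of the measure discussed here, the sketch below computes a plain Pearson correlation coefficient at zero lag between a stimulus and a population firing rate, both binned per gamma cycle. The binning, the absence of smoothing or lag, and the toy numbers are assumptions made for illustration and are not simulation results.

```python
import math

def correlation_coefficient(stimulus, rate):
    """Pearson correlation between two equally long sequences."""
    n = len(stimulus)
    mean_s = sum(stimulus) / n
    mean_r = sum(rate) / n
    cov = sum((a - mean_s) * (b - mean_r) for a, b in zip(stimulus, rate))
    var_s = sum((a - mean_s) ** 2 for a in stimulus)
    var_r = sum((b - mean_r) ** 2 for b in rate)
    return cov / math.sqrt(var_s * var_r)

# Toy values per gamma cycle (hypothetical): the rate roughly follows s1, not s2
s1 = [0.1, 0.4, 0.3, 0.8, 0.5, 0.2]
s2 = [0.6, 0.2, 0.7, 0.1, 0.4, 0.9]
rate_d = [12, 20, 17, 30, 22, 14]  # population spike count per cycle

print(correlation_coefficient(s1, rate_d), correlation_coefficient(s2, rate_d))
```

A population that selects s1 would show a large coefficient for s1 and a small or negative one for s2, as in this toy case.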
4.4. Relation to Previous Literature.
Previous theoretical studies were conducted on interaction between different gamma rhythms (Mishra et al., 2006; Börgers & Kopell, 2008). The results obtained by Mishra et al. suggest that if top-down gamma modulation is applied with a suitable phase to the neurons encoding the preferred (unpreferred) static stimulus, the firing rate resembles the one when only the preferred (unpreferred) stimulus is present. In the case of static stimuli, our results are consistent with those obtained by Mishra et al. Börgers and Kopell (2008) propose that oscillatory inputs with different phases compete in a single population such that a more coherent input wins via a bottom-up competition mechanism. In this study, we did not assume that either input is more coherent or stronger than the other, and we examined the role for top-down gamma modulation in stimulus competition.
Our network can separate up to about four stimuli (see Figure 7). This limit is determined by various factors such as oscillation frequency, membrane time constant of a single neuron, noise intensity, and degree of heterogeneity. With regard to the number of gamma oscillations accommodated by a neural network, neurons encoding different features are considered to belong to different groups, each of which collectively oscillates with distinct phases (Gray et al., 1989; Engel et al., 2001). In theory, up to several clusters of neurons oscillating with different phases can coexist (von der Malsburg & Schneider, 1986; Terman & Wang, 1995); this result is similar to our results. The number of stimuli that our network can separate might indicate the number of items that can be held in working memory (Cowan, 2000; Miller, 1956).
Appendix A: Robustness Tests
To assess the robustness of the main results, we carry out additional numerical simulations.
A.1. General Relative Phases of Gamma Oscillations.
When the antiphase relationship between the two upstream gamma oscillations is expressed as θ2 = θ1 + π, the spike packets from population 1 and those from population 2 are most distinguished in population d. Therefore, the information about s1(t) and that about s2(t) arrive at population d with the least overlap, so that selective rate coding is relatively easy. Here we relax this assumption.
When θ2 = θ1 + 2π/3, the effectiveness of rate coding by population d is shown in Figure 12. When θd ≈ θ1, 〈s1(t), νd(t)〉 is the largest (see Figure 12a), and when θd ≈ θ2 = 2π/3, 〈s2(t), νd(t)〉 is the largest (see Figure 12b). It is more difficult to select s1(t) or s2(t) when θ2 = θ1 + 2π/3 than it is when θ2 = θ1 + π, because spike packets transmitted from populations 1 and 2 partially overlap in population d. However, stimulus selection is still viable; νd(t) represents s1(t) (s2(t)) more accurately than s2(t) (s1(t)) when θd ≈ θ1 (θd ≈ θ2).
A.2. Small Membrane Time Constant.
The membrane time constant τm controls the duration for which the information about stimuli is stored in the neurons. With τm = 10 ms, which we assumed in the main text, 〈s2(t), νd(t)〉 is rather large (e.g., 〈s2(t), νd(t)〉 = 0.3) even when population d presumably selects s1(t) (e.g., when θd = θ1). Better stimulus separation is expected with a smaller τm.
To examine this point, we set τm of the neurons in population d to τm = 5 ms by setting g = 200 mS/cm², which is twice the original value, in equation 2.2. Figure 13 shows that stimulus separation under gamma modulation is somewhat better when τm = 5 ms than when τm = 10 ms, as indicated by a sharper contrast in Figure 13c than in Figure 4c. When θd ≈ θ1, a decrease in 〈s2(t), νd(t)〉 with an increase in Ad is greater when τm = 5 ms (from 0.47 when Ad = 0 to 0.28 when Ad ≈ 0.15) than when τm = 10 ms (from 0.38 when Ad = 0 to 0.28 when Ad ≈ 0.25), although 〈s2(t), νd(t)〉 does not decrease to zero even in the regime of selective population rate coding.
A sufficiently large τm, such as τm = 20 ms, does not accommodate selective population rate coding (not shown). Then, population d cannot dissociate information from different upstream populations.
Appendix B: Rastergram with Recurrent Inhibition
To illustrate the role of inhibition, we generate downstream gamma rhythm by recurrent inhibition instead of applying a top-down external input. Accordingly, we set Ad = 0. An arbitrary pair of neurons in population d is connected by an inhibitory synapse with probability 0.5 independently for different pairs. Each synaptic weight is chosen independently from the uniform density on [0, 1.2/n]. The synaptic delay is equal to 8 ms. The time course of inhibitory synapses is expressed using the delta function. To avoid an excessive decrease in firing rates due to recurrent inhibition, we increase the average weight of the synapse from an upstream neuron to a neuron in population d to twice that used in the other numerical simulations.
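The connectivity described above can be sketched as follows. The sketch assumes that n is the number of neurons in population d, that connections are directed and drawn independently for each ordered pair, and that there are no self-connections; none of these details is stated explicitly here. The function names are hypothetical.

```python
import random

def build_inhibitory_weights(n, p_connect=0.5, w_max_scale=1.2, seed=0):
    """Return an n x n matrix; weights[i][j] is the inhibitory weight from neuron j to i."""
    rng = random.Random(seed)
    w_max = w_max_scale / n
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-connections (an assumption)
            if rng.random() < p_connect:
                weights[i][j] = rng.uniform(0.0, w_max)
    return weights

def inhibitory_events(weights, source, spike_time_ms, delay_ms=8.0):
    """Delta-function synapses: each target receives an instantaneous
    inhibitory kick of size w at spike_time + delay."""
    return [(spike_time_ms + delay_ms, target, -row[source])
            for target, row in enumerate(weights) if row[source] > 0.0]

weights = build_inhibitory_weights(100)
events = inhibitory_events(weights, source=0, spike_time_ms=10.0)
print(len(events), "targets inhibited at t =", events[0][0], "ms")
```

Scaling the maximum weight by 1/n keeps the total inhibition received by a neuron roughly independent of the population size.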
Figure 14 shows a sample rastergram for Ng = 2. The stimulus that is encoded by population d changes dynamically, and the one with the larger instantaneous magnitude tends to be selected.
Appendix C: Calculations of the Coding Fraction
Here, we briefly explain the method and define the parameter values for calculating the coding fraction, introduced in section 3.3.4. Refer to Gabbiani et al. (1996), Wessel et al. (1996), and Gabbiani and Koch (1998) for details of the methods.
The duration of each simulation is 5000 gamma cycles, that is, 125 s. We merge the spike trains obtained from N = 100 neurons in population d. The superposed spike train is a time series of the spike count with a bin width of Δt = 0.02 ms, which is the discretized unit time of the Monte Carlo simulations. In total, there are 125 × 10³/Δt = 6.25 × 10⁶ bins. The superposed spike train whose mean is adjusted to zero is denoted by x(t). Given the stimulus si(t), the best linear estimator is given by (h*x)(t), where * denotes convolution, and h(t) is the linear filter that minimizes the mean square error of the stimulus estimation, denoted by ϵ². The best linear filter is given in the frequency domain by h(f) = Ssx(−f)/Sxx(f), where Ssx(f) is the Fourier transform of the cross-correlation between x(t) and si(t), Sxx is the power spectrum of x(t), and f is the frequency.
To reliably compute the Fourier transform, we adopt the following procedure. We consider a sequence of Nf = 32768 consecutive bins in x(t), preprocess this sequence using a Bartlett window to suppress the boundary effect, apply the Fourier transform, take the squared modulus of the result, and normalize it to obtain an empirical periodogram. We calculate periodograms for different sequences of Nf bins, so that two adjacent sequences overlap by Nf/2 bins. Consequently, we obtain approximately 6.25 × 10⁶/(Nf/2) ≈ 380 periodograms. By averaging them, we obtain the estimate of Sxx(f). Similarly, we calculate Ssx(f).
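The procedure can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the code used for the simulations: the coding fraction is taken to be 1 − ϵ/σ, following the convention of Gabbiani and Koch (1998), with σ the standard deviation of the stimulus. The mean square error ϵ² is evaluated in the frequency domain as the average of Sss − |Ssx|²/Sxx, and the sign convention of the cross-spectrum is the one natural for NumPy's FFT. All function names are hypothetical.

```python
import numpy as np

def count_segments(n_bins, n_f=32768):
    """Number of length-n_f segments when adjacent segments overlap by n_f / 2."""
    return (n_bins - n_f) // (n_f // 2) + 1

def averaged_spectra(x, s, n_f=32768):
    """Bartlett-windowed periodograms averaged over half-overlapping segments.

    Returns estimates of S_xx and S_ss (real) and the cross-spectrum S_sx (complex).
    """
    window = np.bartlett(n_f)
    n_seg = count_segments(len(x), n_f)
    s_xx = np.zeros(n_f)
    s_ss = np.zeros(n_f)
    s_sx = np.zeros(n_f, dtype=complex)
    for k in range(n_seg):
        start = k * (n_f // 2)
        xf = np.fft.fft(window * x[start:start + n_f])
        sf = np.fft.fft(window * s[start:start + n_f])
        s_xx += np.abs(xf) ** 2
        s_ss += np.abs(sf) ** 2
        s_sx += sf * np.conj(xf)
    scale = np.sum(window ** 2) * n_seg  # window power times number of segments
    return s_xx / scale, s_ss / scale, s_sx / scale

def coding_fraction(x, s, n_f=32768):
    """Assumed definition: 1 - eps/sigma for the best linear estimate of s from x."""
    x = x - np.mean(x)
    s = s - np.mean(s)
    s_xx, s_ss, s_sx = averaged_spectra(x, s, n_f)
    # Stimulus power explained by the optimal filter h(f) = S_sx / S_xx
    explained = np.abs(s_sx) ** 2 / np.maximum(s_xx, 1e-300)
    eps2 = max(np.mean(s_ss - explained), 0.0)
    sigma2 = np.mean(s_ss)
    return 1.0 - np.sqrt(eps2 / sigma2)

print(count_segments(6_250_000))  # number of periodograms for the full spike train
```

With 6.25 × 10⁶ bins and Nf = 32768, `count_segments` gives 380 half-overlapping segments, in agreement with the estimate above. Averaging over many segments is essential: with a single periodogram, the ratio Ssx/Sxx would reproduce the stimulus trivially and the error would vanish.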
We thank Ernst Niebur, Brent Doiron, and Hiroyuki Nakahara for critically reading the manuscript and Hideaki Shimazaki for his valuable discussions. This study is supported by the Grant-in-Aid for Scientific Research on Priority Areas: Integrative Brain Research (No. 20019012) from MEXT, Japan.