Models for perceptual grouping and contour integration are presented. Connection weights depend on distances and angle differences, while neurons evolve according to spiking dynamics (Izhikevich's model in most of the considered cases). Although the studied synapses depend on discrete three-valued functions, simulations display the emergence of approximate synchrony, making these cognitive tasks possible. Noise effects are examined, and the possibility of achieving similar results with a different neuron model is discussed.
Locally coupled oscillators can yield long-range synchrony. Even when only neighboring or regional couplings are included, cells separated by longer distances can eventually fire synchronously, depending on the conditions that determine the efficiency of the connections. A key task of perception is to gather spatially separate features in order to form objects. Temporal correlation of feature detections is one of the binding mechanisms (Singer, 1999).
After a scene is presented, feature extraction takes place, and the extracted features form the basis for determining the connection weights between oscillators, external inputs, or cortical oscillator phases. The oscillator network then evolves autonomously. After a number of oscillator cycles required for the synchronization/desynchronization process, oscillator assemblies are formed and they represent the resulting segments.
Thus, neural oscillations supply a way of implementing time correlation: synchronized oscillators are said to group, representing a common object, while desynchronized units stand for different objects. In the brain, remote cells associated with different features may be bound together when their activities show temporal correlation. The object in question can thus be represented by neuron groups across different regions. The whole subject of brain rhythms is a remarkably wide field, because a huge variety of cortical oscillations has been found. For instance, theta phases in hippocampal cells of rats encode spatial information through the precise timing of the elicited spikes, while gamma oscillations in the parietal and frontal areas of cats relate to hypervigilance states when watching prey. (General reviews on brain oscillations are offered in, for example, Buzsáki, 2006; Wang, 2005; Wang, 2010.)
Works such as Mirollo and Strogatz (1990), Eckhorn, Reitboeck, Arndt, and Dicke (1990), and von der Malsburg and Buhmann (1992) stand out in the early study of coupled oscillators and their roles in binding and segmentation processes by synchronization. Wang (1995) found that locally coupled neural oscillators can give rise to global synchrony. The idea of using only local connections came from observing that long-range all-to-all connections often fail to preserve the geometrical relationships important for perceptual grouping. Using a spike oscillator modified from that proposed by Eckhorn et al. (1990), Kuntimad and Ranganath (1999) studied a laterally coupled network for image segmentation. Yen and Finkel (1997) used a network of phase oscillators to extract salient contours. Excitatory and inhibitory connections encode orientation and distance relations. Global inhibition plays the role of suppressing background activity that arises in response to scattered stimulus items. The salient contours in an image correspond to oscillator assemblies that can overcome global inhibition and emerge from the network.
Similar to Li (1998), Choe (2001) used a network of spike oscillators to analyze the issue of contour segmentation via desynchronized activity, in addition to contour integration. Rhouma and Frigui (2001) studied networks of spike oscillators to perform data clustering. They extended the analysis of Mirollo and Strogatz (1990) to cases with certain inhomogeneity and specified conditions where a fully coupled network synchronizes. Regarding how to represent different objects of a scene, two possible schemes are phase (e.g., Ghose & Maunsell, 1999) and frequency (Kuntimad & Ranganath, 1999).
In this letter, we address the question of perceptual grouping using distance-limited connections and simulating neuron evolutions by means of Izhikevich’s spiking model, which leads us to consider interactions mediated by spikes, taking us into the realm of discrete couplings. For orientation-coded stimuli, the question of contour integration is considered using connections limited by angle differences as well as distances and maintaining the same type of discrete coupling in the lateral synapses. Further simulations will convince us that other spiking models can perform the same tasks as long as the employed structure and couplings are maintained.
2 Model Description
In Wang (1995) (see also Terman & Wang, 1995; Wang, 2005), every single oscillator was defined as a feedback loop between an excitatory and an inhibitory constituent. The same duality is present in the models by Li (1998). Such roles can also be attributed to the V and u variables in the simple model of spiking neurons (see equations 1–3 in Izhikevich, 2003), which we use in this work. The parameter values a, b, c, and d corresponding to the fast-spiking neuron type have been chosen (with b expressed in current units/mV, c in mV, and d in current units). Concerning the use of other parameter sets, the synchronous regimes that arise under variations of the Izhikevich model parameters have been studied by Yatsiuk and Kononov (2013).
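As a point of reference, the evolution of the simple model can be sketched as follows. This is a minimal Euler-integration sketch, not the letter's code; the fast-spiking parameter values a = 0.1, b = 0.2, c = −65, d = 2 are the standard set from Izhikevich (2003) and are an assumption here, since the exact values used in this work are not reproduced above.

```python
import numpy as np

def izhikevich_fs(I, dt=0.1, T=200.0, a=0.1, b=0.2, c=-65.0, d=2.0):
    """Euler integration of Izhikevich's simple model with the
    (assumed) fast-spiking parameter set.

    Returns the spike times (ms) for a constant input current I.
    """
    n = int(T / dt)
    v, u = c, b * c            # start at the reset/rest values
    spikes = []
    for k in range(n):
        if v >= 30.0:          # spike peak reached: record and reset
            spikes.append(k * dt)
            v, u = c, u + d
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
        u += dt * a * (b * v - u)
    return spikes
```

With a sufficiently strong constant current the cell fires tonically; below the excitability limit it settles at rest and stays silent.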
In addition to direct sensory input, a horizontal or lateral interaction limited in range by a given space scale will be present. Regarding lateral interactions, we have considered some of the ideas around the Kuramoto model and its generalizations (see, e.g., Sompolinsky, Golomb, & Kleinfeld, 1991, or Strogatz, 2000), which involve couplings of the form $\sum_j w_{ij}\sin(\theta_j - \theta_i)$ for site $i$, where the $\theta$'s indicate the phases of the oscillators. In order to spatially limit the interaction, the chosen $w_{ij}$ is proportional to a decreasing function of the distance $d(i,j)$ between sites $i$ and $j$, the simplest case being just a step function for some given scale $R$, say, $w_{ij} \propto \Theta(R - d(i,j))$. As usual, $\Theta(x) = 1$ for $x \ge 0$, and 0 otherwise. The $R$ parameter is thus a measure of the perceptual segregation distance. Our chosen $R$ values are set by hand, depending on the task.
At the same time, the nonvanishing $w$ coefficients may be Hebbian relative to the visual stimulus $\xi$, in the sense that $w_{ij} \sim \xi_i \xi_j$, where the $\xi$'s indicate the values of the stimulus pixels. Arguments in favor of Hebbian weights can be found, for example, in Sompolinsky et al. (1991). In the processing of binary images, the presence of this additional factor is helpful for suppressing spikes outside the active areas but not strictly necessary. Furthermore, following the usual practice, the $w$ matrix is row-normalized, requiring $\sum_j w_{ij} = 1$ for every $i$ labeling a nonempty row.
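A minimal sketch of how such distance-step, Hebbian, row-normalized weights could be assembled on a pixel grid; the function name and grid conventions are illustrative assumptions, not the letter's implementation.

```python
import numpy as np

def lateral_weights(stimulus, R):
    """Row-normalized lateral weights on a pixel grid.

    w_ij is 1 inside radius R (step function of the distance),
    multiplied by a Hebbian factor xi_i * xi_j for binary stimulus
    pixels xi, and each nonempty row is then normalized to sum to 1.
    Self-couplings are excluded.
    """
    h, w = stimulus.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.column_stack([ys.ravel(), xs.ravel()]).astype(float)
    xi = stimulus.ravel().astype(float)
    # All pairwise Euclidean distances between grid sites.
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
    W = (d <= R).astype(float) * np.outer(xi, xi)
    np.fill_diagonal(W, 0.0)
    row = W.sum(axis=1, keepdims=True)
    return np.where(row > 0, W / np.where(row > 0, row, 1.0), 0.0)
```

Rows belonging to inactive pixels remain empty, which is how the Hebbian factor suppresses activity outside the stimulus.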
Both Izhikevich and Edelman (2008) and Kim and Lim (2013) have studied neural systems of Izhikevich neurons with horizontal interactions through synaptic currents, which are more or less smooth functions of the potential values. Unlike in these works, here we imagine that interactions among spiking neurons are mediated only by spikes $S_i$, which are binary variables. This choice is motivated by a quest for computational simplicity. Neither gating variables nor noise is included at this stage (however, noise will be introduced in section 5). While phase differences or potential differences are continuous quantities, $S_j - S_i$ can take on only the values −1, 0, 1. In view of this discreteness, there is no advantage in keeping the original sine function of the Kuramoto model, and we adopt an interaction of the type $\sum_j w_{ij}(S_j - S_i)$, which is no longer a continuous function. The value of its jumps is adjusted by multiplying the whole sum by an overall constant $W_L$.
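The discrete coupling described above can be sketched directly. The helper name is hypothetical; the identity $\sum_j w_{ij}(S_j - S_i) = (WS)_i - S_i \sum_j w_{ij}$ is used for vectorization.

```python
import numpy as np

def lateral_input(W, S, WL):
    """Discrete Kuramoto-like coupling WL * sum_j w_ij * (S_j - S_i).

    S is the binary spike vector (1 for cells that fired in the
    current step, 0 otherwise), so each summand takes only the
    values -1, 0, or 1 before weighting.
    """
    return WL * (W @ S - W.sum(axis=1) * S)
```

For a row-normalized W, a silent cell surrounded by firing neighbors receives +WL, while a firing cell with silent neighbors receives −WL, which is the push toward common spike times.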
For the first spikes to be elicited, an excitatory feedforward input from the stimulus is used. Such a contribution will have the usual form $W_{FF}\,\xi_i$, $W_{FF}$ being another overall constant. In our examples, the $\xi_i$ coefficients are binary (0, 1) for (dark, clear) pixels. Since a single spike is enough to start the process, we have set the $W_{FF}$ value close to the excitability limit for the employed neuron model. Calculation of this limit follows from a study of the dynamical equations around the bifurcation point (Izhikevich, 2007). In view of our neuron type, we set $W_{FF}$ (in current units) just above that limit, and for the horizontal part we adopt a $W_L$ that proves to be a suitable choice. In general, the synchronizing performance depends on the ratio between the strengths of these couplings, that is, on the value of $W_L/W_{FF}$.
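The excitability limit mentioned above can be made concrete under a standard simplifying assumption: if $u$ is taken to sit on its nullcline $u = bv$, the resting state of the simple model disappears (a saddle-node bifurcation) once $0.04v^2 + (5-b)v + 140 + I$ loses its real roots. A sketch, with $b = 0.2$ (the fast-spiking value from Izhikevich, 2003) as an assumed default:

```python
def rheobase(b=0.2):
    """Minimal constant current at which the resting state of
    Izhikevich's simple model disappears, assuming u stays on its
    nullcline u = b*v.  The quadratic 0.04 v^2 + (5-b) v + 140 + I
    has no real root once (5-b)^2 < 0.16 * (140 + I), giving
    I_min = (5-b)^2 / 0.16 - 140.
    """
    return (5.0 - b) ** 2 / 0.16 - 140.0
```

For $b = 0.2$ this gives a limit of 4 current units, so a feedforward constant just above this value suffices to elicit the first spikes.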
3 Perceptual Grouping
Interesting initial conditions should involve some sort of stochastic asynchrony so that the success of the synchronization process can be manifestly exhibited by overcoming that initial randomness. At every site we have set the same initial values for the V and u Izhikevich variables, but the cell starts its activity at some randomly chosen time within a given window. Starting times are uniformly distributed within the span of that interval. Such a method is like setting random phase offsets (generalizing the concepts of period and phase to spiking neurons). An alternative might be to start from spatially uniform values, present a random mask in order to cause the initial asynchrony, and then replace the mask by the studied image.
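The random-onset initialization described above can be sketched as follows; the function name and the use of NumPy's generator are illustrative assumptions.

```python
import numpy as np

def random_onsets(n, window, rng=None):
    """Uniformly distributed activation times in [0, window) for n
    cells, mimicking random phase offsets: every cell shares the same
    initial V and u but starts evolving at its own drawn time."""
    rng = np.random.default_rng(rng)
    return rng.uniform(0.0, window, size=n)
```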
Time-correlated spiking can often be graphically noticed. When spike groups are sensibly gathered into separate time windows, their sites are assigned to different subobjects. It would be appealing to separate spike groups on the basis of their relative phases, but at least an approximate knowledge of the spiking period would be required. A possibility would be to use a reference neuron for signaling the start of every cycle. A decision on one subobject or the other would depend on the spike time relative to the start of each cycle, for example, within the first half or the second half. Since the length of the current cycle cannot be known until its end is reached, the working assumption that every cycle is approximately equal to the previous one would be needed.
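One hedged way to implement the window-based assignment sketched above is to sort one cycle's spike times and cut wherever a gap exceeds a tolerance; the helper below is an illustrative assumption, not the letter's procedure.

```python
def group_by_windows(spike_times, gap):
    """Assign cells to subobjects by splitting one cycle's spike
    times wherever consecutive (sorted) times differ by more than
    `gap`.  Returns one integer label per cell."""
    order = sorted(range(len(spike_times)), key=lambda i: spike_times[i])
    labels = [0] * len(spike_times)
    g = 0
    for a, b in zip(order, order[1:]):
        if spike_times[b] - spike_times[a] > gap:
            g += 1          # a new time window starts here
        labels[b] = g
    return labels
```

Cells whose spikes fall in the same window share a label, matching the idea that separate spike groups stand for separate subobjects.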
The network architecture is shown in Figure 1, where the vertical line at the bottom indicates the feedforward input from the visual stimulus and the cyclic line stands for lateral interactions.
The resulting form of the synchrony measure as a function of $W_L/W_{FF}$ is depicted in Figure 2 (bottom). Not surprisingly, the performance improves as the $W_L/W_{FF}$ ratio is increased.
For the image in Figure 3, the process is sometimes the same as in Figure 2A, but in the remaining cases (4 to 6 out of every 10 trials), two subobjects are formed; that is, two distinct spike groups become visible. The length of the time interval between the formed groups can vary substantially: in the top plot they are very close, while in the bottom plot they are almost in antiphase.
Time correlations can be considered, as shown in Figure 4. For sites in the same subobject, the cross-correlation plot has a peak close to zero (in the lag axis), while for points in different subobjects, there is no peak at the origin and the nearest one indicates the observed time shift.
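Such cross-correlation plots can be approximated by the lagged products of binary spike trains; this sketch assumes time-discretized trains and is not the letter's exact estimator.

```python
import numpy as np

def spike_xcorr(s1, s2, max_lag):
    """Cross-correlation C(L) = sum_t s1[t] * s2[t + L] of two binary
    spike trains for lags -max_lag..max_lag.  A peak at lag 0
    indicates synchrony; an off-center peak indicates a time shift."""
    lags = list(range(-max_lag, max_lag + 1))
    out = []
    for L in lags:
        if L >= 0:
            out.append(float(np.dot(s1[:len(s1) - L], s2[L:])))
        else:
            out.append(float(np.dot(s1[-L:], s2[:len(s2) + L])))
    return lags, out
```

For two trains shifted by a fixed delay, the maximum of C falls at that delay rather than at the origin, mirroring the behavior described for points in different subobjects.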
More than two subobjects can be segregated, as illustrated by the examples shown in Figures 5 and 6, where three and six spike groups are formed, respectively. The number of distinct groups is often smaller than the number of subobjects because two or more can eventually achieve synchrony among themselves; in the language of Mirollo and Strogatz (1990), absorptions may take place. For six squares, there are 15 possible ways of having synchronized pairs, 20 ways of having synchronized triplets, and so on.
The result of applying our model to a well-known example from Wang (1995) is illustrated by Figure 7. Predictably, synchrony is not as perfect as in a continuous model, but it is noticeable in the form of short separate firing intervals. For rounds of 10 trials, we usually obtain distinct spike groups for the two objects on three or four occasions, and in the remaining cases there is no separation. When we include the Hebbian factor in W, separation typically takes place in 7 or 8 out of 10 trials.
4 Contour Integration
The visual system often faces the task of gathering different elements into meaningful global features in order to infer the presence of objects. Sometimes these local features group into two-dimensional regions, as in texture segmentation. On other occasions, they group into contour lines, which may represent boundaries. When visual input has already gone through a number of orientation-selective fields (in the sense of Gray & Singer, 1989) and a convenient architecture has been established for the horizontal intracortical connections, the edge segments forming contours oscillate in synchrony. (For a detailed review, see also Li, 1998.)
In contour integration tasks, interactions have to be limited by angular differences as well as by distance scales. Therefore, after assuming orientation-coded stimuli, it will be sensible to employ weights of the type $w_{ij} \propto \Theta(R - d(i,j))$, further restricted by angular conditions of the form $|\theta_i - \theta_j| < \delta$ and $|\theta_i - \psi_{ij}| < \delta$, where $\psi_{ij}$ is the angular direction linking sites $i$ and $j$; $\theta_i$, $\theta_j$ indicate the orientations of the stimulus segments at sites $i$, $j$; and $\delta$ is some angular threshold or tolerance. Step functions might be hypothetically replaced by more elaborate tuning curves. Note that the horizontal interactions now depend on the stimulus through the $\theta$ variables. In our orientation coding scheme, the $\theta$'s may take on the angular values $k\pi/n_a$, $k = 0, \ldots, n_a - 1$, all of them in radians, $n_a$ being the chosen number of angular sectors (a fixed value of $n_a$ was adopted). Again, $W$ is row-normalized.
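A sketch of weights limited by both distance and angle, under our reading of the angular conditions (agreement of the two segment orientations, and alignment with the direction joining the sites, both within a tolerance); the letter's exact conditions may differ, and orientations are treated here as axial (compared modulo π).

```python
import numpy as np

def contour_weights(pos, theta, R, delta):
    """Row-normalized weights limited by distance and angle.

    w_ij is 1 when d(i,j) <= R, the segment orientations theta_i and
    theta_j agree within delta, and theta_i is aligned (within delta)
    with the direction psi_ij of the line joining the sites.
    """
    def axial(a, b):
        # Smallest difference between two axial (mod-pi) angles.
        d = np.abs(a - b) % np.pi
        return np.minimum(d, np.pi - d)

    n = len(pos)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            dist = np.hypot(dx, dy)
            psi = np.arctan2(dy, dx)      # direction linking the sites
            if (dist <= R and axial(theta[i], theta[j]) < delta
                    and axial(theta[i], psi) < delta):
                W[i, j] = 1.0
    row = W.sum(axis=1, keepdims=True)
    return np.where(row > 0, W / np.where(row > 0, row, 1.0), 0.0)
```

Collinear, like-oriented segments end up mutually coupled, while a nearby segment of discrepant orientation receives an empty row and stays outside the contour assembly.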
In every 10 trials, we typically obtain similar synchronizations five or six times. After evaluating a synchrony measure analogous to that of equations 3.1, we obtain, for fixed $W_{FF}$ (in current units) and for several $W_L/W_{FF}$ ratios (taking epochs of 50 simulations), the values listed in Table 2. The results seem to indicate the presence of an optimal region around ratio values of 15 to 20, as further increases lead to lower performance.
It is also interesting to consider the case where two disjoint contours are present. Quite often, good synchronization takes place for only one of them, as shown in Figure 10 (top), and in a smaller number of cases both of them synchronize separately (see Figure 10, bottom).
5 Effect of Noise
Noise effects on visual perception have been previously studied (see Terman & Wang, 1995). It is not farfetched to say that in real situations, noise is inevitable. In order to examine the robustness of the proposed model, a random contribution is added to the considered inputs. Randomness is here represented by white gaussian noise of given amplitude added to the Inp inputs of equations 2.1 or 4.1. This contribution randomly perturbs the strength of the net applied current.
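The noisy input can be sketched as follows; scaling a standard normal sample by the amplitude is an assumption about what "given amplitude" means here.

```python
import numpy as np

def noisy_input(inp, amplitude, rng=None):
    """Add white gaussian noise of the given amplitude to the net
    applied currents, independently per cell and per time step."""
    rng = np.random.default_rng(rng)
    return inp + amplitude * rng.standard_normal(np.shape(inp))
```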
The consequences of introducing this type of noise in the grouping task of Figure 2A and in the contour integration task of Figure 9 are illustrated by Figure 12, which depicts the values of our synchronicity parameter for increasing noise amplitudes.
As in the noiseless situation, the averages were taken over epochs of (25, 50) runs for the (grouping, contour) task. In both cases, we observe that noise brings about a continuous and monotonic degradation of the performance; as far as we can see, there is no optimal noise amplitude greater than zero. No facilitation effect can be clearly spotted, and the possible benefits of stochastic resonance appear unlikely (although one may speculate that noise with specific power spectra might lead to other situations, as indicated by Nozaki, Mar, Grigg, & Collins, 1999). The corresponding noiseless values are those reported in Tables 1 and 2. As one can see, the noisy results come close to the noiseless performance only for the smallest amplitudes. In this respect, the considered mechanism can be called robust in only the narrowest sense.
6 Alternative Model
The possibility of synchronization by spike-dependent discrete inputs in a network of Izhikevich neurons was exhibited in the pulse-coupled network of Izhikevich (2003), although no specific visual task was considered there. All of our simulated neurons are of the same type, and no propagation delays have been included (although our initial setup somehow amounts to the presence of stochastic delays). The novelty of this work lies in the particularly simple, discrete form of our horizontal interaction and in its application to the studied grouping and integration tasks.
The employed neuron model has proved convenient for a number of reasons, but it is not mandatory for achieving success. Our discrete couplings may be suitable for schemes with the same architecture and spiking neurons of other types. In particular, we have maintained the same network structures as in Figures 1 and 8, replacing the Izhikevich neurons by an integrate-and-fire model of the form $C\,dV/dt = -gV + \mathrm{Inp}$, with a capacitance $C$ (in current units × ms/mV), a leak coefficient $g$ (in current units/mV), a peak voltage $V_{peak}$ (in mV), and a reset voltage $V_{reset}$ (in mV). The purely excitatory weights $W_{FF}$ and $W_L$ were expressed in current units, with suitable values adopted for the grouping task and for the contour task. Our resulting simulations are shown in Figures 13, 14, and 15, which are analogous to those in Figures 2A, 3, and 9, respectively, and exhibit the adequacy of the alternative model for the considered tasks.
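One step of such an integrate-and-fire cell with spike and reset can be sketched as follows; all parameter values here are placeholders, since the letter's values are not reproduced above.

```python
import numpy as np

def lif_step(V, inp, dt=0.1, C=1.0, g=0.1, V_peak=30.0, V_reset=-65.0):
    """One Euler step of a leaky integrate-and-fire population,
    C dV/dt = -g V + inp, with spike/reset at V_peak -> V_reset.
    All parameter values are illustrative placeholders.

    Returns the updated potentials and the binary spike vector, which
    can feed the same discrete lateral coupling used in the main model.
    """
    V = V + dt * (-g * V + inp) / C
    S = (V >= V_peak).astype(float)       # binary spikes
    V = np.where(S > 0, V_reset, V)       # reset the cells that fired
    return V, S
```

Because the lateral interaction only consumes the binary spike vector, swapping the neuron model leaves the coupling scheme untouched.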
On the whole, there are strong indications that oscillator networks provide a general and effective mechanism for scene segmentation. It is sensible to wonder whether substantial changes should be expected when using the different available models as binding mechanisms. At the level of the basic oscillator, a number of distinct models, properly coupled, can all meet the computational demands. Hence, the details of an individual oscillator do not matter as long as the model fulfills these computational requirements. In addition to the illustration we have provided, an example was offered by Campbell and Wang (1999) and Campbell, Wang, and Jayaprakash (1999), where a segmentation task achievable by a network of relaxation oscillators can be similarly accomplished using a network of spike oscillators.
Approximate population synchronization of coupling-induced firings, in the form of short time windows of spiking activity, may lead to the emergence of synchronous brain rhythms (Buzsáki, 2006), which contribute to sensory perception (Wang, 2010). States of attention and expectancy are usually accompanied by a general increase of synchronous activity. For instance, before the appearance of a reported stimulus, neural activity is stronger and more correlated than before an unreported one, indicating that the strength of the activity and the connectivity in the primary visual cortex participate in the perceptual processing of stimulus information (Supèr et al., 2003; van der Togt et al., 2005, 2006). On the whole, the computational functions of synchronous oscillations and their role in cognition are still an issue of contention among neuroscientists. Our work highlights the fact that couplings do not always require continuous functions: a discrete model with adequate connectivity can go so far as to account for contour integration tasks. Despite the apparent crudeness of the three-valued synaptic inputs that we have considered and the absence of gating variables, our designs show moderate robustness to noise and yield satisfactory results at the studied level of approximation.
We thank Alejandro Lerer for useful discussions.