Abstract

The human visual system has the remarkable ability to recognize objects largely invariant of their position, rotation, and scale, an ability that is likely built up step by step from early to late areas of visual processing. One way to interpret the neurobiological findings is through a computational model that simulates the signal processing of the visual cortex. While several algorithms have been proposed for learning feature detectors, only a few studies address the biologically plausible learning of such invariance. In this study, a set of Hebbian learning rules based on calcium dynamics and homeostatic regulation of single neurons is proposed. Their performance is verified within a simple model of the primary visual cortex that learns so-called complex cells from a sequence of static images. As a result, the learned complex-cell responses are largely invariant to phase and position.

1.  Introduction

Object recognition, that is, the ability to perceive, recognize, and distinguish objects in the real world, is one of the most remarkable properties of the human visual system. Although researchers have made great strides in the development of artificial vision systems over the past few decades, these systems are still no match for the huge variety of tasks the human brain can perform. Despite the fact that objects can vary in form, texture, size, color, and other characteristics, the brain can recognize them effortlessly even from different viewpoints, against different backgrounds, or when partially occluded. One possible key to successful object recognition in artificial vision systems lies in studying the underlying principles of the visual system of the primate brain.

Fortunately, the primate visual system has been well investigated functionally, anatomically, and computationally. It is widely accepted that the visual system achieves some degree of invariance gradually from early visual areas to high-level areas. The first area in the cortical visual system where invariant cell properties can be found is the primary visual cortex (V1). The behaviors of V1 cells were first explored by Hubel & Wiesel, who coined the terms simple and complex cells. The edge-detecting ability of V1 simple cells gives important information about the structure of the visual input. The receptive fields of simple cells consist of excitatory and inhibitory regions (Hubel & Wiesel, 1962), and their arrangement can be well described by Gabor filters (Jones & Palmer, 1987). In contrast to simple cells, complex-cell responses cannot be fully described by a simple map of inhibitory and excitatory regions. Optimal stimuli for complex cells do not have to be at a special position in their receptive fields, that is, complex cells are slightly invariant to position and phase (Hubel & Wiesel, 1962; De Valois, Albrecht, & Thorell, 1982; Adelson & Bergen, 1985; Carandini et al., 2005). This behavior can be explained as deriving from simple-cell behavior, that is, complex cells in layer 2/3 of the primary visual cortex obtain their inputs from many simple cells in layer 4 with a similar orientation tuning (Hubel & Wiesel, 1962).

There are many approaches for learning simple-cell properties (Olshausen & Field, 1996; Bell & Sejnowski, 1997; van Hateren & van der Schaaf, 1998; Hoyer & Hyvärinen, 2000; Falconbridge, Stamps, & Badcock, 2006; Hamker & Wiltschut, 2007; Rehn & Sommer, 2007; Weber & Triesch, 2008; Wiltschut & Hamker, 2009), but only a few for learning the invariance properties of complex cells using temporal correlations in the input. It has been shown that invariance properties can be learned by Hebbian learning with the additional constraint of an activity trace, often modeled by using an (artificial) history of previous activations in place of the current activation in Hebb-type learning rules (Földiák, 1991; Wallis & Rolls, 1997; Einhäuser, Kayser, König, & Körding, 2002; Spratling, 2005). Other approaches use an objective function that minimizes the difference of the output units between two consecutive inputs (Kayser, Einhäuser, Dümmer, König, & Körding, 2001; Körding, Kayser, Einhäuser, & König, 2004; Berkes & Wiskott, 2005) or, similarly, maximizes the sparseness of this difference (Hashimoto, 2003). Berkes and Wiskott (2005) applied slow feature analysis (SFA; Wiskott & Sejnowski, 2002) to determine the parameters of polynomial functions by optimizing the slowness of their responses under slight image transformations. Kayser et al. (2001) and Körding et al. (2004) used subspace energy detectors and learned their receptive fields by optimizing the temporal stability of the output, as did an earlier approach by Kohonen (1996). All of these approaches exploit temporally coherent input because such coherence is present in the real world.
Other research has shown that complex-cell properties can emerge without using temporal correlations in the input (Hyvärinen & Hoyer, 2000, 2001; Osindero, Welling, & Hinton, 2006; Karklin & Lewicki, 2009; Köster & Hyvärinen, 2010). In independent subspace analysis (Hyvärinen & Hoyer, 2000) and topographic ICA (Hyvärinen & Hoyer, 2001), pooling residual dependencies between linear filters leads to units with complex-cell-like properties. Köster and Hyvärinen (2010) unify these approaches using weight estimation by score matching (Hyvärinen, 2005), an estimation principle for energy-based models; the model learns to connect features with similar orientation and frequency but differing phase. Similarly, Osindero et al. (2006) use a product of Student-t (PoT) approach to model the statistical structure of natural image data. Karklin and Lewicki (2009) show that a neuronal population encoding the statistical distribution of natural images also exhibits complex-cell and V2-cell properties. Besides these approaches, Stringer, Perry, Rolls, and Proske (2006), using Hebbian learning, report that it is possible to learn invariance from spatial rather than temporal continuity. Given this previous work, it remains an open question whether the visual cortex exploits temporal coherence in the input to learn invariance properties.

While much previous work has used artificial data sets such as bar problems (Földiák, 1991; Spratling, 2005; Stringer et al., 2006) or seminatural stimuli such as faces on different backgrounds (Wallis & Rolls, 1997; Stringer & Rolls, 2000), more recent work is based on natural images and video sequences (Kohonen, 1996; Hyvärinen & Hoyer, 2000, 2001; Kayser et al., 2001; Einhäuser et al., 2002; Hashimoto, 2003; Körding et al., 2004; Berkes & Wiskott, 2005; Osindero et al., 2006; Karklin & Lewicki, 2009; Köster & Hyvärinen, 2010). Despite this advancement, previous algorithms have been tailored to learn only a subset of weights simultaneously and often require a trial-based design.

In this letter, we focus on the learning of shift and phase invariance properties comparable to those of complex cells in area V1, using a sequence of slightly shifted static natural images. We propose a new learning rule based on the conceptual design of a previously developed learning rule for simple cells (Wiltschut & Hamker, 2009). The previous rule built on ideas of Hebbian learning (Oja, 1982), covariance learning (Sejnowski, 1977), and anti-Hebbian decorrelation (Földiák, 1990) and led to largely independent responses of V1 simple cells when trained on natural scenes. Our new rule expands these ideas by incorporating aspects of BCM learning (Bienenstock, Cooper, & Munro, 1982; Shouval, Castellani, Blais, Yeung, & Cooper, 2002; Yeung, Shouval, Blais, & Cooper, 2004; Castellani, Quinlan, Bersani, Cooper, & Shouval, 2005), particularly by using the level of calcium, rather than neural activity, in the learning rules. Moreover, learning in this model is fully continuous, and no reset or change of values between successive presentations is used.

2.  Model

2.1.  Architecture.

The model (see Figure 1) consists of two layers, the V1-Simple layer and the V1-Complex layer. The V1-Simple layer is simulated using Gabor functions (Jones & Palmer, 1987) to resemble properties of layer 4 simple cells in the primary visual cortex. The V1-Complex layer of the model could refer to layer 2/3 of the primary visual cortex, in which the complex cells have mostly been found (Hubel & Wiesel, 1962). Within a patch, all simple cells are connected to complex cells, initially by random weights, and learning changes these weights over time.
Figure 1:

Scheme of the model architecture. The input image is filtered by a set of Gabor functions. Small patches are cut out of the Gabor-filtered images, which represent the V1-Simple cell responses. This simple-cell population provides the input for the V1-Complex layer, where the learning of invariance, using Hebbian and anti-Hebbian learning mechanisms, occurs.


2.2.  V1-Simple.

For simplicity, the activations of the V1-Simple neurons are obtained from preprocessed activation maps. To precalculate these activation maps, every image in the image data set is convolved with a set of Gabor functions:
formula
2.1
Edge-like, oriented Gabor filters with n (n = 4 or 8) equidistant orientations θ (e.g., 0°, 45°, 90°, 135°) at four different phases ψ, with constant frequency f and Gaussian extent σ (σx = 2.4, σy = 3.2), are used. This set of Gabor functions is applied as an 11 × 11 pixel convolution kernel at every position of every image. As a result of the convolution, 16 (resp. 32) maps of simple-cell responses per image are obtained. These maps are normalized to the admissible range [0, 1]. To avoid boundary effects, the convolution is calculated only in the valid inner area of an image.
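The preprocessing step can be sketched as follows. This is a minimal illustration, not the paper's code: the frequency f and the exact grid layout of the kernel are assumptions (the text fixes only σx, σy, the kernel size, and the orientation/phase counts), and the min-max normalization to [0, 1] is one plausible reading of "normalized to the admissible range".

```python
import numpy as np

def gabor_kernel(theta, psi, f=0.25, sx=2.4, sy=3.2, size=11):
    # 11 x 11 Gabor kernel (cf. equation 2.1); f = 0.25 is an illustrative
    # assumption, sx/sy and the kernel size follow section 2.2
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 / (2 * sx**2) + yr**2 / (2 * sy**2)))
    return envelope * np.cos(2 * np.pi * f * xr + psi)

def simple_cell_maps(img, n_orient=4):
    # 'valid' convolution for n_orient orientations x 4 phases; each map is
    # normalized to [0, 1], and only the valid inner image area is used
    size = 11
    h, w = img.shape
    maps = []
    for k in range(n_orient):
        for p in range(4):
            ker = gabor_kernel(np.pi * k / n_orient, np.pi * p / 2)
            out = np.zeros((h - size + 1, w - size + 1))
            for i in range(out.shape[0]):
                for j in range(out.shape[1]):
                    out[i, j] = np.sum(img[i:i + size, j:j + size] * ker)
            out -= out.min()
            if out.max() > 0:
                out /= out.max()
            maps.append(out)
    return maps
```

With n_orient = 4 this yields the 16 maps per image described above (32 for eight orientations).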

2.3.  V1-Complex.

The neurons in the V1-Complex layer are driven by the V1-Simple layer neurons. Their activation function updates the firing rate rj of a neuron j by summing the weighted inputs of the neuron and discounting this feedforward activation by lateral inhibition. The inhibition is calculated as a sum over the nonlinearly weighted firing rates of all other neurons in the layer. The membrane potential mj of each neuron j is described by a differential equation,
formula
2.2
with the nonlinearity function
formula
and the output function
formula
to ensure that the firing rate is not negative, (mj)+ = max(mj, 0). Additionally, the following constraint is used to ensure that the output firing rate cannot go beyond the admissible range [0, 1.5]:
formula
wij is the weight from the presynaptic neuron i to the postsynaptic neuron j, and ckj is the weight from a neuron k to neuron j within the same layer. dnl = 0.8 is a constant modulation factor for the nonlinearity function f(x). The time constant of the temporal dynamics is τr = 10 ms. The V1-Complex layer is simulated using 32 neurons (4 orientations) or 64 neurons (8 orientations).
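A single Euler step of these dynamics can be sketched as below. The quadratic form f(x) = dnl · x² is an assumption for illustration (the paper only states that f is a nonlinearity modulated by dnl); the rectification and the [0, 1.5] bound follow the output constraints described above.

```python
import numpy as np

def complex_step(m, s, W, C, dt=1.0, tau_r=10.0, d_nl=0.8):
    """One Euler step of the V1-Complex dynamics (sketch of equation 2.2).

    m : membrane potentials of the complex cells, s : simple-cell input,
    W : feedforward weights (inputs x cells), C : lateral weights (cells x cells).
    """
    r = np.clip(np.maximum(m, 0.0), 0.0, 1.5)   # output: (m)+, capped at 1.5
    f = d_nl * r ** 2                           # assumed nonlinearity f(x)
    inhibition = C.T @ f - np.diag(C) * f       # sum over all *other* neurons k != j
    dm = (-m + W.T @ s - inhibition) / tau_r
    m_new = m + dt * dm
    r_new = np.clip(np.maximum(m_new, 0.0), 0.0, 1.5)
    return m_new, r_new
```

Iterating this step with a fixed input lets the rates relax toward their stationary values under lateral competition.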

2.4.  Neuronal Calcium Level.

It has been shown that many forms of bidirectional synaptic plasticity (long-term potentiation, LTP, and long-term depression, LTD) are calcium dependent (Malinow, Schulman, & Tsien, 1989; Daw, Stein, & Fox, 1993; Pettit, Perlman, & Malinow, 1994; Lledo et al., 1995; Hu et al., 2001). Calcium/calmodulin-dependent protein kinase kinase (CaMKK) signaling following NMDA receptor activation has been identified as a cell-autonomous homeostatic regulator of synaptic strength in response to activity (Goold & Nicoll, 2010), such that potentiation and depression of synaptic connection strength depend on the level of calcium at the corresponding synapses (Cummings, Mulkey, Nicoll, & Malenka, 1996; Yang, Tang, & Zucker, 1999; Cho, Aggleton, Brown, & Bashir, 2001; Cormier, Greenwood, & Connor, 2001). These and other electrophysiological studies have led to a framework of calcium-based learning (Lisman, 1989; Shouval, Castellani et al., 2002; Shouval, Bear, & Cooper, 2002), which suggests that the intracellular calcium concentration influences the strength and temporal dynamics of neuronal learning (Shouval, Castellani et al., 2002). In this model, we use the well-established assumption that the level of calcium at the synaptic sites depends on the corresponding cell activations:
formula
2.3
In general, the calcium level follows the firing rate of the corresponding cell, but more slowly; as such, it encodes a trace of neural activity. We simulated the model with a small time constant (τCa,Simple = 10 ms), where calcium quickly follows the neural rate, and a large time constant (τCa,Complex = 500 ms), which leads to a long-lasting calcium trace.
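The trace behavior can be illustrated with a simple low-pass filter; the first-order form τCa · dCa/dt = −Ca + r is an assumed reading of equation 2.3 consistent with the description of calcium lagging the rate.

```python
import numpy as np

def calcium_trace(rates, tau_ca, dt=1.0):
    # Low-pass filter of the firing rate (assumed form of equation 2.3,
    # tau_Ca * dCa/dt = -Ca + r): calcium lags the rate and stores a trace.
    ca = 0.0
    trace = []
    for r in rates:
        ca += dt * (r - ca) / tau_ca
        trace.append(ca)
    return np.array(trace)
```

With τCa = 10 ms the trace tracks the rate almost immediately, whereas with τCa = 500 ms it rises slowly and persists long after the stimulus is gone.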

2.5.  Calcium-Dependent Hebbian Learning.

2.5.1.  Time Constant for Calcium-Dependent Synaptic Change.

The time constant for the synaptic change, τLearn,k, determines the speed of the synaptic change dependent on the postsynaptic calcium level Capostk of cell k. This function has been proposed by Shouval, Castellani et al. (2002) as an appropriate description of the relation between calcium and the strength of synaptic changes found in biological recordings. At a higher calcium level in the neuron, a larger alteration of synaptic efficacy is observed than at a low calcium level,
formula
2.4
where a = 5000 and b = 30,000 are parameters giving the lower and upper bounds of the learning rate, and c = 10 defines the decay of the exponential function.
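One functional form consistent with this description is sketched below; note that the exact expression of equation 2.4 is not reproduced in the text, so the exponential form here is an assumption that merely matches the stated bounds (slow learning, τ near a + b, at low calcium; fast learning, τ approaching a, at high calcium).

```python
import math

def tau_learn(ca_post, a=5000.0, b=30000.0, c=10.0):
    # Sketch of equation 2.4 under an assumed functional form: the learning
    # time constant decays exponentially with the postsynaptic calcium level,
    # from a + b at zero calcium down toward the lower bound a.
    return a + b * math.exp(-c * ca_post)
```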

2.5.2.  Calcium-Dependent Synaptic Change.

The design of this learning rule follows the ideas of normalized Hebbian learning using a covariance rule (Sejnowski, 1977; Oja, 1982; Wiltschut & Hamker, 2009), but also includes the biological considerations of calcium dependency (Shouval, Castellani et al., 2002; Yeung et al., 2004; Castellani et al., 2005), where the amount and speed of learning are directly dependent not on the cellular activity but on the synaptic calcium level.

Equation 2.5 describes the synaptic weight change as a process dependent on the pre- and postsynaptic calcium levels (see equation 2.3) and on the pre- and postsynaptic calcium thresholds:
formula
2.5
formula
The weight change is driven by the pre- and postsynaptic calcium levels, each reduced by its corresponding threshold. Furthermore, the velocity of the synaptic change is given by the calcium-dependent learning rate (see equation 2.4). The thresholds are population means, calculated over the corresponding calcium levels of the neuronal populations (here, V1-Simple and V1-Complex). Moreover, the factor for weight normalization is not static: the weights are constrained toward a maximum firing rate by the α constraint, where α is adaptive and given by equation 2.7.
Similar to the BCM theory, whether LTP or LTD occurs depends on the postsynaptic calcium level (see equation 2.5). If the postsynaptic calcium level of a neuron is higher than the population mean of the postsynaptic calcium levels, LTP occurs or, depending on the presynaptic calcium level, heterosynaptic (resp. homosynaptic) LTD. If the postsynaptic calcium level is below the population mean but the presynaptic level is above its mean, homosynaptic LTD occurs, similar to BCM learning. Thus, cells that are not significantly excited (or that lose the competition against other cells) slowly decrease their connection strength, but only for those parts of the input that drive the other, strongly firing cells. Because LTD happens on a much slower timescale than LTP (due to the calcium-dependent learning rate), these cells slowly decrease their connections to the specific input configurations that other cells prefer. The alpha constraint is applied only when it does not amplify the weight change:
formula
2.6

2.5.3.  Metaplasticity and Homeostatic Regulation.

Metaplasticity refers to mechanisms that regulate neural parameters, such as synaptic weights, in dependence on other parameters (Abraham & Bear, 1996). A number of recent studies support the idea of homeostatic regulation (Turrigiano, Leslie, Desai, Rutherford, & Nelson, 1998; Desai, Cudmore, Nelson, & Turrigiano, 2002). Neurons seem to stabilize their firing rate within a certain target range through global homeostatic regulation of synaptic strength: increased activity of a cell results in a reduction of its sensitivity, and reduced activity is followed by an enhancement of its sensitivity. Evidence supports a synaptic scaling mechanism in which, unlike in Hebbian learning, all synapses onto a cell are scaled up or down (Turrigiano & Nelson, 2004). The process operates over hours to days and seems to be linked to activity or input current sensors in the neuron (Marder & Prinz, 2002; MacLean, Zhang, Johnson, & Harris-Warrick, 2003). Despite strong evidence for a homeostatic regulation of synaptic strength, little is known about the induction mechanisms, and little computational work exists so far.

We here use a very simple but effective mechanism of synaptic scaling. In the Oja (1982) learning rule, the squared weights relax over time to a fixed norm. We introduce a dependency of α on the firing rate of a neuron, αk(rk),
formula
2.7
with
formula
and
formula
2.8
with
formula
where the increase of αk is determined by Hk and its decrease by a small constant, 0.0005. Hk increases (see equation 2.8) if the firing rate rk is above a certain threshold, γ = 0.7, and decreases by its own value and a small constant, K = 0.05. As a consequence of this mechanism, αk reaches a value that restricts the growth of the weights, so that maximum firing rates are kept close to γ. αk increases only as long as the firing rate exceeds the γ threshold and decreases slowly enough that it remains nearly stable over a sufficiently long period of time. The speed of adaptation is given by the time constants τα = 10,000 ms and τH = 100 ms.
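The interplay of the two variables can be sketched as below. Since equations 2.7 and 2.8 are not reproduced in the text, the exact coupling here is reconstructed from the verbal description only and should be read as an assumption, not as the paper's equations.

```python
def homeostasis_step(alpha_k, H_k, r_k, dt=1.0, tau_alpha=10000.0, tau_H=100.0,
                     gamma=0.7, K=0.05, alpha_decay=0.0005):
    # Assumed reading of equations 2.7/2.8: H_k rises while r_k exceeds gamma
    # and otherwise decays (by its own value plus K); alpha_k integrates H_k
    # and decays by the small constant 0.0005, on a much slower timescale.
    dH = ((1.0 if r_k > gamma else 0.0) - H_k - K) / tau_H
    dA = (H_k - alpha_decay) / tau_alpha
    return alpha_k + dt * dA, max(H_k + dt * dH, 0.0)
```

Under sustained high firing, αk grows and tightens the weight constraint; for a quiet cell, it slowly relaxes.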

2.6.  Anti-Hebbian Learning of Lateral Inhibitory Connections.

We use anti-Hebbian learning to learn the lateral inhibitory weights, such that a cell inhibits another cell if they often fire together. This mechanism leads to statistically independent responses and a sparse code (Wiltschut & Hamker, 2009).

2.6.1.  Time Constant for Anti-Hebbian Learning.

A time constant τc,k that depends on the output firing rate rk,
formula
2.9
is introduced to speed up the depletion of the inhibitory weights of a neuron k. This time constant for the anti-Hebbian weight changes is determined by the parameters a = 10,000, b = 0.5, and c = 15, where a is the main time constant and b is a reduction factor. The parameter c controls the decay of the exponential function, that is, the amount by which τc decays dependent on the firing rate rk reduced by the threshold γc = 0.3. γc denotes the correlation threshold of the anti-Hebbian weight change.

2.6.2.  Learning Rule.

For learning the lateral inhibitory weights ckj, a normalized Hebbian covariance learning rule is used (Wiltschut & Hamker, 2009):
formula
2.10
If two neurons fire together, they increase their mutual inhibitory weight and start to inhibit each other, competing for the input pattern. To decrease the sensitivity to random fluctuations elicited by the input sequence, a threshold γc = 0.3 is introduced: the lateral weights increase only if both neurons fire above γc. The anti-Hebbian weight ckj from cell k to cell j decreases if the efferent neuron fires above threshold and the afferent one below it (αc = 0.1).
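The threshold logic of this rule can be sketched as follows. Since equation 2.10 is not reproduced in the text, this form is reconstructed from the verbal description and is an assumption; the rate-dependent time constant of equation 2.9 is simplified to a fixed τc here.

```python
import numpy as np

def anti_hebb_step(C, r, dt=1.0, tau_c=10000.0, gamma_c=0.3, alpha_c=0.1):
    # Assumed reading of the anti-Hebbian rule: lateral inhibition c_kj grows
    # when both cells fire above gamma_c and shrinks (scaled by alpha_c) when
    # the efferent cell fires above threshold while the afferent stays below.
    above = np.maximum(r - gamma_c, 0.0)
    below = np.maximum(gamma_c - r, 0.0)
    dC = np.outer(above, above) - alpha_c * np.outer(below, above)
    C = np.maximum(C + dt * dC / tau_c, 0.0)
    np.fill_diagonal(C, 0.0)                     # no self-inhibition
    return C
```

Two coactive cells thus build up mutual inhibition, while weights from silent cells onto active cells decay.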
Table 1:
Distances and Their Probability for the Input Patch Shift.

    Distance    Probability
    1 pixel     0.51
    2 pixel     0.25
    3 pixel     0.12
    4 pixel     0.06
    5 pixel     0.03
    6 pixel     0.02
    7 pixel     0.01

3.  Materials and Methods

3.1.  Training.

A set of 10 monochrome images (512 × 512 pixels) of natural scenes, taken from Bruno Olshausen's Sparsenet (http://redwood.berkeley.edu/bruno/sparsenet/), was used in the learning phase. The same image set has been successfully used in the literature to learn from natural scenes (Olshausen & Field, 1996; Rehn & Sommer, 2007; Wiltschut & Hamker, 2009). Each image was normalized to the range [0, 1].

The network was initialized with small, randomly chosen weights and trained for around 1 million presentations. Convergence to stable receptive fields starts after about 500,000 presentations. For learning position invariance at the level of V1, changes in the input from one fixation to the next must be very small, as in fixational eye movements (Dodge, 1907; Zuber, Crider, & Stark, 1964; Martinez-Conde, Macknik, & Hubel, 2004; Rolfs, 2009). Here we follow the observation that during fixation of a certain point in space, the eyes still perform several movements of very small amplitude around the fixation point, leading to slightly different views of the scene in succession. We generated sequences of 50 image patch presentations using the same image but with slightly shifted patch positions, mimicking fixational eye movements (see Table 1). To rule out the possibility that the results depend on too long a sequence length, the model was also tested with a shorter sequence length of 10 consecutive image patches. Furthermore, the ability of the model to maintain orientation selectivity is tested using a simple-cell input representing four and eight orientations.
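The sequence generation can be sketched using the distance distribution of Table 1. The shift direction is not specified in the text, so the uniform choice among the four axis directions below is an illustrative assumption.

```python
import random

# Fixational shift distances and probabilities from Table 1
DISTANCES = [1, 2, 3, 4, 5, 6, 7]
PROBS = [0.51, 0.25, 0.12, 0.06, 0.03, 0.02, 0.01]

def sample_patch_positions(start, n=50, rng=None):
    # Generate a sequence of n patch positions on the same image; each step
    # moves by a Table-1 distance in a (assumed) uniformly chosen direction.
    rng = rng or random.Random(0)
    x, y = start
    positions = [(x, y)]
    for _ in range(n - 1):
        d = rng.choices(DISTANCES, weights=PROBS)[0]
        dx, dy = rng.choice([(d, 0), (-d, 0), (0, d), (0, -d)])
        x, y = x + dx, y + dy
        positions.append((x, y))
    return positions
```

Cutting patches at these positions yields the slightly shifted views of the same scene used for training.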

The differential equations describing the neuronal behavior are integrated using the Euler method. Note that there is no reset of neuron activations or weights between presentations.

3.2.  Circular Response Images.

We use circular response images (see Figure 2) to visualize the phase and orientation tuning of the learned complex cells within a single image. The circular test image is a 256 × 256 pixel image (codomain [0, 0.04]), generated with the same Matlab function as used by Berkes and Wiskott (2005). The inner 15-pixel radius around the center is empty (value 0.02), and emanating from the center is a set of circular sine waves whose frequency decreases logarithmically toward the borders of the image. The frequency spectrum of the test image lies between .

Figure 2:

Circular test image.


The input to each neuron is determined by shifting a patch over the whole test image in 1-pixel steps. White denotes maximal excitation of a neuron to the corresponding part in the test image, and black denotes maximal inhibition. For presentation purposes in small plots, we use discrete rather than continuous gray values. In the resulting images, orientation selectivity can be observed if a neuron responds only to parts of the circles (angular selectivity). Moreover, the excitatory and inhibitory receptive-field components can be seen, as can the simple (resp. complex) cell property: phase-invariant complex cells show a smooth activation profile in the radial direction, whereas simple cells are sensitive to the exact phase and thus show oscillations.

Based on the circular response images, the orientation bandwidth of each cell is determined by measuring the angle of maximal response, using a half-maximum criterion to define the borders. For model evaluation, the mean orientation bandwidth is computed over all cells whose activity exceeds 20% of that of the most active cell.

3.3.  Relative Modulation Index.

The relative modulation index has been previously applied in electrophysiological studies (De Valois et al., 1982). Complex cells respond invariantly to gratings shifted in phase, whereas simple cells respond to phase shifts with large oscillations, as measured by the ratio of the modulation response (F1) to the mean firing rate (F0) for a grating of optimal frequency and orientation (Skottun et al., 1991; Einhäuser et al., 2002; Johnson, Hawken, & Shapley, 2008; Berkes, Turner, & Sahani, 2009). Relative modulation index values above one indicate that the neuron has simple-cell characteristics; lower values indicate complex-cell characteristics.

The same Gabor functions as for the preprocessing of the input images are used as test stimuli (codomain [0, 0.04]). The modulation response has been determined through a Fourier analysis, where the modulation response (F1) is the first harmonic of the Fourier-transformed response. The reported relative modulation index values for the models with short and long calcium trace lengths are averaged across 10 different runs.
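The F1/F0 computation described above can be sketched as follows, using the first harmonic of the Fourier-transformed response; the response array here is a hypothetical recording of one neuron over one full phase cycle.

```python
import numpy as np

def relative_modulation(response):
    # F1/F0 ratio (section 3.3): amplitude of the first harmonic of the
    # response over one phase cycle, divided by the mean rate. Values above
    # one indicate simple-cell-like, below one complex-cell-like behavior.
    resp = np.asarray(response, dtype=float)
    f0 = resp.mean()
    f1 = 2.0 * np.abs(np.fft.rfft(resp)[1]) / len(resp)   # first-harmonic amplitude
    return f1 / f0
```

A half-wave-rectified sinusoidal response (phase sensitive) yields a ratio above one, while a flat, phase-invariant response yields a ratio near zero.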

3.4.  Slowness.

Invariant sensory coding can also be measured by the slowness with which neural responses follow changes in the input (Wiskott & Sejnowski, 2002; Einhäuser et al., 2002; Berkes & Wiskott, 2005). In our design, the more invariant a cell is to changes in stimulus position, the more slowly it should respond to variations in the input sequence. While some previous work has used slowness as an additional optimization criterion, here, as in Einhäuser et al. (2002), we apply a slowness measure to describe the results of learning:
formula
3.1

One hundred randomly selected natural image sequences, similar to but different from those used for training, each containing N = 50 slightly shifted presentations, are presented to the network, and the neuronal responses r of the V1-Complex layer are recorded. The difference in the response of every neuron i to the previous presentation is calculated and normalized by the mean response of the neuron over presentations of the same sequence (〈…〉N, where N denotes the length of the sequence). The mean of the squares of these normalized response differences is related to the variance of the normalized neuronal responses from the same sequence. Higher values denote slower changes in activity.
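The measure can be sketched as follows. Equation 3.1 is not reproduced in the text, so the exact form below, the variance of the mean-normalized responses over their mean squared temporal difference, is an assumption chosen to be consistent with the statement that higher values denote slower changes.

```python
import numpy as np

def slowness(responses):
    # Assumed reading of the slowness measure (section 3.4).
    # responses: array of shape (presentations, neurons)
    r = np.asarray(responses, dtype=float)
    mean = r.mean(axis=0)
    rn = r / np.where(mean > 0, mean, 1.0)       # normalize by the mean response
    diff2 = np.mean(np.diff(rn, axis=0) ** 2, axis=0)
    return rn.var(axis=0) / np.maximum(diff2, 1e-12)
```

A response that drifts slowly over the sequence scores higher than one that fluctuates rapidly with the same variance.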

3.5.  Spatial Response Images.

A further important property of complex cells in the primary visual cortex is that they respond to their preferred stimuli independent of their exact position. Position invariance generalizes phase invariance, which measures only the sensitivity to phase changes orthogonal to the preferred orientation. The spatial response images used here highlight the spatial regions at which a V1-Complex neuron shows a significant response to a Gabor stimulus. For neurons with simple-cell characteristics, narrower spatial regions are expected; neurons with complex-cell characteristics should yield larger, smoother spatial regions.

4.  Results

The goal of invariance learning is to learn a high degree of selectivity to features that change slowly in the input sequence while becoming only broadly tuned to features that rapidly change in the input. In our design, the model should learn a precise mapping in the orientation domain while establishing a divergent mapping in spatial position. This is nontrivial, since a single patch from natural scenes does not contain only a single preferred orientation but leads to responses of multiple cells with different orientation tuning.

Two model variations, one with a short calcium trace length (τCa,Complex = 10 ms) and one with a longer trace length (τCa,Complex = 500 ms), are compared to one another. The short trace length is chosen to be comparable to the time constant of the neural dynamics and is typically too short to learn temporal correlations between successive patches.

4.1.  Circular Response Images.

Both model variations (using four oriented filters) show high orientation selectivity, with strong inhibition for nonpreferred orientations (see Figures 3 and 4). The mean orientation bandwidths are 20.2 degrees (τCa,Complex = 10 ms) and 26.3 degrees (τCa,Complex = 500 ms), which are close to the orientation bandwidth of the Gabor functions used. The neurons of the two models differ mainly in their ability to respond phase invariantly to preferred orientations. The model with the longer trace length (see Figure 4) typically responds equally strongly to stimuli of different phase, whereas the model with the shorter trace length (see Figure 3) shows a more phase-sensitive activation pattern to the preferred orientation stimuli, which can be seen in the oscillations of the activity, namely, an excitatory response to the preferred phase and an inhibitory response to shifted phases (see Figure 5; see also the appendix for the feedforward weight matrices).

Figure 3:

Circular response images for every model neuron obtained with τCa,Complex = 10 ms. White denotes maximal excitation of a neuron to the corresponding part in the test image, and black denotes maximal inhibition. While all neurons show a high orientation selectivity and the neuron population as a whole represents all possible edge orientations, most neurons show a phase-sensitive activation pattern to the preferred orientation stimuli.


Figure 4:

Circular response images for every model neuron obtained with τCa,Complex = 500 ms and four oriented filters in the input. White denotes maximal excitation of a neuron to the corresponding part in the test image, and black denotes maximal inhibition. All neurons have learned a high orientation selectivity while at the same time having learned invariance to phase, as can be seen in the equally strong responses among phase variations in the test image.


Figure 5:

Circular response images for two example neurons of the simulations with τCa,Complex = 10 ms and τCa,Complex = 500 ms. These images visualize the differences in the phase sensitivity of neuronal responses to the test image. The left neuron shows a high oscillation in the activity to phase variations in the input. White denotes maximal excitation of a neuron from one population to the corresponding part in the test image, and black denotes maximal inhibition.


As a control experiment, a model (τCa,Complex = 500 ms; 64 cells) using eight orientations in the simple-cell input is evaluated (see Figure 6). This model variation still shows a high orientation selectivity (the mean orientation bandwidth is 25.1°). Nearly all cells respond to a single orientation in the input set of eight orientations, and the majority of the cells show phase-invariant behavior for their preferred stimuli.

Figure 6:

Circular response images for the 40 most excited or inhibited model neurons obtained with τCa,Complex = 500 ms and an input representing eight orientations. White denotes maximal excitation of a neuron to the corresponding part in the test image, and black denotes maximal inhibition. Most neurons have learned a high orientation selectivity while at the same time having learned invariance to phase, as can be seen in the equally strong responses across phase variations in the test image.

The results of the proposed models are compared to those of SFA (Berkes & Wiskott, 2005) using the simple-cell responses to eight orientations in the input (see Figure 7). SFA also leads to orientation selectivity, but frequently for conjunct orientations (the mean orientation bandwidth is 40.8°). However, all functions appear to be phase invariant. Following the unit classification in Berkes and Wiskott (2005), some of the units can be classified as orthogonal inhibited. Nonorthogonal inhibited units can also be found, depending on the mean of the test image. The slowest units are nonoriented, respond to differences in brightness, and likewise show the property of orthogonal inhibition.

Figure 7:

Static circular response images for the 48 slowest functions ascertained with the slow feature analysis (SFA) on simple-cell responses (8 orientations). The majority of basis functions are orientation selective, but frequently to conjunct orientations. However, the slowest functions are phase invariant.

The SFA algorithm has also been applied to simple-cell responses to four orientations as input, with less satisfying results: it leads to cells selective for multiple orientations. When testing SFA on the raw images, similarly to Berkes and Wiskott (2005), the slowest 48 basis functions become more narrowly tuned in orientation, but typically to multiple orientations. Generally, the responses show a rich repertoire of properties, including frequency inhibition. Berkes and Wiskott (2005) reported more cells tuned to single orientations, but they tested the model with a motion component in the test image, whereas we report our results using a test image with a movement speed of zero. However, when we tested with nonstatic test images using movement speeds of one to four pixels per frame, we obtained equivalent results.
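For reference, the core of SFA on a multivariate signal reduces to a generalized eigenvalue problem: find unit-variance output directions whose temporal derivative has minimal variance. The following minimal linear sketch is our own illustration (all names are ours); the actual comparison uses the quadratic expansion of Berkes and Wiskott (2005), which this sketch omits.

```python
import numpy as np

def linear_sfa(x, n_components=2):
    """Minimal linear slow feature analysis.

    x: time series of shape (T, D). Returns projection directions
    (columns, slowest first) and their eigenvalues, i.e., the variance
    of the temporal derivative of each unit-variance output signal.
    """
    x = x - x.mean(axis=0)                        # center the signal
    cov = x.T @ x / len(x)                        # signal covariance
    d, u = np.linalg.eigh(cov)
    sphere = u / np.sqrt(d)                       # whitening transform
    z = x @ sphere                                # whitened signal, unit covariance
    dz = np.diff(z, axis=0)                       # discrete temporal derivative
    lam, v = np.linalg.eigh(dz.T @ dz / len(dz))  # ascending: slowest first
    return sphere @ v[:, :n_components], lam[:n_components]

# Toy check: a slow and a fast sinusoid mixed into two channels.
t = np.linspace(0.0, 10 * np.pi, 2000)
slow, fast = np.sin(t), np.sin(37 * t)
x = np.stack([slow + 0.5 * fast, slow - 0.5 * fast], axis=1)
w, lam = linear_sfa(x, n_components=1)
y = x @ w[:, 0]                                   # slowest extracted feature
```

On this toy input the slowest feature recovers the slow source up to sign and scale, while the fast component is discarded.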

4.2.  Relative Modulation Index.

The relative modulation index for the model with the longer trace shows that 93% of the cells can be classified as complex cells, while the shorter trace leads to only 46% complex cells. Switching the input protocol to random patch presentation reveals the importance of a temporally correlated input for the longer trace: the amount of complex cells drops to 40% with the longer trace and remains at 44% with the shorter trace. Thus, the 46% of complex cells found in the model with the short calcium trace can be explained by fluctuations that occur even in random sequences, while the increase up to 93% of complex cells is due to the calcium trace learning. Reducing the sequence length in the input (10 consecutive patches) leads to 86% complex cells for the model with the longer trace, less than with longer sequences but substantially more than with random patch presentation or a short trace. Hence, the sequence length of 50 has only a marginal influence on the development of complex cells.
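The relative modulation index used for this classification is commonly computed as the F1/F0 ratio of the response to a drifting grating (Skottun et al., 1991). A minimal sketch of that criterion (our own illustration; the function name and the synthetic responses are ours, not the paper's code):

```python
import numpy as np

def relative_modulation(response, n_cycles):
    """F1/F0 ratio of a response sampled over an integer number of
    stimulus cycles. Ratios below 1 indicate complex-cell behavior,
    ratios above 1 simple-cell behavior (Skottun et al., 1991)."""
    response = np.asarray(response, dtype=float)
    f0 = response.mean()                            # mean (DC) response
    spectrum = np.fft.rfft(response) / len(response)
    f1 = 2.0 * abs(spectrum[n_cycles])              # amplitude at stimulus frequency
    return f1 / f0

# Synthetic examples: a half-rectified sinusoid (phase-sensitive,
# simple-cell-like) versus a constant response (phase-invariant).
t = np.linspace(0.0, 2 * np.pi * 4, 400, endpoint=False)
simple_like = np.maximum(np.sin(t), 0.0)
complex_like = np.full_like(t, 0.5)
```

For the half-rectified sinusoid the ratio is about π/2 ≈ 1.57 (simple), while the constant response yields 0 (complex).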

4.3.  Slowness.

The average slowness of all the neural responses in the trained model with the longer trace is −1.10, with an accumulation at values around −1. The model with the shorter trace length shows a lower mean slowness of −1.52, with a peak around −2 (see Figure 8). Einhäuser et al. (2002) found slowness values around −1.5 for their simple layer and around −1.25 for their complex layer; the model with the longer trace thus shows slowness values comparable to those of the complex-cell layer of their model, whereas the values obtained with the shorter trace resemble those of their simple-cell layer. A short sequence length (10 consecutive patches) leads to slowness values of −1.06 for the model with a longer trace. Thus, the sequence length has no negative influence on the development of slowly responding cells on rapidly varying input.
Figure 8:

Histogram of the slowness values obtained with τCa,Complex = 10 ms and τCa,Complex = 500 ms. The average slowness of the simulation with the shorter trace is −1.59, with a peak around −2, whereas the simulation with the longer trace has an average slowness of −1.18, with an accumulation at values around −1.
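For orientation, a common way to quantify slowness of this kind is the log10 ratio between the variance of a response's temporal derivative and the variance of the response itself, which is more negative for slower signals. We assume this form here for illustration; the exact normalization used by Einhäuser et al. (2002) may differ, so the sketch reproduces the sign and ordering of the values, not their absolute scale.

```python
import numpy as np

def slowness(y):
    """log10 of var(temporal derivative) / var(response).
    More negative values indicate slower responses."""
    y = np.asarray(y, dtype=float)
    return float(np.log10(np.var(np.diff(y)) / np.var(y)))

# A smoothly drifting response versus frame-to-frame noise:
rng = np.random.default_rng(0)
t = np.arange(0.0, 20.0, 0.05)
slow_response = np.sin(t)                      # varies smoothly across frames
noisy_response = rng.standard_normal(len(t))   # uncorrelated across frames
```

The smooth response yields a strongly negative value, while the uncorrelated response yields a positive one, matching the ordering of complex- and simple-cell-like responses reported above.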

4.4.  Spatial Response Images.

The spatial response profile shows clear differences between neurons from the short trace model and neurons from the long trace model. Those of the latter model have more broadly tuned response regions than the neurons from the short trace model. Figure 9 shows exemplary spatial response images for two neurons per model. Each subimage of a neuron shows the spatial responses to phase variations of the preferred stimulus. The complex- and simple-cell characteristics can be seen in the broadness of the response. The model with the longer trace shows much higher invariance to phase shifts of the preferred stimuli than its counterpart.

Figure 9:

Spatial response images of four example cells. The first two cells are from the short trace model; the second two are from the longer trace model. Each panel shows the response to the preferred orientation at each spatial position. The different panels vary in the phase of the preferred stimulus. The spatial response images obtained with the model with the longer trace typically show broader response regions and much higher invariance to phase shifts of the preferred stimuli.

5.  Conclusion

Invariance is a general property of processing in the visual cortex and appears to be of fundamental importance for object recognition. It has been shown that cells in IT respond invariantly to a variety of stimulus transformations (Logothetis & Sheinberg, 1996; Tanaka, 1996). Such properties are not rigidly encoded; they are a product of learning and adaptation in the visual system (Cox, Meier, Oertelt, & DiCarlo, 2005; Li & DiCarlo, 2010).

Here we demonstrated that invariance can be learned from natural images through a biologically plausible learning algorithm at the level of the primary visual cortex, using fixational eye movements to generate input sequences. In our model, V1 complex cells learn responses that are largely invariant to position and phase while at the same time being selective for orientation. The model with a slowly varying calcium trace develops strongly orientation-selective cells that are predominantly invariant to phase variations of the stimuli and whose responses vary slowly with changes in the environment. Their spatial response regions are broader than those of simple cells. The model with a short trace shows no more invariance than a model using a random sequence, indicating that a sufficiently long calcium trace could be a crucial neural correlate for invariance learning. However, even in these simulations, around 40% of the cells can be classified as complex cells. This substantial baseline is consistent with previous reports that residual dependencies in the input can be enough for learning invariance (Hyvärinen & Hoyer, 2000, 2001). Our results show that the number of invariant cells increases substantially when temporal correlations are exploited in learning.

The learning of position and phase invariance while retaining orientation selectivity is not trivial. Previous models have often used strongly simplified inputs in the form of artificial bar-like patterns, or only single categories such as faces, to avoid too many fluctuations in the input. Other recent approaches artificially restricted learning to the most activated neuron (Einhäuser et al., 2002; Spratling, 2005). Although SFA (Berkes & Wiskott, 2005) produces a large variety of responses on our data, it leads to weaker orientation tuning than our model.

The invariant, temporally more stable, and highly feature-selective representation of the learned complex cells should facilitate processing in further cortical stages without the loss of important identity information. The loss of the exact retinal position should not be a problem for the visual system, because real-world objects consist of many basic structural elements. Consequently, the proposed model and the learning algorithm are a good basis for the development of a more comprehensive model of the visual system; future work has to demonstrate invariance learning on even more complex inputs and finally at the level of objects.

Appendix:  Feedforward Weight Matrices

As supplementary material, the excitatory feedforward connections of six cells obtained from two simulations, using τCa = 500 ms (see Figure 10) and τCa = 10 ms (see Figure 11) with four orientations in the input, are presented. The ellipses show the orientation and position of the related subunits (Hoyer & Hyvärinen, 2002). The gray value represents the connection strength, and black denotes the maximum weight value of all cells in the network. For display reasons, two consecutive ellipses are represented by a single ellipse whose value is the mean of both. The connection patterns obtained using τCa = 500 ms show that each cell is highly orientation selective while being evenly connected to all phases over broad regions of visual space. In contrast, the connection patterns obtained using τCa = 10 ms lack these even, broad connections, resulting in the reported oscillating responses to slight stimulus transformations.

Figure 10:

Visualization of the feedforward matrices for six cells, obtained using τCa = 500 ms and four orientations in the input.

Figure 11:

Visualization of the feedforward matrices for six cells, obtained using τCa = 10 ms and four orientations in the input.

Acknowledgments

This work has been supported by the German Research Foundation (DFG HA2630/6-1). We thank Pietro Berkes and Laurenz Wiskott for providing the code to generate the circular response test image and for helpful comments on the application of slow feature analysis to our data set.

References

Abraham, W. C., & Bear, M. F. (1996). Metaplasticity: The plasticity of synaptic plasticity. Trends Neurosci., 19(4), 126–130.
Adelson, E. H., & Bergen, J. R. (1985). Spatiotemporal energy models for the perception of motion. J. Opt. Soc. Am. A, 2(2), 284–299.
Bell, A. J., & Sejnowski, T. J. (1997). The "independent components" of natural scenes are edge filters. Vision Res., 37(23), 3327–3338.
Berkes, P., Turner, R. E., & Sahani, M. (2009). A structured model of video reproduces primary visual cortical organisation. PLoS Comput. Biol., 5(9), e1000495.
Berkes, P., & Wiskott, L. (2005). Slow feature analysis yields a rich repertoire of complex cell properties. J. Vis., 5(6), 579–602.
Bienenstock, E. L., Cooper, L. N., & Munro, P. W. (1982). Theory for the development of neuron selectivity: Orientation specificity and binocular interaction in visual cortex. J. Neurosci., 2(1), 32–48.
Carandini, M., Demb, J. B., Mante, V., Tolhurst, D. J., Dan, Y., Olshausen, B. A., et al. (2005). Do we know what the early visual system does? J. Neurosci., 25(46), 10577–10597.
Castellani, G. C., Quinlan, E. M., Bersani, F., Cooper, L. N., & Shouval, H. Z. (2005). A model of bidirectional synaptic plasticity: From signaling network to channel conductance. Learn. Mem., 12(4), 423–432.
Cho, K., Aggleton, J. P., Brown, M. W., & Bashir, Z. I. (2001). An experimental test of the role of postsynaptic calcium levels in determining synaptic strength using perirhinal cortex of rat. J. Physiol., 532, 459–466.
Cormier, R. J., Greenwood, A. C., & Connor, J. A. (2001). Bidirectional synaptic plasticity correlated with the magnitude of dendritic calcium transients above a threshold. J. Neurophysiol., 85(1), 399–406.
Cox, D. D., Meier, P., Oertelt, N., & DiCarlo, J. J. (2005). "Breaking" position-invariant object recognition. Nat. Neurosci., 8(9), 1145–1147.
Cummings, J. A., Mulkey, R. M., Nicoll, R. A., & Malenka, R. C. (1996). Ca2+ signaling requirements for long-term depression in the hippocampus. Neuron, 16, 825–833.
Daw, N. W., Stein, P. S., & Fox, K. (1993). The role of NMDA receptors in information processing. Annu. Rev. Neurosci., 16, 207–222.
De Valois, R. L., Albrecht, D. G., & Thorell, L. G. (1982). Spatial frequency selectivity of cells in macaque visual cortex. Vision Res., 22(5), 545–559.
Desai, N. S., Cudmore, R. H., Nelson, S. B., & Turrigiano, G. G. (2002). Critical periods for experience-dependent synaptic scaling in visual cortex. Nat. Neurosci., 5(8), 783–789.
Dodge, R. (1907). An experimental study of visual fixation. Psychological Review Monograph (suppl.).
Einhäuser, W., Kayser, C., König, P., & Körding, K. P. (2002). Learning the invariance properties of complex cells from their responses to natural stimuli. Eur. J. Neurosci., 15(3), 475–486.
Falconbridge, M. S., Stamps, R. L., & Badcock, D. R. (2006). A simple Hebbian/anti-Hebbian network learns the sparse, independent components of natural images. Neural Comput., 18(2), 415–429.
Földiák, P. (1990). Forming sparse representations by local anti-Hebbian learning. Biol. Cybern., 237(5349), 55–56.
Földiák, P. (1991). Learning invariance from transformation sequences. Neural Comput., 3(2), 194–200.
Goold, C. P., & Nicoll, R. A. (2010). Single-cell optogenetic excitation drives homeostatic synaptic depression. Neuron, 68(3), 512–528.
Hamker, F. H., & Wiltschut, J. (2007). Hebbian learning in a model with dynamic rate-coded neurons: An alternative to the generative model approach for learning receptive fields from natural scenes. Network, 18(3), 249–266.
Hashimoto, W. (2003). Quadratic forms in natural images. Network, 14(4), 765–788.
Hoyer, P. O., & Hyvärinen, A. (2000). Independent component analysis applied to feature extraction from colour and stereo images. Network, 11(3), 191–210.
Hoyer, P. O., & Hyvärinen, A. (2002). A multi-layer sparse coding network learns contour coding from natural images. Vision Res., 42(12), 1593–1605.
Hu, H., Shao, L. R., Chavoshy, S., Gu, N., Trieb, M., Behrens, R., et al. (2001). Presynaptic Ca2+-activated K+ channels in glutamatergic hippocampal terminals and their role in spike repolarization and regulation of transmitter release. J. Neurosci., 21(24), 9585–9597.
Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J. Physiol., 160(1), 106–154.
Hyvärinen, A. (2005). Estimation of non-normalized statistical models by score matching. Journal of Machine Learning Research, 6(1), 695–709.
Hyvärinen, A., & Hoyer, P. O. (2000). Emergence of phase- and shift-invariant features by decomposition of natural images into independent feature subspaces. Neural Comput., 12(7), 1705–1720.
Hyvärinen, A., & Hoyer, P. O. (2001). A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images. Vision Res., 41(18), 2413–2423.
Johnson, E. N., Hawken, M. J., & Shapley, R. M. (2008). The orientation selectivity of color-responsive neurons in macaque V1. J. Neurosci., 28(32), 8096–8106.
Jones, J. P., & Palmer, L. A. (1987). An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. J. Neurophysiol., 58(6), 1233–1258.
Karklin, Y., & Lewicki, M. S. (2009). Emergence of complex cell properties by learning to generalize in natural scenes. Nature, 457(7225), 83–86.
Kayser, C., Einhäuser, W., Dümmer, O., König, P., & Körding, K. P. (2001). Extracting slow subspaces from natural videos leads to complex cells. In G. Dorffner & H. Bischoff (Eds.), Artificial neural networks (pp. 1075–1080). New York: Springer.
Kohonen, T. (1996). Emergence of invariant-feature detectors in the adaptive-subspace self-organizing map. Biol. Cybern., 75(4), 281–291.
Körding, K. P., Kayser, C., Einhäuser, W., & König, P. (2004). How are complex cell properties adapted to the statistics of natural stimuli? J. Neurophysiol., 91(1), 206–212.
Köster, U., & Hyvärinen, A. (2010). A two-layer model of natural stimuli estimated with score matching. Neural Comput., 22(9), 2308–2333.
Li, N., & DiCarlo, J. J. (2010). Unsupervised natural visual experience rapidly reshapes size-invariant object representation in inferior temporal cortex. Neuron, 67(6), 1062–1075.
Lisman, J. (1989). A mechanism for the Hebb and the anti-Hebb processes underlying learning and memory. Proc. Natl. Acad. Sci. U.S.A., 86(23), 9574–9578.
Lledo, P., Hjelmstad, G., Mukherji, S., Soderling, T., Malenka, R. C., & Nicoll, R. A. (1995). Calcium/calmodulin-dependent kinase II and long-term potentiation enhance synaptic transmission by the same mechanism. Proc. Natl. Acad. Sci. U.S.A., 92(24), 11175–11179.
Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annu. Rev. Neurosci., 19, 577–621.
MacLean, J. N., Zhang, Y., Johnson, B. R., & Harris-Warrick, R. M. (2003). Activity-independent homeostasis in rhythmically active neurons. Neuron, 37(1), 109–120.
Malinow, R., Schulman, H., & Tsien, R. W. (1989). Inhibition of postsynaptic PKC or CaMKII blocks induction but not expression of LTP. Science, 245(4920), 862–866.
Marder, E., & Prinz, A. A. (2002). Modeling stability in neuron and network function: The role of activity in homeostasis. BioEssays, 24(12), 1145–1154.
Martinez-Conde, S., Macknik, S. L., & Hubel, D. H. (2004). The role of fixational eye movements in visual perception. Nat. Rev. Neurosci., 5(3), 229–240.
Oja, E. (1982). Simplified neuron model as a principal component analyzer. J. Math. Biol., 15(3), 267–273.
Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583), 607–609.
Osindero, S., Welling, M., & Hinton, G. E. (2006). Topographic product models applied to natural scene statistics. Neural Comput., 18(2), 381–414.
Pettit, D., Perlman, S., & Malinow, R. (1994). Potentiated transmission and prevention of further LTP by increased CaMKII activity in postsynaptic hippocampal slice neurons. Science, 266(5192), 1881–1885.
Rehn, M., & Sommer, F. T. (2007). A network that uses few active neurones to code visual input predicts the diverse shapes of cortical receptive fields. J. Comput. Neurosci., 22(2), 135–146.
Rolfs, M. (2009). Microsaccades: Small steps on a long way. Vision Res., 49(20), 2415–2441.
Sejnowski, T. J. (1977). Storing covariance with nonlinearly interacting neurons. J. Math. Biol., 4(4), 303–321.
Shouval, H. Z., Bear, M. F., & Cooper, L. N. (2002). A unified model of NMDA receptor-dependent bidirectional synaptic plasticity. Proc. Natl. Acad. Sci. U.S.A., 99(16), 10831–10836.
Shouval, H. Z., Castellani, G. C., Blais, B. S., Yeung, L. C., & Cooper, L. N. (2002). Converging evidence for a simplified biophysical model of synaptic plasticity. Biol. Cybern., 87(5–6), 383–391.
Skottun, B., De Valois, R. L., Grosof, D., Movshon, J., Albrecht, D. G., & Bonds, A. (1991). Classifying simple and complex cells on the basis of response modulation. Vision Res., 31(7–8), 1078–1086.
Spratling, M. W. (2005). Learning viewpoint invariant perceptual representations from cluttered images. IEEE Trans. Pattern Anal. Mach. Intell., 27(5), 753–761.
Stringer, S. M., Perry, G., Rolls, E. T., & Proske, J. H. (2006). Learning invariant object recognition in the visual system with continuous transformations. Biol. Cybern., 94(2), 128–142.
Stringer, S. M., & Rolls, E. T. (2000). Position invariant recognition in the visual system with cluttered environments. Neural Netw., 13(3), 305–315.
Tanaka, K. (1996). Inferotemporal cortex and object vision. Annu. Rev. Neurosci., 19, 109–139.
Turrigiano, G. G., Leslie, K. R., Desai, N. S., Rutherford, L. C., & Nelson, S. B. (1998). Activity-dependent scaling of quantal amplitude in neocortical neurons. Nature, 391(6670), 892–896.
Turrigiano, G. G., & Nelson, S. B. (2004). Homeostatic plasticity in the developing nervous system. Nat. Rev. Neurosci., 5(2), 97–107.
van Hateren, J. H., & van der Schaaf, A. (1998). Independent component filters of natural images compared with simple cells in primary visual cortex. Proc. Biol. Sci., 265(1394), 359–366.
Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Prog. Neurobiol., 30(3), 304–309.
Weber, C., & Triesch, J. (2008). A sparse generative model of V1 simple cells with intrinsic plasticity. Neural Comput., 20(5), 1261–1284.
Wiltschut, J., & Hamker, F. H. (2009). Efficient coding correlates with spatial frequency tuning in a model of V1 receptive field organization. Vis. Neurosci., 26(1), 21–34.
Wiskott, L., & Sejnowski, T. J. (2002). Slow feature analysis: Unsupervised learning of invariances. Neural Comput., 14(4), 715–770.
Yang, S.-N., Tang, Y.-G., & Zucker, R. S. (1999). Selective induction of LTP and LTD by postsynaptic [Ca2+] elevation. J. Neurophysiol., 81, 781–787.
Yeung, L. C., Shouval, H. Z., Blais, B. S., & Cooper, L. N. (2004). Synaptic homeostasis and input selectivity follow from a calcium-dependent plasticity model. Proc. Natl. Acad. Sci. U.S.A., 101(41), 14943–14948.
Zuber, B., Crider, A., & Stark, L. (1964). Saccadic suppression associated with microsaccades. Quart. Progr. Rept., 74, 244–249.