## Abstract

The moving bar experiment is a classic paradigm for characterizing the receptive field (RF) properties of neurons in primary visual cortex (V1). Current approaches for analyzing neural spiking activity recorded from these experiments do not take into account the point-process nature of these data and the circular geometry of the stimulus presentation. We present a novel analysis approach to mapping V1 receptive fields that combines point-process generalized linear models (PPGLM) with tomographic reconstruction computed by filtered-back projection. We use the method to map the RF sizes and orientations of 251 V1 neurons recorded from two macaque monkeys during a moving bar experiment. Our cross-validated goodness-of-fit analyses show that the PPGLM provides a more accurate characterization of spike train data than analyses based on rate functions computed by the methods of spike-triggered averages or first-order Wiener-Volterra kernel. Our analysis leads to a new definition of RF size as the spatial area over which the spiking activity is significantly greater than baseline activity. Our approach yields larger RF sizes and sharper orientation tuning estimates. The tomographic reconstruction paradigm further suggests an efficient approach to choosing the number of directions and the number of trials per direction in designing moving bar experiments. Our results demonstrate that standard tomographic principles for image reconstruction can be adapted to characterize V1 RFs and that two fundamental properties, size and orientation, may be substantially different from what is currently reported.

## 1. Introduction

The receptive field (RF) of a neuron defines how its spiking activity changes in response to a stimulus. Developing accurate quantitative characterization of the RFs of neurons in primary visual cortex (V1) is critical for understanding how the early visual system represents and transmits information (De Valois, Albrecht, & Thorell, 1982; Jones & Palmer, 1987; de Ruyter van Steveninck & Bialek, 1988; Leventhal, Thompson, Liu, Zhou, & Ault, 1995; Stanley, 2002; Ringach, 2004; Nishimoto, Arai, & Ohzawa, 2005; Chen, Han, Poo, & Dan, 2007). The moving bar experiment is a classic paradigm pioneered by Hubel and Wiesel (1959) for measuring the RFs of V1 neurons. It has been well known that V1 simple cells respond precisely to the location and contrast polarity of features in the visual scene. In the typical moving bar experiment, electrodes are implanted in the animal's V1 area, and a bar is passed in front of the animal's eye (i.e., the visual field) at constant speed and a fixed orientation. The bar is passed systematically at multiple orientations for several trials, and the neurons’ spiking activity is recorded across all orientations and all trials. Simple V1 neurons can be well described by a set of linear filters (RFs), which can be analyzed using the principle of reverse correlation (Ringach & Shapley, 2004) or Bayesian inference (Park & Pillow, 2011). The standard approaches to estimating the RF rely on constructing a spatial histogram of the spiking activity or using methods of spike-triggered average (STA) or Wiener-Volterra kernel (WVK) with linear or second-order kernels (Gabbiani, 1996; Rieke, Warland, de Ruyter van Steveninck, & Bialek, 1997; Paninski, 2003). The STA provides an unbiased estimate of the linear filter for uncorrelated (or spherically symmetric) stimuli, and WVK provides a minimum-variance estimate in the least-squares sense. The moving bar experiment remains a practical paradigm for mapping visual RFs in awake moneys, although other alternative visual stimuli have been employed, such as the white noise, grating, and m-sequences (Reid, Victor, & Shapley, 1997).

An examination of the literature found that two important estimation issues have not
been considered in standard analyses of V1 moving bar experiments. First, none of
these methods uses the fact that neural spiking activity can be accurately
represented by point processes (Brown, Kass, & Mitra, 2004; Kass, Ventura, & Brown, 2005; Truccolo, Eden, Fellow, Donoghue, & Brown, 2005; Paninski, 2004). Second, none of these estimation procedures has
considered the highly structured geometry of the moving bar experiments (Fiorani,
Azzi, & Gattass, 2001). Passing the
moving bar at multiple orientations to construct the visual neuron's RF is analogous
to tomographic image reconstruction in which energy is passed through an object from
different directions. The transmitted energy is captured by arrays of detectors, and
a reconstruction algorithm is used to build a tomogram or image of the object's
internal structure. For computer tomography, positron emissions tomography, and
single photon emission computer tomography, the incident energies are, respectively,
X-rays, gamma rays, and electron-positron annihilation (Kak & Slaney, 1988; Deans, 1993; Natterer, 2001). This
analogy suggests that tomographic principles for image reconstruction could be used
to estimate visual RFs in moving bar experiments. The idea of using the tomographic
reconstruction principle to map visual RFs is not new (Sun & Bonds, 1994). However, the spiking activity of V1
neurons was not carefully modeled in that the non-Poissonian firing effect was
completely ignored. Another important issue is characterizing the RF size of V1
simple cells. The V1 RF size was traditionally determined based on the
minimum-response-field technique (Movshon, Thomson, & Tolhurst, 1978; Pettet & Gilbert, 1992; Crist, Li, & Gilbert, 2001). The minimum-response-field technique assumes the
RF profile has a gaussian or difference-of-gaussian shape, and it characterizes the
RF extent using a threshold based on the *e*^{−1} criterion (DeAngelis, Anzai, Ohzawa, & Freeman, 1995) or using the standard deviation to be the
size of RF (Nienborg, Bridge, Parker, & Cumming, 2004). However, we argue that this technique lacks statistical
justification since, it does not take the uncertainty of the neuronal responses into
account; moreover, the gaussian assumption is a poor approximation, and the
selection of the threshold is rather ad hoc.

In this article, we combine point-process generalized linear model (PPGLM) methods (Truccolo et al., 2005; McCullagh & Nelder, 1989) with a filtered backprojection algorithm (Kak & Slaney, 1988) to derive a novel analysis approach for mapping the RFs of V1 neurons recorded in a moving bar experiment. Based on our new analysis approach, we propose a statistical criterion (based on the notion of confidence intervals) to define the RF size. We use the proposed approach to characterize the RF size and orientation tuning of 251 V1 neurons. Our results show that the orientation preferences of V1 neurons are sharper and the sizes of their RFs are larger than the estimates from conventional methods. Our analysis further suggests an approach to optimal design of moving bar experiments. We emphasize the computational nature and principle of our proposed approach; therefore, an exhaustive comparison with every estimation method in the literature is beyond our scope here. In addition, our proposed approach is specifically aimed at the moving bar experiments; we make no claim about the superiority of moving bar stimuli to other state-of-the-art experimental stimuli paradigms for mapping V1 RFs.

## 2. Methods

Our approach consists of three important steps: we use a PPGLM to characterize the neuronal responses at each moving bar direction, use the principle of tomography reconstruction to map the visual RF, and characterize the RF size and orientation tuning for every neuron. The first two steps are illustrated in Figure 1.

### 2.1. Visual Receptive Field Characterization via Radon Transform.

Passing the moving bar at multiple orientations to construct the visual RF is analogous to tomographic image reconstruction in which energy is passed through an object from multiple directions. The neuronal responses to a visual stimulus at different locations of the RF are assumed to follow a linear superposition principle, such that the neural response along a moving bar angle θ is treated as a line integration of the two-dimensional (2D) RF orthogonal to the moving direction.

*f*(

*x*,

*y*) along the specific projection angle θ (Kak & Slaney, 1988; Deans, 1993; Natterer, 2001). The V1 RF estimate conditional on the angle θ, characterized as a function of both spatial position (

*x*,

*y*) and orientation α, is denoted as

*h*(

*x*,

*y*, α|θ). Assuming a factorial property (Rodieck, 1965), the V1 RF is expressed as a product of spatial tuning and orientation tuning:

*h*(

*x*,

*y*, α|θ) =

*f*(

*x*,

*y*|θ)

*g*(α|θ),

^{1}where , 0 ⩽ α ⩽ π. It then follows that That is, the one-dimensional (1D) projection of neural response can be fully characterized by a product of an orientation tuning function and a spatial tuning function jointly defined by the Radon transform. For notation simplicity, the conditional term of the angle θ is made implicit in the functions

*f*and

*h*from now on.

*f*(

*x*,

*y*), the inverse Radon transform is applied (Kak & Slaney, 1988), where and denote the 1D and 2D Fourier transforms, respectively. The second equality follows from the relationship between the Radon transform and Fourier transform. (See appendix A for technical details on continuous Radon transform.) In theory, the projection-slice theorem states that a perfect reconstruction is possible given an infinite number of 1D projections taken at an infinite number of angles (Kak & Slaney, 1988). In practice, the inverse Radon transform is realized by filtered backprojection based on a finite set of projection angles {θ

_{j}}, which is implemented numerically in a finite discrete grid.

*f*(

*x*,

*y*) is assumed to have a band-limited spectrum with a bandwidth of

*b*Hz. Reliable reconstruction of

*f*(

*x*,

*y*) based on its line projections requires a minimum sampling frequency

*q*and a minimum number of projections

*J*such that the following two sampling criteria are fulfilled (Natterer, 2001): For

*J*= 16 projections evenly spaced on 2π, the bandwidth is upper-bounded:

*b*⩽ 8 Hz. Arbitrarily small bin size could be chosen to ensure a sufficiently high temporal sampling frequency

*q*.

### 2.2. Point-Process Generalized Linear Model for Estimating the Radon Transform.

In RF mapping, the Radon transform at each moving bar projection direction is
unknown and requires estimation based on spike train observations recorded from
multiple trials. To do so, single-unit spike trains are treated as point-process
observations. Within a time interval (0, *T*], the spike trains
are evenly binned with a bin size of Δ = *K*^{−1}*T* (note that *q* = 1/Δ). The
number *K* is chosen sufficiently large such that there is at
most one spike in any bin, and the point process is modeled by a conditional
intensity function (CIF) , for *k* = 1, …, *K* (Brown, Barbieri, Eden, & Frank, 2003). With a small bin size Δ = 1 ms
(i.e., *q* = 1000 Hz), approximates the probability of a spiking event within the
interval ((*k* − 1)Δ, *k*Δ]
given the observed spike history up to time *t _{k}*.

*j*th moving bar direction θ

_{j}, we model the log CIF of the spiking activity at time

*t*, denoted by , as the sum of stimulus-evoked and spike history–dependent components (Truccolo et al., 2005), where the first term of the right-hand side of equation 2.4 represents the stimulus-evoked component that is related to the RF characteristic, and the second term represents the spike history–dependent component that is related to the neuron's intrinsic biophysical property. The stimulus-evoked component is modeled using

_{k}*M*cubic B-spline functions , and the spike history–dependent component is modeled using spike count statistics within preceding

*I*history windows. Equation 2.4 essentially defines a GLM with a log link function.

The parameters of PPGLM are estimated using an iterative reweighted least-squares
(IRWLS) algorithm ( function, Matlab) based on maximum likelihood estimation
(Pawitan, 2001; Brown et al., 2003; Paninski, 2004). The order parameters *M* and *I* are chosen according to the Akaike's information
criterion (AIC) and assessed by a Kolmogorov-Smirnov (KS) test in light of the
time-rescaling theorem (Brown, Barbieri, Ventura, Kass, & Frank, 2002; Haslinger, Pipa, & Brown, 2010). Upon completing the maximum
likelihood fitting, taking the exponent of both sides of equation 2.4 and evaluating it with the
maximum likelihood estimate yields the CIF estimate from the PPGLM (see Figure 1d). In addition, confidence intervals
(CIs) for the parameters {ξ^{(B)}_{m,j}} and {ξ^{(n)}_{l,j}}, as well as the CI for at each direction, are analytically derived (Brown et al., 2003).

*f*(

*x*,

*y*) orthogonal to the moving direction, which is characterized by a Radon transform . At each moving bar direction, the estimated Radon transform is determined only by the stimulus-evoked neural response,

### 2.3. Modeling Stimulus-Evoked and Spike History–Dependent Com-ponents.

*d*= 3 (i.e., polynomials of degree 3), which is continuous and twice differentiable. The values of control points affect only the shape of equation 2.5 locally in time due to the piecewise definition of given a sequence of control points {0 ⩽

*t*

_{0}⩽

*t*

_{1}⩽ ⋅ ⋅ ⋅ ⩽

*t*

_{r−1}⩽

*t*⩽

_{r}*T*} (de Boor, 1978): The control points are placed nonevenly on the 2 s time interval with an adaptive placement strategy, which automatically allocates the control points based on the cumulated peristimlus time histogram (PSTH) of the spike activity (see Figure 2 for illustration).

To model the spike history–dependent component of equation 2.4 at each moving bar direction,
we use the spike count within the previous temporal windows to account for the
spiking history prior to the current spike's occurrence. The *n*_{k−l,j} in equation 2.4 is the
observed spike count in the past *l*th temporal window at the *j*th moving bar direction, and
ξ^{(n)}_{l,j} is the associated weight coefficient. A positive (negative) value of
ξ^{(n)}_{l,j} implies an excitatory (inhibitory) effect. We let and refer, respectively, to the PPGLM with and without the history
term, respectively. That is, is a special case of with all ξ^{(n)}_{l,j} = 0. In the absence of spike history dependence, reduces to the standard rate function of an inhomogeneous
Poisson process. Therefore, is more general since it allows for accommodating
non-Possionian neuronal spiking by considering spike history dependence. To fit
the to the spiking activity for each of the 16 orientations for
each of 251 V1 neurons, we use the following five temporal windows:
[1–3], [4–6], [7–17], [18–23], [24–35] ms
(see Figure 3a for illustration).

### 2.4. Receptive Field Reconstruction via Filtered Backprojection.

^{2}Specifically, the continuous integration is approximated by a finite sum operation, where represents a discrete 1D Fourier transform of the filtered version of the estimated Radon transform (Kak & Slaney, 1988), where , and the frequency response |

*w*| in equation 2.10 defines a so-called Ramp filter in the frequency domain (note that |

*w*| is not a square integrable function, so its inverse Fourier transform does not exist in a conventional sense). Therefore, the reconstruction is viewed as a filtered backprojection process: a filtering operation (see equations 2.9 and 2.10) combined with a backprojection (see equation 2.8).

To make the filtered backprojection operation more robust with respect to noise, we use a modified Ramp filter (multiplied by a Hamming window). We specifically construct a symmetric, finite-length discrete filter (with a total length 2048, where 2048 is the next highest power of 2 for 2000) within the Nyquist bandwidth (and therefore cut off the frequency response outside the Nyquist range by setting the response to 0). In addition, we can implement the Ramp filter in the time domain instead of the frequency domain. In so doing, we zero-pad the filter as well as the Radon transform signal, which helps avoid aliasing. We thus replace the circular convolution with a nonperiodic convolution.

Finally, for each discrete point in the 1414 × 1414 grid, we compute (1 ⩽ *m*, *n* ⩽
1414), and we use a cubic spline to interpolate in the discrete grid from the filtered projection (see equation 2.10).

Note that there is a typical time delay between the moving bar stimulus and V1
responses in the order of ∼60 ms. For each direction, we used a
fixed latency of 60 ms to shift the neuronal response in time to reflect
the delay (otherwise the peak of the estimated RFs would be misidentified). The
rationale for estimating the latency and choosing 60 ms is as follows.
Given that the moving bar directions are spanned from 0 to 2π (the Radon
transform theory requires sampling only from 0 to π by assuming radial
symmetry), we can use bars moving in from reverse directions (e.g., 0 and
π, π/2 and 3π/2) to estimate the actual latency. If the
latency parameter is correct, two estimated CIFs from opposite directions would
be aligned in the peak response; in contrast, if the latency parameter is either
too large or too small, the peak responses of two estimated CIFs would deviate.
In fact, we have tested other alternative latency values of 50, 55, 65, and
70 ms and found that by using these alternative values, the reconstructed
RF is much more elongated, indicating that the latency is either too long or too
shorts.^{3}

### 2.5. Estimating Receptive Field Size and Orientation Tuning.

The standard minimum-response-field technique (Movshon et al., 1978; Pettet & Gilbert, 1992) to define the RF extent is based on a gaussian
shape assumption of the RF profile. However, the gaussian assumption is not
fully justified, and this ad hoc technique lacks statistical justification.
Because of the asymmetry of the V1 RF, we compute the RF size from the projected
temporal profile at each direction, using two criteria defined below. With the
standard *e*^{−1} criterion (criterion 1; DeAngelis et al., 1995), the RF size is defined by the extent (in time
and then converted to degree) where the neuronal response exceeds a threshold
(see Figure 4); and the threshold value is
defined as *e*^{−1} of the dynamic range above the baseline firing rate, , where *e* is the Euler's number
(*e*^{−1} ≈ 0.37); λ_{j,0} and define the baseline and the estimated peak firing rates at the *j*th direction, respectively; and their difference () defines the dynamic range. The *e*^{−1} criterion is based on the assumption that the RF is of a
gaussian or difference-of-gaussian shape. With criterion 2, the RF size is
defined by the extent to which the lower 95% CI of the CIF estimate is greater
than the baseline (see Figure 4). The user
chooses the level of confidence, which is determined by the need for
conservativeness based on the available data. Even with the identical confidence
level, a small sample size would often induce a large confidence bound (i.e.,
uncertainty) to the estimate.

*j*th direction, assuming that the maximum likelihood estimate follows a multivariate gaussian distribution (Pawitan, 2001), then by virtue of equation 2.4, the estimated log-CIF profile follows a univariate gaussian distribution, , and the estimated CIF profile follows a log-normal distribution with mean and variance statistics as The 95% (or 99%) CI roughly corresponds to the twofold (or threefold) standard deviation. In comparison to criterion 1, criterion 2 is statistically meaningful and makes no assumption of the RF shape.

*g*(α|θ) is estimated separately from the spatial tuning

*f*(

*x*,

*y*) at each direction θ

_{j}. The strength of

*g*(α|θ) in the specific angle θ

_{j}is identified by the estimated peak firing rate (maximum response during the complete 2 s time interval) and stored in the

*j*th entry of a vector. Upon collecting the values of estimated from 16 directions, we compute the circular variance (CV) (Mardia, 1972; Ringach, Shapley, & Hawken, 2002), and characterize the orientation tuning with an index: ϵ = 1 −

*CV*(where 0 ⩽ ϵ ⩽ 1). A large value of

*CV*or a low value of ϵ indicates a low sensitivity to orientation (see Figure 5c). Since the periodicity of orientation is 180°, but the moving bar direction is cyclic over 360°, the actual directions of the bar are used to calculate the directional preferences of each neuron. In all graphic visualizations presented later (see Figures 5b and 7), instead of visualizing

*h*(

*x*,

*y*, α) =

*f*(

*x*,

*y*)

*g*(α), we plot a rescaled version of

*f*(

*x*,

*y*) by assuming an isotropic orientation in all directions.

## 3. Results

### 3.1. Data.

The spiking activity from 251 V1 neurons was recorded in two awake macaque
monkeys during the moving bar experiment (see Figure 1c for the raster plot of one V1 neuron), in which a
moving light bar (luminance: 50 cd·m^{−2}, bar width:
0.2°) stimulus was used in 16 different orientations separated evenly by
22.5° in nine trials per direction presented in random order. Each trial
lasted 2 seconds. Details of experimental setup and recordings are described in
appendix C.

### 3.2. Encoding Analysis and Receptive Field Mapping.

In the encoding analysis, we estimate the 1D Radon transform of each of the 16 different orientations as the CIF of a PPGLM fit to the spiking activity. We compare the reconstructed RF from two PPGLMs—one using spike history () and one not using spike history ()—with estimates computed from standard methods of STA and WVK (using only linear kernels). The details of the STA and WVK methods are given in appendix B.

Figure 5 shows a typical comparison of four estimation methods. All provide rate function estimates in each of the 16 directions from the nine trials of spiking activity (see Figure 5a). In Figure 5a, the estimated CIFs are plotted against the raw spike rasters. The CIFs from the and the are smoother than those from the STA and WVK because the former two used smoothing cubic splines as basis functions. The WVK and STA rate function estimates are noisier because neither considers the point-process nature of the spiking activity. The WVK is a least-squares estimate whereas the STA is the simple average of the spike counts over the nine trials. Furthermore, the RF spatial tunings of the and the are sharper than those of the STA and the WVK (see Figure 5b). The spatial RF's peak amplitude of the is typically less than that of the because the latter does not consider spike history. Note that all spatial tuning estimates have stripe-like “artifacts” that are clearly related to the discrete sampling of the bars at the 16 orientations. In addition, the gives sharper orientation tuning measured as a greater ϵ-index value than the , STA, and WVK (see Figure 5c).

. | . | . | . | . |
---|---|---|---|---|

Complete data (9 trials per direction) | 238 (94.9%) | 171 (68.1%) | 154 (61.4%) | 129 (51.4%) |

Training data (6 trials per direction) | 243 (96.8%) | 183 (72.9%) | 162 (64.5%) | 128 (51.0%) |

Validation data (3 trials per direction) | 242 (96.4%) | 179 (71.3%) | 112 (44.6%) | 61 (24.3%) |

. | . | . | . | . |
---|---|---|---|---|

Complete data (9 trials per direction) | 238 (94.9%) | 171 (68.1%) | 154 (61.4%) | 129 (51.4%) |

Training data (6 trials per direction) | 243 (96.8%) | 183 (72.9%) | 162 (64.5%) | 128 (51.0%) |

Validation data (3 trials per direction) | 242 (96.4%) | 179 (71.3%) | 112 (44.6%) | 61 (24.3%) |

Note: The table entries are the number (and percentage) of the neurons whose KS plots are within the 95% CIs, suggesting a close agreement between the model and experimental data.

Of the four models considered, the gives the most accurate description of the spiking activity across all of the 251 V1 neurons in terms of cross-validation KS goodness-of-fit assessments (see Table 1 and Figure 6 for one example). In the validation data, the fits meet the KS goodness-of-fit criterion for 96.4% of the neurons compared with 71.3% of the , 44.6% of the STA and 24.3% of the WVK fits. Therefore, in the subsequent analyses, we compare the results with the conventional STA and WVK methods.

V1 neurons have both excitatory and inhibitory RFs. Two representative examples of spatial RF mapping are shown in Figure 7. In the encoding analysis, consistently obtains a smoother and sharper spatial RF estimate (higher contrast and spatially well defined).

### 3.3. Receptive Field Size, Orientation Tuning, and Bandwidth.

Our PPGLM framework also provides a new criterion for assessing RF size based on
comparing the baseline firing activity with the lower 95% CI of the CIF. The
logic behind this new definition is that activity significantly greater than
baseline can be considered the stimulus-evoked response. Using the new
criterion, we find that the estimate of RF size is asymmetric and larger than
that determined by the *e*^{−1} criterion. As illustrated in Figure 4 for one V1 neuron, the RF size estimates from three
selected moving bar directions are smaller using the old criterion than using
the new one. This is consistently observed across the majority of the
orientations (80.9% among all 16 × 251 orientations; see Figure 8c) of the V1 neurons.

We compare the population RF size and orientation estimates of three methods: , STA, and WVK (see Table 1). For . we consider two criteria for defining RF size: criterion 1,
the standard *e*^{−1} criterion, and criterion 2, defined by the 95% CI of the
estimated CIF. For STA and WVK, we consider only the standard *e*^{−1} criterion. In principle, we can also use the 95% CI
criterion for the WVK (i.e., roughly two standard deviations of the estimate;
see the appendix); however, the least-squares method may produce a negative CIF
estimate at some time points (see the last column of Figure 5a for an illustration), such that its lower 95% CI is
completely within the negative region. In this case, it is not meaningful at all
to use the 95% CI criterion for the WVK method.

As seen from Figure 8a, based on criterion
1, the median RF size estimates from , STA, and WVK are in close agreement (pairwise median tests
between STA and WVK, *n* = 4016 orientations, *P*>0.05). Using criterion 2, has a larger median RF size. The RF size estimates from the using criterion 2 are linearly related to, but significantly
larger than, the estimate from using criterion 1 (see Figure 8c; linear regression, RF_{GLM,crit 2} = 1.066
× RF_{GLM,crit 1} + 0.599), suggesting a clear offset between
these estimates. In the analysis of the orientation tuning (see Figure 8b), the ϵ-estimate from is significantly different from STA and WVK
(*n* = 251 cells, *P* < 0.01,
Kruskal-Wallis one-way ANOVA median test). No significant difference is found
between STA and WVK estimates. In addition, the ϵ-estimate from is nearly twice the estimates from WVK and STA (see Figures 8d and 8e). The average slope (±95% CI) for the linear regression of
WVK versus is 0.543 ± 0.021, and for STA versus it is 0.573 ± 0.023. These findings suggest that formal
statistical modeling of the point-process nature of the neural spiking activity,
combined with tomographic reconstruction and statistically based definition of
the RF property, leads to substantially different inferences about the RF size
and orientation tuning. Note that there is no significant correlation between
the estimated RF size and the orientation tuning. We compare the median RF size
and the ϵ-estimate from 16 directions from 251 neurons; the Pearson
correlation coefficients for the , STA, and WVK methods are −0.1284, −0.1548, and
−0.1122, respectively. None of these correlations is statistically
significant.

*b*(Ringach et al., 2002), we estimate the V1 RF bandwidth

*b*from the ϵ-estimate. A greater ϵ-estimate (a sharper orientation selectivity) yielded a narrower bandwidth. The median value 0.129 (25% quantile: 0.067, and 75% quantile: 0.250, respectively) of the ϵ-estimate among 251 V1 neurons (see Figure 8b) corresponds to a median bandwidth 0.35 Hz (0.39 Hz and 0.30 Hz, respectively).

### 3.4. Experimental Design.

For experiments using moving bar stimuli to estimate visual RF, two design questions arise naturally. What is the minimum number of moving bar directions required to reconstruct the RF with high fidelity? Given a fixed total number of trials, what is the optimal configuration of number of directions and the number of trials per direction?

^{2}= 1/8. The function

*f*(

*x*,

*y*) is properly shifted and scaled (by three parameters

*A*

_{1},

*A*

_{2}, and

*A*

_{3}) such that the ultimate function

*f*(

*x*,

*y*)>0 and has a peak firing rate of 125 Hz and a baseline firing rate of 25 Hz. The RF

*f*(

*x*,

*y*) is discretized in a 1414 × 1414 grid and has a bandwidth of 8 Hz/pixel.

From equation 2.13 we simulate 2 seconds of spike trains in each of 128 evenly spaced directions with nine trials per direction. From these data, we reconstruct the RF using 8, 16, 32, 64, and 128 directions and one to nine trials per direction (see Figure 9 for illustrations). Given a number of directions and a number of trials per direction, we fit PPGLMs to the spiking activity to estimate the Radon transforms in each direction and then applied filtered backprojection to the estimated transforms to reconstruct the RF. We evaluate the quality of the reconstruction by computing the mean-squared reconstruction error (MSRE) between the estimated RF and the true value across all 1414 × 1414 pixels.

As expected, the reconstruction error decreases as the number of directions and
the number of trials per direction increases (see Figure 10a). In examining the MSRE with respect to the total
number of trials, we observe that, provided the condition for the minimum number
of spatial samples is satisfied (at least 16 directions), the error decreases
with an increasing number of trials. However, as the total number of trials
reaches approximately 200, the MSRE approaches an asymptote of approximately 2
× 10^{−5} (see Figure 10b). As the sampling theory suggests, if the high-frequency details
of the RF are required (i.e., larger bandwidth), then the number of directions
is crucial, and oversampling will yield a more accurate reconstruction. Because
in moving bar experiments the neural responses are stochastic, a minimal number
of trials per direction is always required. We find that to achieve an MSRE of 2
× 10^{−5}, a feasible design would be to use 32
directions, with at least 6 trials per direction, or 192 trials for the
experiment, which suggests a different configuration from the 16 directions and
9 trials per direction used in our experiment.

### 3.5. Robustness of the Separability Assumption.

*x*

_{0},

*y*

_{0}) denotes the location of the peak of the gaussian envelope; (

*a*

^{2},

*b*

^{2}) denote the spatial variances along the (

*x*,

*y*) coordinates, respectively; (

*A*, ξ) denote the amplitude and rotation angle of the gaussian envelope, respectively; (

*u*

_{0},

*v*

_{0}) and φ denote the spatial frequencies and phase of the sinusoid, respectively. Taking the 2D Fourier transform of equation 3.3 further yields

Basically, the Gabor filter is spatially localized and can be viewed as a
bandpass filter that favors spatial frequencies within a certain range. Here, we
set *x*_{0} = *y*_{0} = 0, *a* = 50 pixels, *b* = 40 pixels, ξ = π/4, *u*_{0} = *v*_{0} = 1/80 cycles/pixel, and φ = 0. Let and , the spatial frequency BW is about
2*C*/*a* ≈ 0.0188 pixel, and the
orientation BW is about
2tan^{−1}(*C*/*Rb*) ≈
1.173 deg (see Movellan, 2008, for BW
derivations based on the half-magnitude RF profile). Figures 11a and 11b show
the real and imaginary parts of the Gabor filter in spatial domain on an image
of 1024 × 1024 pixels, respectively.

From the real part of the Gabor filter (which is properly shifted such that it is
nonnegative, with a peak firing rate of ∼80 Hz, and a baseline
firing rate of ∼20 Hz), we simulate 2 seconds of spike trains in
32 (or 64) evenly spaced directions in [0, 2π) with four trials per
direction.^{4} Using the separability assumption, we estimate the spatial tuning of the
visual RF and evaluate the quality of the reconstruction by computing the MSRE
between the ground truth (see Figure 11a)
and estimated RF profile (in terms of the pixel intensity of two images). The
estimated results are shown in Figures 11c
and 11d. It appears that despite the use
of the separability assumption, the estimated RF profile remains qualitatively
and quantitatively similar to the true RF. In these two cases, the MSRE values
are 0.0432 and 0.0304, respectively. As seen in Figure 11, although the details of the reconstructed RF
images are partly distorted, the underlying Gabor filter structure has been
captured, suggesting the robustness of the separability assumption.

## 4. Discussion

### 4.1. Tomographic Reconstruction and PPGLM.

By combining statistical modeling of neural spike trains using the PPGLM framework with the principles of tomographic reconstruction, we have presented a new two-step approach to estimating the RFs of V1 neurons from spiking activity recorded in an experiment in which a bar was moved across the visual field at multiple orientations. In the first step, a PPGLM is used to estimate the Radon transform as the CIF of the spike trains recorded across multiple trials in each direction. In the second step, under the linear superposition principle, we invert the estimated Radon transforms to reconstruct the V1 RF using filtered backprojection. This approach takes advantage of the point-process nature of the spiking activity and the highly structured geometry of the experiment. This latter step is analogous to image reconstruction in which use of filtered backprojection is standard practice (Kak & Slaney, 1988; Deans, 1993). Our use of tomographic principles to map V1 RFs is a further extension of previous work by Sun and Bonds (1994) in two important aspects: one is characterization of spiking activity via a PPGLM; the other is to propose a statistical criterion for determining the RF size.

Our PPGLM () that models spike history along with stimulus effect gives the most accurate description of the spiking data compared with a PPGLM that models only the stimulus effect () and the standard STA and WVK algorithms for rate function estimates. There are several reasons that the outperforms the STA and WVK. Instead of treating the spiking activity as a point process, the WVK use a linear gaussian model to relate the visual stimuli to neural responses (Paninski, 2003). The WVK parameters are one of the initial guesses commonly used to start the IRWLS algorithm used to compute the PPGLM parameter estimates (Pawitan, 2001). The STA is an approximation to the WVK. Neither of these methods takes into account spike history. The spike history effect reflects the nature of non-Poissonian firing (Shimokawa & Shinomoto, 2009), as evidenced in most V1 neurons (see Figure 3b). Compared to coarse spike count coding (as in STA and WVK), fine temporal coding contains more information of visual stimuli, as seen by its superior decoding performance in real visual system (Jacobs et al., 2009). It is also now appreciated that intrinsic neuronal dynamics including absolute and relative refractory periods are important for predicting current spiking activity (Truccolo et al., 2005; Truccolo, Hochberg, & Donoghue, 2010). Including both the stimulus and spike history in the PPGLM helps explain its superior performance relative to the STA and WVK in terms of the accuracy of the data description and the model generalizability, as evaluated, respectively, in the goodness-of-fit and cross-validation analyses. Since neither STA nor WVK meets the sampling criteria imposed by tomographic reconstruction, they consequently suffer a loss of reconstruction accuracy.

### 4.2. Receptive Field Size and Uncertainty.

The RF of a V1 neuron defines a region of space that is sensitive to a visual
stimulus. Such a region can have rather complex dependencies of the stimulus
sensitivity on the exact position and type of the stimulus. Despite this
complexity, an RF is often experimentally described with rather simple measures,
such as the RF size and the orientation specificity. Our analysis, based on the
maximum likelihood estimation of the PPGLM, provides a new definition of RF
size. This definition is more reasonable than the standard one because it uses
an accepted statistical criterion. The spatial extent of the RF is determined by
the region over which the lower bound of 95% CI of the CIF is greater than the
baseline. In contrast, the *e*^{−1} criterion assumes a gaussian shape for each RF and does not
consider the statistical properties of the data. In particular, it is possible
to set the *e*^{−1} boundary beyond the point where there is a statistically
significant increase in the CIF relative to baseline. It is also possible to
have statistically significant activity beyond the boundary defined by the *e*^{−1} criterion. This latter case is what we have observed. In
the analyses, the RF sizes based on the new criterion are
systematically larger compared with the RF sizes estimated by the *e*^{−1} criterion (see Figure 8c). The RFs also have greater orientation specificity than those
estimated by either the STA or the WVK (see Figures 8d and 8e).

One has to keep in mind that defining the RF size requires determining a threshold or minimal sensitivity given a certain type of a stimulus. Based on the signal detection theory, it is clear that changing the threshold or minimal sensitivity results in changes of the RF size. Given the typical shape of the RFs in V1, a stricter criterion that requires a larger sensitivity will typically result in a smaller RF size and vice versa. We expect the same for the threshold criterion and the CI criterion discussed in this article. For both criteria, the precise change of the RF size as a function of the threshold will depend on the exact shape of the spatial sensitivity. In the case of the CI criterion, the threshold itself also depends on the number of data used for fitting the PPGLM, so more data usually lead to smaller CIs and larger RF size. Therefore, the RF size estimate should be seen as not only an intrinsic property of the V1 neurons, but also an induced quantity of uncertainty about the data. The confidence of uncertainty is defined a priori and is often dependent on the number of data available.

Our proposed definition using the 95% CI criterion applied to the 251 V1 neurons analyzed in this study suggests that V1 RFs may be larger than currently appreciated. Presumably if we increase the threshold of confidence level (say, from 95% to 99%), the estimated RF size will become smaller, and the gap between these two criteria might become smaller. However, with the same level of confidence but with more spiking data collected from more experimental trials, the estimated RF size may also become smaller (due to reduced confidence bounds or reduced uncertainty for the estimated parameters).

### 4.3. Regularization.

Maximum likelihood estimation is known to subject to overfitting by using an
undesirable or unnecessary complex model. Due to the small number of trials
being collected, we have not extensively investigated the regularization option
for all the methods (WVK and PPGLM). Determining optimal regularization
parameters and regularization schemes (e.g., *L*_{1} norm or *L*_{2} norm) is a nontrivial estimation problem in the presence of small
sample size (Hastie, Tibshirani, & Friedman, 2009). Based on the goodness-of-fit assessment using the KS plot and
cross-validation, we believe that the overfitting issue is not a concern for .

### 4.4. Experimental Design.

Finally, our paradigm suggests an approach to determining the optimal design of moving bar experiments to maximize the accuracy of RF reconstruction. Given a fixed number of projections, the signal bandwidth that can be faithfully estimated is bounded above (Natterer, 2001). For the smoothing spline model, the number of control points per direction is bounded above by the number of moving bar projections multiplied by the time interval duration per direction. Therefore, in order to improve the estimation resolution of the RFs, it is necessary to increase the number of moving bar projections to meet the sampling criteria. Given a fixed number of trials, an experimental design can be chosen that balances the trade-off between accurate RF reconstruction and the number of directions and the number of trials per direction required to achieve it (see Figures 9 and 10).

### 4.5. Future directions.

Our findings suggest several directions for future investigations. We can directly compare V1 RF properties (size and orientation tuning) obtained using different stimulus paradigms and examine the plasticity or sensitivity of V1 RF to distinct visual stimuli (Pettet & Gilbert, 1992; DeAngelis et al., 1995; Wandell & Smirnakis, 2009). Since our moving-bar tomographic reconstruction framework is not limited by the assumption of a linear relationship between the visual stimuli and their neural responses, it can also be used to examine the nonlinear stimulus-response relationship of neurons in other visual areas, such as V2. Alternatively, the principle can be extended to estimate the spatiotemporal RF profile (Rust, Schwartz, Movshon, & Simoncelli, 2005). In addition, instead of using filtered backprojection, alternative computational approaches for tomographic reconstruction (Herman, 2009; Mao, Fahimian, Osher, & Miao, 2010) can be used to improve the reconstruction accuracy and reduce the minimum number of moving bar directions required. Finally, our current RF model is rather simplified and limited by several assumptions. It would be interesting to explore the possibility of estimating the spatiotemporal visual RF (Theunissen et al., 2001; Stanley, 2002; Ringach, 2004; Nishimoto et al., 2005) using the same or a modified experimental paradigm. These studies will be topics of future reports.

### 4.6. Limitations.

Finally, there are some limitations in using the moving bar stimuli paradigm. Considering a 0.2° bar width, the size of the bar width might be comparable to the full RF of some V1 cells; therefore, it might be too coarse to reconstruct the fine internal structure of RF. In addition, at the eccentricity of 2° to 5°, the fixation noise is comparable to the RF size, which might cause blurring in the reconstruction. Despite these inherent limitations caused by the original moving bar experiments, we still believe that our proposed analysis approach provides an alternative computational solution for mapping visual RFs using moving bar stimuli.

In addition, in this article, we estimate only the orientation tuning, not the directional tuning property of the V1 neurons. If V1 neurons are directionally selective (de Valois, Yund, & Hepler, 1982; Livingstone, 1998; Livingstone & Conway, 2003), then our approach would yield some biases in the RF estimate. This is due to the fact that the Radon transform is defined in the symmetric plane with orientation coverage of [0, π) (Kak & Slaney, 1988; see also appendix A); therefore, the orientation selectivity is implicitly assumed. Despite this simplified assumption (in addition to the assumption of separability of spatial and orientation tunings), our approach still provides a useful tool to assess the visual RF properties (complementary to the STA and WVK approaches).

Since the moving bar experiment remains a classic paradigm for mapping visual RFs in awake monkeys, we believe that our proposed approach and the investigation of experimental design would provide valuable information for neuroscientists. This can also be used as a complementary approach to compare the RF estimate obtained from the other experimental paradigms based on either random white noise stimuli or m-sequences.

## Appendix A: Continuous Radon Transform

The Radon transform (Radon, 1917) is a method that has been widely used in various inverse reconstruction problems, such as computerized tomography, nuclear magnetic resonance, optics, astronomy, and geophysics (Kak & Slaney, 1988; Deans, 1993). The common task of these problems is to reconstruct a 2D or 3D function based on its lower-dimensional (1D or 2D) projections, which are treated as samples of the Radon transform.

*f*(

*x*,

*y*) be a continuous, band-limited function defined on a 2D Euclidean space . The Radon transform of

*f*along a direction θ ∈ [0, 2π], denoted as , is defined by a set of line integrals along the projection direction θ at a distance

*t*of the origin. Mathematically, the line integral is written as where (

*t*,

*s*) are defined by the rotated version of (

*x*,

*y*) as follows: Here,

*t*=

*x*cos θ +

*y*sin θ defines a line in the 2D plane, which is specified by an angle θ and a distance

*t*from the origin. Therefore, the Radon transform is a 1D line projection of

*f*(

*x*,

*y*) for a given direction θ. The underlying mathematical theory for tomographic reconstruction is the projection-slice theorem. Let and denote the 1D and 2D Fourier transforms, respectively. The 2D Fourier transform of

*f*(

*x*,

*y*) is written as where , and {

*u*,

*v*} are the spatial frequencies. Let

*u*=

*w*cos θ and

*v*=

*w*sin θ, which convert the Cartesian coordinate system (

*u*,

*v*) in the frequency domain to a polar coordinate system (

*w*, θ); we then obtain Because of

*t*=

*x*cos θ +

*y*sin θ (see equation A.2), it follows that From equation A.1, equation A.5 is further written as That is, the 2D Fourier transform of a function

*f*(

*x*,

*y*) is identical to the 1D Fourier transform of the Radon transform . Note that equation A.6 is defined in the polar coordinate system (

*w*, θ) of the frequency domain. To reconstruct

*f*(

*x*,

*y*), we apply the inverse 2D Fourier transform to equation A.6, which yields

*u*=

*w*cos θ,

*v*=

*w*sin θ, and d

*u*d

*v*=

*w*d

*w*dθ, equation A.7 is rewritten as In light of the Fourier transform property, , equation A.8 is further written as where , and in which the frequency response |

*w*| defines a so-called Ramp filter. Hence, the reconstruction is viewed as a filtered backprojection process: a filtering operation (see equation A.10) combined with a backprojection (see equation A.9).

### A.1. Numerical Issues of Filtered Backprojection.

In terms of discrete implementation of the inverse Radon transform, it is worth pointing out several additional important numerical issues that are often encountered in practice (some of which might not be crucial in the computerized tomography problem but are pivotal in the RF mapping problem):

- •
*Ramp filter.*The original Ramp filtering is defined in continuous time. Due to discrete implementation, the exact zero frequency filtering cannot be done, so some very low-frequency components might be cancelled out. One of the solutions to alleviate this problem is to implement the Ramp filter in the time domain instead of in the frequency domain, namely, to zero-pad the filter as well as the Radon transform (Kak & Slaney, 1988), followed by FFT. This is aimed at avoiding the aliasing in circular convolution and replacing the circular convolution with a nonperiodic convolution. - •
*DC leakage.*If the 2D RF profile has a nonzero baseline, the DC leakage phenomenon will occur due to the discrete Fourier transform in . This issue will become more severe because of the finite projection effect (the fewer the projections, the more severe is the problem). To tackle this issue, we came up with a practical solution. Because the inverse Radon transform involves only linear operations, the linear superposition principle will hold. Therefore, we can first estimate the DC component*c*(the global mean value) of the reconstructed RF image and apply the Radon transform to a constant DC image (with the same image size), from which we obtain . Before applying the inverse Radon transform, we subtract the original Radon transform by , and finally, we apply the inverse Radon transform to , from which is obtained, and we estimate the RF by adding back the constant for every pixel (*x*,*y*). In practice, this trick is quite effective in reducing the DC leakage effect. - •
*Boundary effect.*In the discrete inverse Radon transform, because of finite approximation and interpolation, the boundary values in the reconstructed RF images are less reliable (typically overestimated). In practice, we may simply ignore the estimated boundary values and display only the central image.

## Appendix B: STA and WVK for RF Estimation

A visual RF can be viewed as a linear transfer function that maps the visual stimulus input to a neuronal response output. The RF characterizes the basic firing property of a neuron, either excitatory or inhibitory. In the experimental setup of mapping a 2D visual RF, we assume that the RF space has a size of ℓ × ℓ pixels.

*j*= 1, …,

*J*denote the number of bar projections. In the

*j*th direction and at the

*k*th time bin, let be a vectorized visual stimulus field input (represented as a 1 × ℓ

^{2}row vector), and let

*n*

_{j,k}denote the corresponding observed spike count (neural response output). Let denote a concatenated matrix that consists of

*K*visual stimuli observed from

*K*time instants at the

*j*th direction (where

^{⊤}denotes the transpose operation); let denote a column vector with the concatenated neural response output at the

*j*th direction. We further construct an ensemble stimulus matrix that combines all

*J*directions as well as an ensemble response vector , such that The linear neural encoder relates the stimulus input with the neural response output via a linear regressor , where is a parameter vector to be estimated. Upon rematrization of into an ℓ × ℓ matrix, we obtain the 2D RF estimate projected onto an ℓ × ℓ pixel space. Equation B.1 defines a linear representation between the input and output , and the parameter vector serves as a linear transfer function.

^{5}The variance of the estimate

*ξ*_{WVK}is where (the denominator

*KJ*− ℓ

^{2}denotes the statistical degrees of freedom).

*N*denote the total number of spikes, and let be an ℓ

^{2}× ℓ

^{2}diagonal matrix with

*N*in all diagonal entries. Then the STA estimate is a special case of equation B.2 by letting , where

*c*= 1/

*N*denotes a rate constant, which multiplies the vector to yield the STA estimate. That is, STA computes the normalized spike counts across time related to the stimulus; hence, it can be interpreted as coding a preferred stimulus of the neuron. In the so-called whitened STA or rotated STA (RSTA) (Paninski, 2003), the regressor vector is weighted by a full matrix rather than a constant: , where denotes the sample covariance matrix computed from the stimulus samples. By simple linear algebra, we can see that the RSTA estimate is a scaled version of the WVK solution.

Next, in the context of moving bar experiments, we would like to examine the STA and
WVK estimates (given data collected from all moving bar directions) in relation to
their individual estimates given data collected in each direction. Put another way,
can the STA or WVK estimate be composed into a linear sum of *J* estimates computed separately from each moving bar direction?

*N*= ∑

_{j}^{K}

_{k=1}

*n*

_{j,k}denote the number of total spike counts in the

*j*th moving bar direction, and let

*N*= ∑

^{J}

_{j=1}

*N*and

_{j}*c*= 1/

_{j}*N*. We rewrite the constant

_{j}*c*in equation B.4 as Hence,

*c*is equal to 1/

*J*fraction of the harmonic mean of

*c*obtained from all

_{j}*J*directions. By decomposing the estimate

*ξ*_{STA}in terms of individual estimates computed from each moving bar direction, equation B.4 can be rewritten as where denotes the individual estimate from the

*j*th direction. Since the harmonic mean and the arithmetic mean are generally different in value, we know from equation B.4 that . However, in the special case when

*N*or

_{j}*c*is identical across all directions (i.e., the neuron firing is invariant to all orientations), the STA estimate is purely a linear average of the individual STA estimates from each direction.

_{j}*j*. In the moving bar experiment, if the bar length is greater than the diameter or maximum span of the visual field, this special condition could be satisfied since the area coverage of the moving bar, , is exactly the same at each direction in a circular-shape visual field (due to circular symmetry).

Note that in equation B.7, the matrix has to be of full rank for all *j*th directions,
which typically requires a large number of *K* for each direction.
Because of the least-squares fit, the elements of are not always nonnegative (see, e.g., Figure 5a, right-most column), and the least-squares estimate
might be sensitive to noise or outliers in , as reflected in the worst performance in goodness of fit (see
Table 1).

## Appendix C: Data Recordings and Visual Stimulus Setup

The experimental procedures were approved by the National Committee on Animal Welfare
in compliance with the guidelines of the European Community for the care and use of
laboratory animals. Two adult female rhesus monkeys (*Macaca
mulatta*) were used in this study. Neuronal spiking activity was recorded
from awake and head-fixed monkeys in the opercular region of V1 (2–5°
of eccentricity) and, on some occasions, from the superior bank of the calcarine
sulcus (8–12° of eccentricity). The receptive fields were first
visualized using a method that was developed by Mario Fiorani in the Biophysics
Institute Carlos Chagas Filho at the Federal University of Rio de Janeiro, Brazil
(Fiorani et al., 2001).

Each trial started with an appearance of a red fixation point (0.15°;
luminance: 10 cd·m^{−2}), and the monkeys were trained to
press a lever within 700 ms and to maintain their gazes within a virtual
window (window size: 1°) centered on the fixation point. The task ended by
randomly changing the color of the fixation point from red to green between 2500 and
4000 ms after the trial onset. To obtain a juice reward, monkeys had to
release the lever within 500 ms after the color change in the fixation point.
Trials were discarded whenever fixation was interrupted. The monkeys’ eye
positions were tracked continuously by a search coil system (DNI and Crist
Instruments) with a 500 Hz sampling rate.

Visual stimuli were generated as sequences of computer-generated bitmap images with
an interface developed in LabVIEW (National Instruments) and were presented as
movies (frame rate: 100 frames/s) using a standard graphic board (GeForce
6600-series, NVIDIA) controlled by ActiveSTIM (www.activestim.com). The CRT monitor (CM813ET, Hitachi) was gamma
corrected and spanned a visual angle of 36° × 28°. The visual
stimulus was a high-contrast light bar (luminance: 50
cd·m^{−2}; bar width: 0.2°) that was moved with a
constant velocity (14.9°/s) in the visual field (21.8° ×
21.8°). The bar stimulus was presented in random order in 16 different
orientations separately and evenly by 22.5°. Each trial lasted 2 seconds
(during which the bar traveled inside the visual field), where *t* = 0 and *T* = 2 s, respectively, corresponded to the
moment when the bar's position overlaid two corners of the visual field in the
diagonal and *t* = 1 corresponded to the center of the field
(see Figure 1b).

Quartz-insulated tungsten-platinum electrodes (∼80 μm diameter, 0.3–1.0 MΩ impedance; Thomas Recording) were used to record the extracellular neural activity from three to five sites in both superficial and deep layers of the striate cortex (digitally bandpass filtered, 0.7–6.0 kHz; Plexon). Neuronal spikes were detected by amplitude thresholding, which was set interactively based on online visualization of the spike waveforms. Spike events and corresponding waveforms were sampled at 32 kHz (spike waveform length: 1.2 ms). Single-unit activity was separated using an interactive custom-designed spike-sorting software. Experimental trials with artifacts were rejected during which either monkey did not maintain fixation or showed no response or incorrect behavior. Upon trial rejection, the complete data sets were balanced for all conditions such that all recorded V1 neurons contained exactly nine trials per direction.

## Acknowledgments

We thank two anonymous reviewers for valuable comments. This work was supported by the U.S. National Institutes of Health, grant DP1-OD003646 (to E.N.B.), and the European Union, FP7-1CT, grant 240763 (to G.P.). Some preliminary results were presented in COSYNE’09, Salt Lake City, Utah (Pipa, Chen, Neuenschwander, Lima, & Brown, 2009).

## References

*Computational and Systems Neuroscience (COSYNE’09)*

## Notes

^{1}

A more realistic model for the V1 RF is the 2D Gabor filter (Daugman, 1980; Jones & Palmer, 1987), which has coupled or nonseparable spatial and orientation tunings. However, applying the Radon transform to the nonseparable Gabor filter model would make the analysis nontrivial (see equation 2.1).

^{2}

The number 1414 = arises from the fact that we treat the moving bar 2 s traveling time (binned as a 2000-dimensional vector) as the diagonal in the rectangular stimulus field (see Figure 1b).

^{3}

Alternatively, as suggested by one reviewer, one could apply the reverse correlation or STA method to reveal the period between the stimulus onset and the cortical response that is either significantly correlated or not correlated, and based on that to use it as a rough estimate of the latency.

^{4}

Assuming circular symmetry, this is equivalent to sampling from 16 (or 32) evenly spaced directions in [0, π) with eight trials per directions.

^{5}

This is obtained by solving the Yule-Walker equation. Multiplying to both sides of equation B.1, we approximate the autocorrelation and cross-correlation as and , respectively, and it follows that , and the linear least-squares estimate yields the optimal Wiener solution.

## Author notes

* Gordon Pipa and Zhe Chen contributed equally.

Color versions of figures in this article are provided in the online supplement, available at http://www.mitpressjournals.org/doi/suppl/10.1162/NECO_a_00334.