The anterior intraparietal area (AIP) of macaques contains neurons that signal the depth structure of disparity-defined 3-D shapes. Previous studies have suggested that AIP's depth information is used for sensorimotor transformations related to the efficient grasping of 3-D objects. We trained monkeys to categorize disparity-defined 3-D shapes and examined whether neuronal activity in AIP may also underlie pure perceptual categorization behavior. We first show that neurons with a similar 3-D shape preference cluster in AIP. We then demonstrate that the monkeys' 3-D shape discrimination performance depends on the position in depth of the stimulus and that this performance difference is reflected in the activity of AIP neurons. We further reveal correlations between the neuronal activity in AIP and the subject's subsequent choices and RTs during 3-D shape categorization. Our findings propose AIP as an important processing stage for 3-D shape perception.
Humans, as well as monkeys, can effortlessly recognize and categorize 3-D shapes. Current evidence suggests that this capacity arises from a network of areas in both the dorsal and the ventral visual stream (Anzai & DeAngelis, 2010; Parker, 2007), as well as frontal areas (Theys, Pani, van Loon, Goffin, & Janssen, 2012; Joly, Vanduffel, & Orban, 2009; Ferraina, Paré, & Wurtz, 2000). Yet the exact role played by each area or visual stream remains a matter of ongoing research (Neri, 2005).
Recent findings point to the ventral stream as an important basis for disparity-defined 3-D structure perception (Georgieva, Peeters, Kolster, Todd, & Orban, 2009; Yamane, Carlson, Bowman, Wang, & Connor, 2008; Chandrasekaran, Canon, Dahmen, Kourtzi, & Welchman, 2007; Uka, Tanabe, Watanabe, & Fujita, 2005; Janssen, Vogels, & Orban, 1999, 2000). For example, area V4 has been implicated in fine disparity discrimination using both correlational and causal techniques (Shiozaki, Tanabe, Doi, & Fujita, 2012). Further along the ventral stream, the activity of neurons in the lower bank of the STS in the anterior inferior temporal (IT) cortex correlates with perceptual decisions made by monkeys during 3-D shape categorization (Verhoef, Vogels, & Janssen, 2010), and microstimulation of these neurons strongly and predictably influences 3-D shape categorization behavior (Verhoef, Vogels, & Janssen, 2012).
However, neurons in the dorsal visual stream also signal 3-D structure information (Theys, Srivastava, van Loon, Goffin, & Janssen, 2012; Srivastava, Orban, De Mazière, & Janssen, 2009; Preston, Li, Kourtzi, & Welchman, 2008; Durand et al., 2007; Nguyenkim & DeAngelis, 2003; Tsao et al., 2003; Tsutsui, Jiang, Yara, Sakata, & Taira, 2001), and neurons in several dorsal stream areas (e.g., MT, MST, CIP, and LIP) have been implicated in perceptual decisions (Swaminathan & Freedman, 2012; Hanks, Ditterich, & Shadlen, 2006; Tsutsui et al., 2001; Britten & van Wezel, 1998; DeAngelis, Cumming, & Newsome, 1998; Salzman, Britten, & Newsome, 1990). These findings raise the possibility that some dorsal stream areas, such as the anterior intraparietal area (AIP) with its 3-D shape-selective neurons (Srivastava et al., 2009), are also involved in 3-D shape perception.
A previous study reported choice-related neuronal activity in AIP during disparity-defined 3-D shape categorization (Verhoef et al., 2010). This choice-related activity seemingly occurred after the perceptual decision had already been made. However, in that study, the decision time was estimated indirectly, based on small displacements of the average eye position in the direction of the upcoming response saccade. Here we reexamine AIP's choice-related neuronal activity using an RT version of the 3-D shape categorization task, which provides a direct upper bound on the decision time on a trial-by-trial basis.
Recent work by Nienborg and Cumming (2014) suggests that choice-related activity is more easily discovered when the brain region examined has an anatomical map for the feature tested in the task (see also Mayo & Verhoef, 2014). We accordingly measured choice-related activity in clusters of AIP neurons with a similar 3-D shape preference.
We show that the monkeys' 3-D shape discrimination performance depends on the stereo coherence and position in depth of the stimulus and that this performance variability is reflected in the activity of AIP neurons. We further demonstrate that AIP exhibits early choice-related activity during the RT version of the 3-D shape discrimination task.
Subjects and Surgery
One male (M1) and one female (M2) rhesus monkey (Macaca mulatta) served as subjects. Both monkeys participated in an earlier study (Verhoef et al., 2012). Recording chambers were implanted under isoflurane anesthesia and aseptic conditions. Each monkey received a recording chamber (Crist Instrument, Hagerstown, MD) that was positioned with the electrode axis ∼40° from vertical, over the right (M1) or left (M2) anterior intraparietal sulcus (IPS; Figure 1C, D). The tilted recording chambers allowed electrodes to penetrate the lateral bank of the IPS approximately orthogonal to the cortical surface. All surgical procedures and animal care were approved by the KU Leuven ethics committee and in accordance with the European Communities Council Directive 2010/63/EU. Structural MRI (0.6 mm slice thickness) using glass capillaries filled with a 1% copper sulfate solution inserted into several grid positions and the pattern of gray-to-white matter transitions confirmed that the recordings were made in the anterior part of the lateral bank of the IPS (Horsley–Clark coordinates: M1: −0.6–3.6 mm anterior, 13–19 mm lateral; M2: 1.8–3.8 mm anterior, 12–16 mm lateral).
Stimuli and Tasks
3-D Structure Categorization Task
We trained monkeys to discriminate convex and concave 3-D structures. The stimulus set consisted of static random-dot stereograms with eight different 2-D shape outlines (e.g., circle, ellipse, square, etc.; see Figure S1 in Verhoef et al., 2012; size: ∼5°). Stimuli were presented foveally on a gray background. Foveally presented stimuli are generally effective in engaging AIP neurons as has been shown in previous studies (Romero, Van Dromme, & Janssen, 2012; Srivastava et al., 2009). The depth structure was defined solely by horizontal disparity as a 2-D radial basis Gaussian surface (standard deviation = 48 pixels, 0.96°) which could be either convex or concave (maximal disparity amplitude: 0.15°). The dots consisted of Gaussian luminance profiles (width: 7 pixels; height: 1 pixel; horizontal standard deviation: 0.7 pixels; 1 pixel ≈ 0.02°). For each dot, the mean of the Gaussian luminance profile could be positioned along a continuous axis resulting in perceptually smooth stereograms with subpixel resolution. Stimuli were presented at three positions in depth, that is, before (Near), behind (Far), or at (Fix) the fixation plane (±0.23° depth variation). Stimuli were presented dichoptically using a double pair of ferroelectric liquid crystal shutters (two superimposed shutters in front of each eye; Displaytech, Longmont, CO). The four shutters operated at 60 Hz and were synchronized with the vertical retrace of the display monitor (vertical refresh rate, 120 Hz) equipped with a fast decay P46 phosphor (VRG). Each eye was therefore stimulated at 60 Hz. There was no measurable crosstalk between the eyes. The viewing distance was 86 cm. Task difficulty was manipulated by varying the percentage of dots defining the surface, that is, the stereo coherence. Dots that were not designated as defining the surface were assigned a disparity drawn randomly from a uniform distribution (support = [−0.50°, 0.50°]). 
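The stimulus construction described above can be illustrated with a minimal Python sketch. It is not the authors' stimulus code: the function names, the sign convention (negative disparity taken as nearer to the observer), and the reading of the 0.15° amplitude as the peak of the Gaussian are assumptions made for illustration only.

```python
import math
import random

SIGMA_DEG = 0.96        # standard deviation of the Gaussian surface (from the text)
AMPLITUDE_DEG = 0.15    # maximal disparity amplitude (from the text)
NOISE_RANGE_DEG = (-0.50, 0.50)  # support of the uniform noise distribution (from the text)
# Assumed sign convention: negative disparity = in front of the fixation plane.
PEDESTAL_DEG = {"Near": -0.23, "Fix": 0.0, "Far": 0.23}


def surface_disparity(x_deg, y_deg, shape, position):
    """Disparity of a signal dot at (x, y) on a convex or concave Gaussian surface."""
    sign = -1.0 if shape == "convex" else 1.0  # assumed: convex bulges toward the observer
    bump = AMPLITUDE_DEG * math.exp(-(x_deg ** 2 + y_deg ** 2) / (2.0 * SIGMA_DEG ** 2))
    return PEDESTAL_DEG[position] + sign * bump


def dot_disparities(dots, shape, position, coherence, rng):
    """Assign a disparity to every dot; a fraction `coherence` of dots follows the surface."""
    n_signal = round(coherence * len(dots))
    signal_idx = set(rng.sample(range(len(dots)), n_signal))
    disparities = []
    for i, (x, y) in enumerate(dots):
        if i in signal_idx:
            disparities.append(surface_disparity(x, y, shape, position))
        else:
            # Noise dots: disparity drawn uniformly from the stated support.
            disparities.append(rng.uniform(*NOISE_RANGE_DEG))
    return disparities
```

For example, with 200 dots and `coherence=0.2`, 40 dots lie on the surface and the remaining 160 receive random disparities.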
For each experiment, we used 20 different random-dot patterns per signal strength. Each trial started with a prestimulus interval, the duration of which was randomly selected from an exponential distribution (mean = 570 msec, minimum duration = 250 msec, maximum duration = 1500 msec). Following the fixation period, a stimulus with a randomly chosen shape (convex or concave), position in depth (Far, Fix, Near), and stereo coherence (0%, 15%, 20%, 30%, 100%) was presented on the monitor. After stimulus onset, the monkey was free to indicate its choice at any time by means of a saccade to one of two choice targets located at 6° eccentricity from the fixation point (Figure 1A). Monkeys were required to maintain fixation (fixation window < 1.5° on a side) on a small fixation point until they initiated their choice saccade. Once the monkey left the fixation window, the stimulus was extinguished. Choice targets remained visible throughout the trial until one of the targets had been fixated for 300 msec. The few trials in which monkeys fixated the choice target less than 300 msec or switched between choice targets were not rewarded and were not included in the data set. Correct responses were followed by a liquid reward. The 0% signal strength trials were randomly rewarded with a probability of .5.
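The trial randomization described above can be sketched as follows. The text does not state how the exponential distribution was bounded; this sketch assumes simple rejection sampling (redrawing until the value falls within the stated limits), so the mean of the accepted durations will differ somewhat from 570 msec. All names are hypothetical.

```python
import random

SHAPES = ("convex", "concave")
POSITIONS = ("Far", "Fix", "Near")
COHERENCES = (0, 15, 20, 30, 100)  # percent stereo coherence (from the text)


def prestimulus_interval_ms(rng, mean=570.0, lo=250.0, hi=1500.0):
    """Draw an exponential duration, redrawing until it lies in [lo, hi] (assumed rule)."""
    while True:
        t = rng.expovariate(1.0 / mean)
        if lo <= t <= hi:
            return t


def draw_trial(rng):
    """Randomly choose the prestimulus duration and stimulus condition for one trial."""
    return {
        "prestim_ms": prestimulus_interval_ms(rng),
        "shape": rng.choice(SHAPES),
        "position": rng.choice(POSITIONS),
        "coherence": rng.choice(COHERENCES),
    }
```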
Target Onset Asynchrony Saccade Task
Each trial of the target onset asynchrony saccade (TOAS) task started with a prestimulus interval, the duration of which was randomly selected from an exponential distribution (mean = 570 msec, minimum duration = 250 msec, maximum duration = 1500 msec). Following the prestimulus period, two choice targets appeared on the screen with a variable time delay (TOA) between their onsets. The monkey was required to make a saccade to the choice target that appeared first (Figure 1B). In each trial, a TOA was randomly chosen, ranging between 8 and 150 msec, such that on any trial the monkey could predict neither the location of the first target nor the TOA. The choice target locations matched those of the categorization task, that is, 6° eccentricity from the fixation point.
Recording of Neuronal and Eye Position Signals
Extracellular recordings were made using tungsten microelectrodes (impedance, ∼0.7–1.2 MΩ at 1 kHz; FHC, Bowdoin, ME). Details of the physiological recording methods have been described previously (Verhoef et al., 2010). The positions of both eyes were sampled at 500 Hz using an EyeLink II (SR Research, Ottawa, Ontario, Canada) system.
We sampled AIP along electrode penetrations approximately orthogonal to the cortex in steps of ∼100–150 μm. For each electrode position in a penetration, we first selected the optimal (within our stimulus set) 2-D shape outline (circle, ellipse, square, etc.) for the multiunit activity (MUA) of that site using a passive fixation task. With the optimal 2-D shape outline thus chosen, we then tested the 3-D structure selectivity of a site by presenting 100% stereocoherent concave and convex 3-D structures at one of three different positions in depth (i.e., Near, Fix, Far). The electrode was subsequently retracted to the center of the 3-D structure-selective cluster where we again verified that the MUA still displayed the same 3-D structure preference before starting the 3-D structure categorization task. We used the optimal 2-D shape outline for the MUA at the cluster center throughout the categorization task. We adopted the following criterion for defining a 3-D structure-selective cluster: The center position of a cluster had to be neighbored by MUA positions having the same 3-D structure selectivity for at least 125 μm in both directions (i.e., upward and downward). Similar criteria have been used in previous studies (Verhoef et al., 2012; Hanks et al., 2006; Uka & DeAngelis, 2006; Salzman et al., 1990). To maximize the number of trials in the main experiment (i.e., categorization and TOAS) and to minimize the cortical damage inflicted upon the positions with significant 3-D structure selectivity, we did not sample the entire extent of a 3-D structure-selective cluster. Hence, our data allow us to state that 3-D structure-selective clusters exist in AIP but do not allow us to estimate the size of the 3-D structure selectivity clusters in AIP.
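The cluster criterion can be expressed as a short check over the sites of one penetration. This is a sketch under stated assumptions: sites are listed in order of depth, each site carries a preference label (`"convex"`, `"concave"`, or `None` for nonselective), and the function name is hypothetical.

```python
def is_cluster_center(depths_um, prefs, i, min_extent_um=125.0):
    """True if site i is flanked by same-preference sites for >= min_extent_um both ways.

    depths_um: electrode depths of the sampled sites, sorted in ascending order.
    prefs:     3-D structure preference per site ("convex", "concave", or None).
    """
    pref = prefs[i]
    if pref is None:
        return False

    def extent(step):
        # Walk along the penetration while neighboring sites share the preference.
        j = i
        while 0 <= j + step < len(prefs) and prefs[j + step] == pref:
            j += step
        return abs(depths_um[j] - depths_um[i])

    return extent(-1) >= min_extent_um and extent(+1) >= min_extent_um
```

With sites sampled every 125 μm, a site qualifies only if at least one neighbor on each side shares its preference.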
The population neurometric function was computed using linear support vector machines (SVM; spider package) based on 45 MUA sites. For every MUA site, we randomly selected seven trials per 3-D structure (n = 14 trials for convex and concave stimuli combined) for training and one trial per 3-D structure, not used for training, for testing. Fourfold cross validation on a separate sample of four trials per 3-D structure (n = 8 trials for convex and concave stimuli together) was used to optimize the soft margin parameter for each SVM model. This procedure was repeated 1000 times, and the average proportion of correct classification on test trials across all repetitions was taken as the population neurometric performance. We repeated this procedure for each combination of stereo coherence and position in depth, and the population neurometric performance for each position in depth was subsequently fitted with a cumulative Weibull function. This procedure allows the possibility of a different hyperplane, and hence a different population read-out, per stereo coherence and position in depth. It therefore gives an approximation of the best possible performance that can be achieved using the data. The population neurometric threshold was defined as the stereo coherence of the fitted function at which 75% accuracy was achieved.
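The exact parametrization of the Weibull function is not given in this text. As an illustration only, the sketch below assumes a common two-alternative form, p(c) = 1 − 0.5·exp(−(c/α)^β), fits it by a coarse least-squares grid search (a stand-in for whatever fitting routine was actually used), and reads off the 75% threshold analytically as c = α·(ln 2)^(1/β).

```python
import math


def weibull(c, alpha, beta, chance=0.5):
    """Assumed cumulative Weibull rising from `chance` at c = 0 toward 1."""
    return 1.0 - (1.0 - chance) * math.exp(-((c / alpha) ** beta))


def threshold_75(alpha, beta):
    """Stereo coherence at which the assumed function reaches 75% correct."""
    return alpha * math.log(2.0) ** (1.0 / beta)


def fit_weibull(coherences, p_correct):
    """Coarse least-squares grid search; returns (alpha, beta)."""
    best = None
    for a10 in range(10, 1001):        # alpha from 1.0 to 100.0 in steps of 0.1
        alpha = a10 / 10.0
        for b10 in range(5, 51):       # beta from 0.5 to 5.0 in steps of 0.1
            beta = b10 / 10.0
            err = sum((weibull(c, alpha, beta) - p) ** 2
                      for c, p in zip(coherences, p_correct))
            if best is None or err < best[0]:
                best = (err, alpha, beta)
    return best[1], best[2]
```

Under this assumed form, the 75% point depends on both parameters, which is why the threshold is computed from the fitted function rather than taken directly as α.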
Choice probabilities (CPs) were calculated for each combination of stereo coherence, 3-D structure, and position in depth based on the neuronal responses recorded during 3-D structure categorization (Britten, Newsome, Shadlen, Celebrini, & Movshon, 1996). For this purpose, the 3-D structure preference was obtained from the neuronal responses during passive fixation (see above). CPs were derived from the average spike rate in the time interval from 100 msec after stimulus onset until the monkey's RT. For each 3-D structure-selective site, the CP was tested for a significant deviation from .5 using a permutation test (α = .05). Grand CPs were obtained for each 3-D structure-selective site by first z-scoring the neuronal data within each condition (with ≥5 trials per choice and an RT of ≥150 msec; no 100% stereo coherence conditions) and then pooling the z-scored data from all conditions into a single data set, from which the grand CP was calculated. z-Scoring followed the method proposed by Kang and Maunsell (2012), giving a grand CP corrected for biases that occur due to unbalanced samples. We examined the time course of the grand CP using a sliding window analysis in which the grand CP was calculated for the neuronal data contained in a 100-msec time window that was advanced in time in steps of 10 msec.
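The CP computation can be sketched as the area under the ROC curve comparing responses on preferred-choice and nonpreferred-choice trials. The sketch below is a simplification: it uses plain within-condition z-scoring and does not implement the Kang and Maunsell (2012) correction for unbalanced samples. Function names and the data layout are assumptions.

```python
import random
import statistics


def roc_area(pref, nonpref):
    """Probability that a preferred-choice response exceeds a nonpreferred-choice one."""
    wins = 0.0
    for a in pref:
        for b in nonpref:
            wins += 1.0 if a > b else 0.5 if a == b else 0.0
    return wins / (len(pref) * len(nonpref))


def zscore(values):
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def grand_cp(conditions, min_trials=5):
    """conditions: list of (rates, choices); choices is a list of 'pref' / 'nonpref'."""
    pooled_pref, pooled_non = [], []
    for rates, choices in conditions:
        # Skip conditions with too few trials for either choice.
        if min(choices.count("pref"), choices.count("nonpref")) < min_trials:
            continue
        for z, ch in zip(zscore(rates), choices):
            (pooled_pref if ch == "pref" else pooled_non).append(z)
    return roc_area(pooled_pref, pooled_non)


def permutation_p(pref, nonpref, n_perm=1000, rng=None):
    """Two-sided permutation test of the CP against 0.5 by shuffling choice labels."""
    rng = rng or random.Random(0)
    observed = abs(roc_area(pref, nonpref) - 0.5)
    pooled = list(pref) + list(nonpref)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        cp = roc_area(pooled[:len(pref)], pooled[len(pref):])
        hits += abs(cp - 0.5) >= observed
    return hits / n_perm
```

The sliding-window time course would apply `grand_cp` to spike rates counted in successive 100-msec windows advanced in 10-msec steps.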
The CPs measured during the TOAS task were computed similarly. For calculation of the preferred saccade direction, we used all trials with a TOA of 50 msec for which RTs exceeded 180 msec. The preferred saccade direction was defined as the saccade direction with the highest firing rate in the time interval starting 50 msec before saccade onset and ending 100 msec after saccade onset, that is, including both pre- and postsaccadic activity. This time interval largely avoids the influence of the particular order of choice target presentations because, at the start of the interval, both targets had already been present on the screen for at least 80 msec. Grand CPs were computed on the trials from each condition with a TOA of less than 50 msec (with ≥5 trials per saccade direction and an RT of ≥150 msec).
The time course of the correlation between the spike rate and RT was estimated as follows: neuronal responses were first smoothed by convolving each trial's spike train with a Gaussian filter (σ = 25 msec). The smoothed spike trains and RTs were next z-scored within each stimulus condition. Pearson's product–moment correlation (r) between the z-scored RTs and the z-scored smoothed neuronal responses was then calculated across stimulus conditions as a function of time for preferred and nonpreferred choices separately and averaged across sites. For better comparison with the CP results, spike–RT correlations were calculated on all stereo coherences except 100% stereo coherence.
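A minimal sketch of this spike–RT analysis is given below, assuming spike trains binned at 1 msec and zero padding at the edges of each trial; these details, the kernel truncation at ±4σ, and all function names are assumptions rather than the authors' implementation. The function would be applied separately to preferred-choice and nonpreferred-choice trials.

```python
import math
import statistics


def gaussian_kernel(sigma_ms=25.0, half_width_sigmas=4):
    """Unit-area Gaussian kernel sampled at 1-msec resolution."""
    n = int(half_width_sigmas * sigma_ms)
    k = [math.exp(-0.5 * (t / sigma_ms) ** 2) for t in range(-n, n + 1)]
    total = sum(k)
    return [v / total for v in k]


def smooth(spike_train, kernel):
    """Convolve a 1-msec binned spike train with a symmetric kernel (zero padded)."""
    half = len(kernel) // 2
    out = []
    for t in range(len(spike_train)):
        acc = 0.0
        for j, w in enumerate(kernel):
            s = t + j - half
            if 0 <= s < len(spike_train):
                acc += w * spike_train[s]
        out.append(acc)
    return out


def zscore(values):
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def pearson_r(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else float("nan")


def spike_rt_correlation(conditions, t_ms):
    """conditions: list of (smoothed_trials, rts); r at time t_ms pooled across conditions."""
    z_rates, z_rts = [], []
    for trials, rts in conditions:
        z_rates += zscore([trial[t_ms] for trial in trials])  # z-score within condition
        z_rts += zscore(rts)
    return pearson_r(z_rates, z_rts)
```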
All analyses related to the 3-D structure categorization task were based on correct and incorrect trials, that is, all convex and concave responses.
We trained two rhesus monkeys (M1 and M2) to categorize 3-D structures as either convex or concave. Static random-dot stereograms depicted either a convex or concave surface, which was presented at one of three positions in depth, that is, in front of (Near), behind (Far), or within (Fix) the fixation plane. This procedure requires the subject to rely on depth variations within the stimulus (i.e., disparity gradients or curvature) rather than using perceptual strategies that are based on position-in-depth information (i.e., “near” or “far” decisions; see Verhoef et al., 2010, 2012). Task difficulty was manipulated by varying the percentage of dots defining the 3-D surface, denoted as the percent stereo coherence. The monkeys were allowed to indicate their choice at any time after stimulus onset by means of a saccade to one of two choice targets (Figure 1A). In addition to examining choice behavior, this procedure allowed us to measure RTs (see Methods) and delimit the perceptual-decision process more precisely in time.
Clustering of 3-D Shape-selective Neurons in AIP
Previous studies have shown that AIP neurons are 3-D structure selective (Theys, Srivastava, et al., 2012; Srivastava et al., 2009). However, it is still unknown whether neurons with similar 3-D structure preferences cluster together in AIP. We examined this issue by assessing the 3-D structure preference of MUA at regularly spaced intervals (steps of ∼100–150 μm) along the cortex. Because the recording chamber was positioned obliquely above the IPS, the electrode penetrated the cortex of the lateral bank of the IPS approximately orthogonal to the surface (Figure 1C, D). At each cortical position, we determined the 3-D structure selectivity of the MUA using a passive fixation task in which the monkey viewed 100% stereocoherent convex or concave stimuli positioned at one of three positions in depth. In different trials, stimuli were positioned either behind, within, or in front of the fixation plane. Figure 2 shows that neurons possessing similar 3-D structure preferences, that is, convex or concave, cluster together with an observed maximum extent (along the electrode axis) of 1100 μm and an average extent of 336 μm (SEM = 24 μm) and 537 μm (SEM = 39 μm) for monkeys M1 and M2, respectively. We observed a total of 7 convex- and 9 concave-preferring neuronal clusters in monkey M1 and 37 convex- and 7 concave-preferring neuronal clusters in monkey M2. It should be noted that these cluster size estimates are most likely biased due to cortical instabilities (i.e., gradual rise of the cortex after electrode penetration), adhesion of the cortex to the electrode, and time constraints (i.e., we did not sample the entire extent of the lateral bank of the IPS within a single penetration). Although our experiment was not designed to examine the exact spatial extent of such clustering, these data nonetheless show that neurons with similar 3-D structure preferences are spatially organized in AIP to a degree apparently similar to that seen in IT (Verhoef et al., 2012).
Once we encountered a 3-D structure-selective neuronal cluster, we positioned the electrode at the estimated center of that cluster and verified the 3-D structure preference once more (p < .05, ANOVA) before starting the categorization experiment (in 45 of 60 clusters).
Neurometric Performance in AIP
We first assessed whether the neural activity in AIP reflects psychophysical performance during the 3-D structure categorization task. We examined this question by constructing a population neurometric function by means of a multivariate linear classifier (SVM; see Methods). These neurometric functions were computed on the MUA recorded during the trial interval from 100 msec after stimulus onset (to account for the average selectivity latency) until the RT of the monkey (see Methods). Specifically, we tested how well this classifier could discriminate between convex and concave stimuli at different stereo coherences using single trial data from our pseudopopulation of AIP MUA (n = 45). We also noted that each monkey performed worst on trials when stimuli were presented Near, followed by trials with Far stimuli, with best performance for stimuli presented at the fixation plane (Figure 3A; p < .01 for each monkey for all pairwise comparisons of psychophysical thresholds at different positions in depth; see also Verhoef et al., 2012). We therefore calculated a population neurometric function for stimuli at each position in depth to see whether this position-in-depth dependency of the behavioral performance was mirrored in the performance of the population of AIP neurons. As can be seen in Figure 3B, the population neurometric functions resembled the psychometric functions. The population neurometric threshold for Far (21%), Fixation (18%), and Near (35%) stimuli approximated the average psychophysical threshold for Far (15%), Fixation (12%), and Near (31%) stimuli. The population neurometric thresholds for Far and Fixation stimuli did not differ significantly from each other (p = .13), but both differed significantly from the neurometric threshold for Near stimuli (p < .002, bootstrap test).
It should be noted that the similarity between the exact values of the neurometric and psychometric thresholds could be a coincidence caused by the number of MUA sites in the pseudo population. What is of interest is the corresponding order of neurometric and psychometric performance across different positions in depth. We found qualitatively similar results when computing the neurometric curves based on ROC analysis on the distribution of spike counts for convex and concave stimuli associated with each stereo coherence and position in depth.
The position-in-depth dependency of the population neurometric threshold was not a consequence of small differences in the time intervals used to extract MUA for neurometric estimations because very similar results were obtained when identical time intervals were used for each position in depth (i.e., the mean time interval across positions in depth for each stereo coherence and site). This is further illustrated in Figure 3C, which shows the temporal evolution of the population neurometric performance for each position in depth. Here the slope of the neurometric function at the population neurometric threshold is plotted as a function of time (window size = 100 msec, step size = 10 msec). This plot shows that the average population neurometric performance for each position in depth increased from baseline ∼100 msec after stimulus onset. Immediately after, the neurometric performances evolved differently but preserved their ordinal relation (FIX > FAR > NEAR) over time.
About 120 msec after stimulus onset, small vergence differences in eye position arose between the Near position relative to the Fix position in depth (M1: 0.008° less divergence; M2: 0.002° more convergence; calculated in the interval [100 msec, RT]) and the Far position relative to the Fix position in depth (M1: 0.012° more divergence; M2: 0.031° more divergence). It seems unlikely that such tiny vergence differences can explain the observed differences in neurometric performance across positions in depth, particularly because vergence on average decreased depth differences with the Fix position in depth (e.g., because of relatively more divergence for Far stimuli and more convergence for Near stimuli), which should lead to a more similar neurometric performance across positions in depth. Furthermore, vergence probably only influenced the late part of the AIP responses, starting ∼200 msec after stimulus onset (= vergence latency + response latency), and differences were observed earlier on. Finally, we repeated the neurometric analysis after matching for mean vergence eye movements between positions in depth and found similar results. Specifically, we eliminated trials with relatively extreme vergence eye movements until the mean vergence difference in either pairwise position-in-depth comparison was less than 0.01°. We did this for each stereo coherence separately. Vergence matching resulted in qualitatively similar findings, with best performance for the Fix position in depth (threshold: 22% stereo coherence), worse for the Far position in depth (threshold: 29%), and worst for the Near position in depth (threshold: 48%). Thus vergence eye movements could not explain the difference in neurometric performance across positions in depth.
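The vergence-matching step can be illustrated for one pair of positions in depth. The text does not specify how "relatively extreme" trials were selected; the greedy rule below (drop whichever extreme value lies farthest from the midpoint of the two means) is an assumption, as are the function name and the minimum trial count.

```python
import statistics


def match_mean_vergence(a, b, tol_deg=0.01, min_trials=2):
    """Greedily drop extreme trials until the two mean vergences differ by < tol_deg.

    a, b: per-trial mean vergence (deg) for two positions in depth.
    Returns the retained values; if too few trials remain, the lists may be unmatched.
    """
    a, b = sorted(a), sorted(b)
    while len(a) > min_trials and len(b) > min_trials:
        mean_a, mean_b = statistics.mean(a), statistics.mean(b)
        diff = mean_a - mean_b
        if abs(diff) < tol_deg:
            break
        hi, lo = (a, b) if diff > 0 else (b, a)
        mid = (mean_a + mean_b) / 2.0
        # Remove whichever extreme lies farther from the midpoint of the two means.
        if hi[-1] - mid >= mid - lo[0]:
            hi.pop()       # largest value in the group with the higher mean
        else:
            lo.pop(0)      # smallest value in the group with the lower mean
    return a, b
```

In the analysis described above, such a step would be run for each stereo coherence separately and for each pairwise comparison of positions in depth.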
These findings reveal that 3-D structure information in AIP is not evenly distributed across stimuli at different positions in depth, and this parallels psychophysical performance.
Neuronal Activity in AIP Correlates with Behavioral Choice and RT
We next inspected the trial-by-trial correlation between the multiunit spiking activity (MUA) in AIP and choice behavior during 3-D structure categorization. For this purpose, we computed CPs (see Methods) calculated using the spikes within the time interval starting 100 msec after stimulus onset and ending when the monkey initiated a saccade. CPs were similar across stereo coherences and positions in depth (p > .05 for each monkey, ANOVA), so we combined the data from different stimulus conditions to obtain a single grand CP for each 3-D structure-selective site (Methods). Both monkeys indicated their choices using saccadic eye movements to the left or right choice target (Figure 1A). In addition to this operant rule, we also trained monkey M2 to perform the task with choice targets located above or below the fixation point (see below for the rationale for this procedure). The positioning of the choice targets, however, had no significant influence on the magnitude of the grand CP (p > .05; mean CP for left–right = 0.52, p = .03 when compared with .5, n = 22; mean CP for up–down = 0.55, p = .04, n = 11; permutation t test), so we combined these data for further analyses. We observed an average grand CP of 0.57 for monkey M1 (p < .0001, n = 12; t test) and 0.53 for monkey M2 (p = .005, n = 33).
Next, we examined the time course of the CP. Figure 4A, B shows the time course of the average grand CP relative to stimulus onset (100 msec window, 10 msec step). For either monkey, the CP began to rise ∼100 msec after stimulus onset, which corresponds to the median latency at which AIP neurons become 3-D structure-selective for the stimuli used in this task (Verhoef et al., 2010). Both monkeys' CPs peaked (M1: 0.6; M2: 0.55) ∼200 msec after stimulus onset. Figure 4C, D shows the time course of the average grand CP relative to the onset of the response saccade. These time courses differed between the two monkeys, possibly due to dissimilar RTs (M1: median RT = 259 msec; M2: median RT = 309 msec).
Previous studies have shown that perceptually more sensitive neurons tend to have larger CPs (e.g., Price & Born, 2010; Gu, DeAngelis, & Angelaki, 2007; Purushothaman & Bradley, 2005; Britten et al., 1996; Celebrini & Newsome, 1994). To examine this issue in the present context, we calculated neurometric thresholds by fitting a cumulative Weibull function to the neurometric performance, that is, the area under the ROC curve, across stereo coherences (Celebrini & Newsome, 1994). In agreement with earlier studies, a negative correlation was observed between the magnitude of the CP and the neurometric threshold of a given MUA site (M1: r = −.64, p = .025; M2: r = −.31, p = .079; Fisher Z test), although this correlation was significant in only one subject. These findings indicate that the neuronal activity at more sensitive sites tends to have a stronger relationship with behavioral choice.
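For readers who wish to check the reported statistics, the sketch below assumes that "Fisher Z test" refers to the standard test of a correlation against zero based on the Fisher z-transform, z = atanh(r)·√(n − 3), with n taken as the number of MUA sites (12 for M1, 33 for M2). Both assumptions are ours; with the rounded r values given above, this test yields p values close to those reported.

```python
import math


def fisher_z_test(r, n):
    """Two-sided p-value for H0: rho = 0 using the Fisher z-transform of r."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Example with the rounded values from the text (M2: r = -.31, assumed n = 33).
z_m2, p_m2 = fisher_z_test(-0.31, 33)
```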
We verified whether differences in vergence eye movements could have influenced the observed CPs but found no significant difference in the mean vergence eye movements between preferred and nonpreferred choices (p > .05, permutation test).
We also wanted to ensure that the CPs were not an artifact of neuronal activity related to saccade planning. Such an explanation seems unlikely, however, as the CPs arose shortly after stimulus onset (∼100 msec), at a point in time where perceptual decisions most likely had yet to be formed, and where it seems implausible that monkeys were already planning a saccade. Furthermore, we observed similar CPs in convex- and concave-selective sites (concave sites: M1: 0.59 (p = .003; n = 7) and M2: 0.55 (n = 5); convex sites: M1: 0.58 (n = 4) and M2: 0.53 (p = .03, n = 29); permutation test). This implies that if planned saccades are to explain the CPs, the 3-D structure preference of a site, as measured during a fixation task, must correlate with the saccade direction preference of a site. Moreover, given the correlation between the CP and the neuronal sensitivity of a site (see above), this relationship should have been more pronounced for the most sensitive neurons.
These issues were further examined by inspecting AIP's perisaccadic activity. We measured MUA in AIP while monkeys performed a TOAS task (see Methods) wherein the animal was instructed to detect which of two targets, presented with a variable time delay between them, appeared first, by making a saccade to that target (Figure 1B). All recordings during the TOAS task were made in grid positions previously associated with 3-D structure-selective AIP sites. Four of 11 (M1) and 17 of 22 (M2) TOAS recordings were made during the 3-D structure categorization recording sessions. We first measured the time course of the perisaccadic activity by separating trials according to the monkey's saccade direction and the preferred saccade direction of a MUA site and calculated a CP (see Methods). Figure 5A, B shows the time course of the average saccade CP aligned to saccade onset (M1: n = 11; M2: n = 22). In contrast to the categorization CPs reported above (Figure 4C, D), saccadic CPs reach their maximum after the onset of the saccade. The plots also show that the perisaccadic activity first started to differentiate between preferred and nonpreferred saccade directions about 50 and 120 msec before saccade onset in monkey M1 and M2, respectively. Assuming a similar saccadic time course during the categorization task, we recalculated the 3-D shape categorization CPs using the time interval from 100 msec after stimulus onset until 50 msec (M1) or 120 msec (M2) before the monkey's RT. Even using these short time intervals, we still observed a significant 3-D shape categorization CP of 0.52 in monkey M1 (p = .005) and 0.53 in monkey M2 (p < .001, permutation t test). Finally, in 17 sites of monkey M2, we were able to determine the saccade direction preference (based on the TOAS task) in addition to the 3-D structure preference. We found no significant relationship between a site's preferred saccade direction and its 3-D structure preference (p = .32; Fisher exact test).
Hence, it seems unlikely that perisaccadic activity could give rise to the average CPs obtained during 3-D shape categorization.
Next, we explored whether the neuronal activity in AIP correlates with the RTs of the monkeys during the categorization task. Figure 6A shows the temporal evolution of the average correlation between spike rate and RT for preferred and nonpreferred choices separately. We noticed that about 100 msec after stimulus onset, around the time of onset for the average CP, the spike–RT correlations for preferred and nonpreferred choices began to diverge, with negative correlations for preferred choices and positive correlations for nonpreferred choices. The two correlations differed significantly from one another in the time interval from 100 to 250 msec after stimulus onset (p < .002, Bonferroni-corrected for two a priori chosen intervals based on the CP time courses; permutation test). This finding shows that increased firing rates were associated with faster preferred choices, as might be expected if the neuronal activity in AIP accumulates until a decision boundary is crossed, triggering a response (Purcell et al., 2010; Ditterich, Mazurek, & Shadlen, 2003; Cook & Maunsell, 2002). The positive correlation between firing rate and RT for nonpreferred choices, on the other hand, suggests that a competition exists between the activity of convex- and concave-selective neurons at decision-related brain areas: Stronger activity, in favor of the preferred choice of a MUA site, delays the decision process until the winning nonpreferred population provides the outcome of the perceptual decision (i.e., convex or concave). Specifically, for trials with nonpreferred choices, the high firing rate of a selective site was apparently insufficient to win the competition (perhaps because other neurons in the population fired less and/or those in the other population fired more) but was sufficient to impede the perceptual decision.
Interestingly, we found that sites with a greater correlation between spike rate and behavioral choice, that is, sites with larger CPs, display a more negative correlation between spike rate and RT for preferred choices (100–250 msec window; Figure 6B; r(CP, spike–RT correlation) = −.45, p = .002; M1: r(CP, spike–RT correlation) = −.7, p = .01; M2: r(CP, spike–RT correlation) = −.38, p = .02; Fisher Z test). Similar effects were observed for convex- and concave-selective sites (convex: r(CP, spike–RT correlation) = −.4, p < .01; concave: r(CP, spike–RT correlation) = −.77, p < .02). Although sites with greater CPs tended to show larger correlations between spike rate and RT for nonpreferred choices, these effects were not significant (Figure 6C; M1: r(CP, spike–RT correlation) = .42, p = .18; M2: r(CP, spike–RT correlation) = .27, p = .13). Hence, amplified AIP activity increased the probability of a fast preferred choice.
This study examined the relationship between neuronal activity in cortical area AIP and perceptual behavior during 3-D structure categorization. While recording in clusters of AIP neurons with a similar 3-D shape preference, we observed that the position-in-depth dependency of the monkeys' 3-D shape discrimination performance was reflected in the activity of AIP neurons. We further observed a correlation between each animal's neuronal activity and subsequent choices and RTs while categorizing 3-D shapes.
Task performance is often not completely invariant across different stimulus conditions. These different levels of behavioral precision in distinct task conditions can be mirrored in the performance of some population(s) of neurons (Gu, Fetsch, Adeyemo, DeAngelis, & Angelaki, 2010). Here we show that differences in the behavioral performance for stimuli at different positions in depth were paralleled by similar differences in the performance of AIP spiking activity. Hence, AIP MUA can at least qualitatively account for the precision of 3-D structure perception of the monkeys in our task.
In Verhoef et al. (2010), we observed relatively late MUA-CPs in AIP during a 3-D structure categorization task that employed the same stimuli as those of this study but in which stimuli were presented for a fixed duration of 800 msec before the monkeys' response saccade. In fact, for monkey M1, which also participated in that study (as monkey B2), a CP was observed that arose ∼750 msec after stimulus onset and peaked after stimulus offset. A possible explanation for this difference in findings is that the monkeys used a different task strategy in the RT task. For example, to form rapid yet accurate decisions about 3-D structures, monkeys may have combined 3-D structure information from both IT and AIP (and other areas) in the RT task. Furthermore, in some cases, CPs may be easier to detect during an RT task, because this task increases the proportion of the viewing interval that is used for deciding and might reduce variability in the onset of the decision period. In addition, the CPs presented above were always measured in the centers of clusters of AIP neurons having a similar 3-D structure preference. In contrast, the CPs in Verhoef et al. (2010) were always measured at the first cortical position where 3-D structure-selective MUA was encountered, which was at best near the border of a cluster. A recent study by Nienborg and Cumming (2014) suggests that choice-related activity is more easily discovered when the brain region examined contains neurons that cluster according to their preference for the relevant task feature. The reasoning behind this statement goes as follows: Several studies have shown that neurons with similar stimulus preferences also exhibit higher noise correlations, that is, stronger correlations between the trial-by-trial fluctuations in the responses of these neurons (Cohen & Maunsell, 2009; Kohn & Smith, 2005; Bair, Zohary, & Newsome, 2001; Zohary, Shadlen, & Newsome, 1994).
Moreover, it has been suggested that noise correlations contribute significantly to CPs because correlated noise does not average out and therefore drives decisions that are based on a weighted sum of neuronal activity within a population (Haefner, Gerwinn, Macke, & Bethge, 2013; Nienborg, Cohen, & Cumming, 2012; Shadlen, Britten, Newsome, & Movshon, 1996). However, a decision based on the difference in average activity between neuronal populations would eliminate activity that is shared between populations due to the subtraction, thereby preventing it from influencing the decision. In this view, shared activity that is common between neuronal pools with different stimulus preferences may lower the CP, whereas correlated activity that is specific to neurons with similar stimulus preferences increases the CP (Cohen & Newsome, 2009). In clusters of neurons with similar stimulus preferences, the pool-specific correlated activity is expected to explain a larger proportion of the response variance, thereby causing CPs to be amplified. Hence, measuring CPs in clusters of neurons possessing similar stimulus preferences may be more sensitive.
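The reasoning above can be illustrated with a toy simulation. The sketch below assumes two pools of model neurons with Gaussian noise, a decision based on the difference between the summed pool activities, and a CP computed as an ROC area for one neuron in the first pool. All parameter values, names, and the noise model are hypothetical choices for illustration. They are not fitted to the recordings and are not a model proposed in this study.

```python
import random

def roc_area(pref, nonpref):
    """ROC area: P(preferred-choice response > nonpreferred-choice response)."""
    wins = 0.0
    for x in pref:
        for y in nonpref:
            wins += 1.0 if x > y else (0.5 if x == y else 0.0)
    return wins / (len(pref) * len(nonpref))

def simulate_cp(pool_sd, shared_sd, n_trials=1500, n_neurons=10, seed=1):
    """CP of one pool-A neuron when the choice is the sign of (sum A - sum B).
    pool_sd: SD of noise common to neurons within one pool only.
    shared_sd: SD of noise common to both pools (cancels in the difference)."""
    rng = random.Random(seed)
    pref, nonpref = [], []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, shared_sd)
        noise_a = rng.gauss(0.0, pool_sd)
        noise_b = rng.gauss(0.0, pool_sd)
        # Each neuron also carries independent private noise with unit SD.
        pool_a = [shared + noise_a + rng.gauss(0.0, 1.0) for _ in range(n_neurons)]
        pool_b = [shared + noise_b + rng.gauss(0.0, 1.0) for _ in range(n_neurons)]
        chose_a = sum(pool_a) > sum(pool_b)
        (pref if chose_a else nonpref).append(pool_a[0])
    return roc_area(pref, nonpref)

cp_independent = simulate_cp(pool_sd=0.0, shared_sd=0.0)
cp_pool_specific = simulate_cp(pool_sd=1.0, shared_sd=0.0)
cp_with_shared = simulate_cp(pool_sd=1.0, shared_sd=2.0)
print(cp_independent, cp_pool_specific, cp_with_shared)
```

In this toy model, noise that is specific to a pool survives the subtraction and raises the CP of its member neurons. Noise shared by both pools cancels in the decision variable but still adds variance to each neuron's response, which lowers the CP.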
Spike–RT correlations have been observed in other dorsal stream areas (e.g., MT, MST, VIP) using tasks that involved judgments about stimulus motion (Price & Born, 2010; Cohen & Newsome, 2009; Cook & Maunsell, 2002). We observed positive spike–RT correlations for nonpreferred choices and negative spike–RT correlations for preferred choices. The difference in spike–RT correlations between preferred and nonpreferred choices was most pronounced for sites with larger CPs. These findings agree with models that explain the formation of perceptual decisions based on weighted evidence originating from opponent neuronal populations, in our case one with convex- and another with concave-selective neurons, that directly or indirectly influence each other's input into the decision stage via, for example, lateral or feed-forward inhibition (Ditterich et al., 2003). In a previous study in which microstimulation was applied to 3-D structure-selective neuronal clusters in IT, the pattern of microstimulation effects on the RTs of the monkeys provided additional evidence that such a mechanism underlies 3-D structure categorization (Verhoef et al., 2012). Hence, converging evidence from IT and AIP indicates that 3-D structure categorization relies on the activity of opponent pools of neurons. In future work, we will determine whether the behavioral effects from microstimulation in AIP corroborate this idea.
Hitherto, we have examined two areas, namely, IT (Verhoef et al., 2010, 2012) and AIP, for their relationship with 3-D structure categorization. It is still unclear to what extent each area provides 3-D structure information independently of the other. We have previously shown that the activity in IT and AIP synchronizes during 3-D structure categorization, suggesting that the two areas interact with each other during task performance (Verhoef, Vogels, & Janssen, 2011). Future studies need to resolve whether this functional interaction facilitates 3-D structure categorization.
Our findings add to the growing evidence that activity in dorsal stream areas correlates with performance during more traditional perceptual tasks such as motion detection (Cook & Maunsell, 2002), direction of heading discrimination (Gu et al., 2010), motion categorization (Swaminathan & Freedman, 2012), and coarse depth discrimination (Uka & DeAngelis, 2004). Here we show that the activity in dorsal stream area AIP relates to 3-D structure categorization behavior in several ways, suggesting a role for AIP in 3-D structure perception.
We thank Inez Puttemans, Piet Kayenbergh, Gerrit Meulemans, Stijn Verstraeten, Marjan Docx, Wouter Depuydt, and Marc De Paep for assistance. We thank Steve Raiguel for comments on an earlier version of this paper. This research was supported by Fonds voor Wetenschappelijk Onderzoek Vlaanderen (G.0713.09 and G.0A22.13N), Programmafinanciering PFV10/008, Geconcerteerde Onderzoeksacties GOA 10/19, and Interuniversitaire Attractiepolen IUAP P6/29. Bram-Ernst Verhoef is a postdoctoral research fellow of the Flemish fund for scientific research (FWO).
Reprint requests should be sent to Peter Janssen, Lab Neuro- en Psychofysiologie, KU Leuven, Herestraat 49, Bus 1021, B-3000 Leuven, Belgium, or via e-mail: Peter.Janssen@med.kuleuven.be.