Practicing simple visual detection and discrimination tasks improves performance, a signature of adult brain plasticity. The neural mechanisms that underlie these changes in performance are still unclear. Previously, we reported that practice in discriminating the orientation of noisy gratings (coarse orientation discrimination) increased the ability of single neurons in the early visual area V4 to discriminate the trained stimuli. Here, we ask whether practice in this task also changes the stimulus tuning properties of later visual cortical areas, despite the use of simple grating stimuli. To identify candidate areas, we used fMRI to map activations to noisy gratings in trained rhesus monkeys, revealing a region in the posterior inferior temporal (PIT) cortex. Subsequent single unit recordings in PIT showed that the degree of orientation selectivity was similar to that of area V4 and that the PIT neurons discriminated the trained orientations better than the untrained orientations. Unlike in previous single unit studies of perceptual learning in early visual cortex, more PIT neurons preferred trained compared with untrained orientations. The effects of training on the responses to the grating stimuli were also present when the animals were performing a difficult orthogonal task in which the grating stimuli were task-irrelevant, suggesting that the training effect does not need attention to be expressed. The PIT neurons could support orientation discrimination at low signal-to-noise levels. These findings suggest that extensive practice in discriminating simple grating stimuli not only affects early visual cortex but also changes the stimulus tuning of a late visual cortical area.
Practice in sensory detection and discrimination tasks improves task performance (Gibson, 1963). Although such perceptual learning effects in the visual system have been studied extensively at the behavioral level (Aberg & Herzog, 2012; Sagi, 2011; Fine & Jacobs, 2002), the underlying neural mechanisms are still unclear. Initial studies showed changes in the tuning of macaque V1 and V4 neurons after extensive training in a fine orientation discrimination task (Raiguel, Vogels, Mysore, & Orban, 2006; Yang & Maunsell, 2004; Schoups, Vogels, Qian, & Orban, 2001) with smaller and less consistent effects across studies in V1 (Ghose, Yang, & Maunsell, 2002; Schoups et al., 2001). However, studies in dorsal stream areas, middle temporal and medial superior temporal, showed no perceptual learning effects on neural tuning or response strength during direction (Law & Gold, 2008), heading (Gu et al., 2011), or depth discrimination tasks (Uka, Sasaki, & Kumano, 2012). In these areas, the correlation between behavioral choices and neural responses increased during early task learning for depth discrimination (Uka et al., 2012) and during direction discrimination learning (Law & Gold, 2008), which may suggest that the learning to discriminate involves a reweighting of the stable visual cortical signals that are used to form the perceptual decision (Law & Gold, 2009). This idea is supported by evidence of perceptual learning-induced changes in decision-related responses of lateral intraparietal neurons to the choice targets in the motion direction task (Law & Gold, 2008).
We recently reported that the response properties of macaque area V4 change during the course of practicing a coarse orientation discrimination task (Adab & Vogels, 2011). In that task, the animals discriminated two gratings that differed by 90°. Task difficulty was manipulated by adding noise to the grating, that is, by lowering the signal-to-noise ratio (SNR). Behavioral performance at low SNRs increased during the course of training, which was accompanied by an improvement in the ability of V4 neurons to discriminate the gratings. However, V4 is only one of many visual areas that might show perceptual learning-related changes in the representation of the simple discriminanda in this task. Indeed, areas downstream from V4 may show other or more pronounced changes in their stimulus representations by virtue of the connections between V4 and such areas or by inherent plasticity of these later areas. These potential changes in the tuning properties of later visual areas can contribute to changes in performance during perceptual learning and thus should be taken into account in models of perceptual learning. Thus, here we asked whether and how the representations of the trained noisy grating stimuli in late visual cortical areas were changed by perceptual learning. To answer this question, we first used monkey fMRI to identify candidate areas that responded to low SNR stimuli. The fMRI data yielded a posterior inferior temporal (PIT) cortical region that was activated by the low SNR gratings in the trained monkeys. We subsequently recorded the responses of single neurons to trained and untrained orientations in this fMRI-defined PIT region, assessing whether their response properties were affected by practicing orientation discrimination.
The two rhesus monkeys (Macaca mulatta, both male) of our V4 learning study (Adab & Vogels, 2011) served as subjects. After that study, the animals continued to practice the coarse orientation discrimination task at a fixed stimulus location of 3° eccentricity (lower visual field, 225° polar angle). In addition, they received training in the color discrimination task. Before the fMRI mapping study, both animals were trained to fixate for long durations in a mock fMRI setup. During the fixation training, the monkeys were exposed to natural images that differed from those used in the fMRI mapping. Animal care and experimental procedures were approved by the ethical committee of the KU Leuven Medical School.
Noisy Grating Stimuli
The gamma-corrected grating stimuli and display were the same as in the V4 study (Adab & Vogels, 2011). Circular patches (2° diameter) containing a 100% Michelson contrast sinusoidal grating (2 cycles/degree) were spatially masked by noise and then superimposed on a noise background that filled the display. The SNR was manipulated by random replacement of the grating pixels by noise. The noise of the background and stimuli patches was refreshed on every trial in the single unit recording tasks. 0% SNR patches were detectable at stimulus onset, which aimed to reduce spatial uncertainty. The noise of both the stimulus and the background was generated from the same sinusoidal luminance distribution. The trained orientations were 22.5° and 112.5° in monkey M and 67.5° and 157.5° in monkey P.
fMRI Methods, Design, and Data Analysis
Functional scans were obtained while the monkeys were fixating a small red target (0.14° wide). During scanning, the monkeys sat in a sphinx position with their heads fixed in an MR-compatible chair at a distance of approximately 57 cm from a screen. The gamma-corrected stimuli were projected onto the screen. Eye position was continuously monitored (120 Hz; Iscan, Burlington, MA) during scanning. The monkey received a juice reward for maintaining fixation within a square window of 2° × 2°.
Before scanning, the contrast agent monocrystalline iron oxide nanoparticle (MION; Feraheme, AMAG Pharmaceuticals, Inc., Lexington, MA, 8–11 mg/kg) was injected intravenously. The monkeys were scanned on a 3T Siemens Trio scanner following standard procedures (Vanduffel et al., 2001). fMRIs were acquired with a custom-made eight-channel coil (Ekstrom, Roelfsema, Arsenault, Bonmassar, & Vanduffel, 2008) and a gradient-echo single-shot EPI sequence (repetition time = 2 sec, echo time = 17 msec, flip angle = 75°, 80 × 80 matrix, 40 slices, no gap, 1.25 mm isotropic voxel size). Slices were oriented transversally covering the whole brain. High-resolution anatomical MRIs were acquired under ketamine/xylazine anesthesia, using a single radial transmit–receive surface coil and a MPRAGE sequence (repetition time = 2200 msec, echo time = 4.05 msec, flip angle = 13°, 320 × 260 matrix, 208 slices, 0.4 mm isotropic voxel size).
The stimuli were 20% SNR gratings of different orientations (22.5°, 67.5°, 112.5°, and 157.5°) and random dot texture patterns (randomly positioned dots with sizes varying between 0.06° and 0.43°). The data obtained with the random dot pattern are not relevant for the present analysis and will not be described here. A novel noisy grating was used for each presentation. The stimuli were superimposed on a noisy background and presented at the trained or untrained location (3° eccentricity in the left or right lower visual quadrant, respectively) for 300 msec with a variable ISI that averaged 3500 msec (range = 3000–4000 msec). The background noise was varied across runs. The stimuli defined six conditions: trained orientations at the trained and untrained locations, untrained orientations at the trained and untrained locations, and the texture at the trained and untrained locations. In addition to these six conditions, there was a “fixation” condition consisting of the noise background with the same duration as the stimulus presentations. Each run started with the presentation of the background for 10 sec, followed by 99 events (including the “fixation” condition null event; each event lasting 300 msec) and ended with another 14 sec of only the background. The duration of a run was 400 sec. The fixation target was presented continuously throughout the whole run. The seven conditions were presented in a pseudorandom order with the constraint that a particular condition had to be preceded equally often by each condition within a given run (Jastorff, Kourtzi, & Giese, 2009). Forty-nine (7 × 7) events were required to completely counterbalance the sequence. We included 99 events in each run, ensuring complete counterbalancing for all events except the first one of a run. This first event of a given run was selected from the seven conditions with equal probability.
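The counterbalancing constraint can be illustrated with a short sketch. The text does not state which algorithm generated the sequences, so the approach below is an assumption: it builds a random Eulerian circuit over all 7 × 7 ordered condition pairs (repeats of the same condition included) with Hierholzer's algorithm and traverses two such circuits. A run of 99 events contains 98 transitions, that is, each of the 49 ordered pairs exactly twice. The function names are hypothetical.

```python
import random
from collections import Counter

def eulerian_sequence(n_conditions, start, rng):
    """Random Eulerian circuit over all ordered condition pairs (self-transitions included)."""
    # Every condition has one outgoing edge to each condition, in shuffled order.
    successors = {c: rng.sample(range(n_conditions), n_conditions)
                  for c in range(n_conditions)}
    stack, circuit = [start], []
    # Hierholzer's algorithm: follow unused edges, backtrack when none remain.
    while stack:
        node = stack[-1]
        if successors[node]:
            stack.append(successors[node].pop())
        else:
            circuit.append(stack.pop())
    # Reversed pop order is the circuit; it starts and ends at `start`.
    return circuit[::-1]

def counterbalanced_run(n_conditions=7, n_cycles=2, seed=0):
    """Sequence of 1 + n_cycles * n_conditions**2 events (99 for the defaults)."""
    rng = random.Random(seed)
    events = [rng.randrange(n_conditions)]  # first event: equal probability per condition
    for _ in range(n_cycles):
        # Drop the first element of each circuit: it repeats the current last event.
        events += eulerian_sequence(n_conditions, events[-1], rng)[1:]
    return events

run = counterbalanced_run()
pairs = Counter(zip(run, run[1:]))  # (preceding condition, condition) -> count
```

In this sketch every condition is preceded exactly twice by each of the seven conditions, and only the first event has no predecessor, which matches the constraint described above.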
The procedure for processing the data has been described in detail elsewhere (Popivanov, Jastorff, Vanduffel, & Vogels, 2012). The only exception is that our functional data were smoothed using a 1.5-mm full-width half-height Gaussian kernel. Data analysis was performed with SPM5 (Wellcome Department of Cognitive Neurology, UK). All runs in which the monkey was fixating at least 94% of the time were combined in a fixed effects model for each monkey separately in native space. The results were analyzed with a general linear model with five regressors (texture pattern, trained and untrained orientations combined at the trained and untrained locations and the “fixation” condition) plus six additional head-motion regressors per run. Each of these five conditions was modeled by convolving a Gamma function (delta = 0, tau = 8, and exponent = 0.3), modeling the MION hemodynamic response function, at the onset of the condition. We then computed general linear model contrasts between the grating presentations at trained and untrained locations and the “fixation” condition. Additional analyses with regressors for each of the seven conditions (separating trained and untrained orientations) were also conducted.
Single Unit Recordings: Tasks
Passive Fixation Task
Eight oriented gratings (0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135°, and 157.5°) with 80% SNR were presented interleaved during passive fixation (fixation point size: 0.27°) at either trained (for recordings in the trained hemisphere) or untrained locations (for recordings in the untrained hemisphere). Each stimulus was shown for 250 msec, preceded and followed by a fixation period of 500 and 100 msec, respectively. Completed trials were rewarded by a drop of juice. The display was filled during the whole course of a trial with the background noise, which was refreshed on each trial. The mean number of presentations was 19 per orientation. Fixation window size was 1.5° × 1.5° for all tasks.
Color Discrimination Task (Figure 1B)
A colored spot of 1° diameter was presented in the upper ipsilateral visual field at 10.3° eccentricity together with a noisy grating at the contralateral trained (for recordings in the trained hemisphere) or untrained (for recordings in the untrained hemisphere) locations. The grating and the colored spot were presented for 250 msec following a fixation period of 500 msec. After their presentation, the monkey had to continue fixating for another 200 msec. This fixation period was followed by the presentation of two target points, and the animals indicated the color of the spot by saccading toward the corresponding target. Correct responses were rewarded by a drop of juice. The display was filled with the background noise during the whole course of a trial. The color difference was titrated for each monkey. The SNR (10–40%) and orientation (two orthogonal trained and two orthogonal untrained orientations that differed by 45° from the trained) of the grating were independent of the target color. The noise of the gratings and background were refreshed on each trial. The mean number of presentations was 18 per orientation and SNR.
Coarse Orientation Discrimination Task (Figure 1A)
This task is identical to that described elsewhere (Adab & Vogels, 2011). Either one of two trained oriented gratings which could have different SNR levels (0–40%) was presented for 250 msec on top of the noise background following a fixation period of 500 msec. After another 200 msec, the animals had to indicate the orientation by a saccadic eye movement to one of the two presented target points. Correct responses were rewarded with a drop of juice. Orientations and SNRs were presented in random order. The noise of the gratings and the background were refreshed on each trial. The phase of the gratings was randomized across trials.
Receptive Field Mapping
Receptive fields were quantitatively mapped in a subset of the neurons in the trained and untrained hemisphere. Temporally modulated checkerboards (9.5 Hz; stimulus size = 3° × 3°; checker size = 1.5°) were presented for 107 msec in a random order at the 7 × 7 locations (spacing = 3°) of an invisible grid centered on the fixation point during fixation. The receptive fields of the neurons in the recorded part of PIT showed on average the strongest activity in the contralateral lower visual field quadrant with a peak at or close to the trained (neurons from trained hemisphere) or untrained locations (neurons from untrained hemisphere). This position bias is not surprising because we searched for responsive neurons with stimuli at the trained or untrained location.
Single Unit Recording Methods and Data Analysis
Standard electrophysiological recording techniques were employed. Action potentials were recorded with epoxy-coated tungsten electrodes. Subjects' eye movements were monitored using infrared eye tracking (500 Hz; EyeLink, Ontario, Canada). Single units were discriminated on-line with a threshold and time window discriminator and timings of well-isolated single units were saved together with behavioral events for later offline analysis. MR (MPRAGE; resolution = 600 μm3) images of the brain with markers of recording grid positions were acquired before and in between recording sessions for verification of the recordings sites. These anatomical images were coregistered with the fMRI t score images. In the “trained hemisphere” of monkey M, we recorded at nine adjoining guide tube positions (spacing = 1 mm) but 87/123 responsive neurons were from three neighboring guide tube positions (1 mm apart; about 7 mm anterior with respect to the auditory meatus). In the “untrained” hemisphere of monkey M, recordings were from four neighboring guide tube positions, with 41/66 neurons from one guide tube position. In the “trained” hemisphere of monkey P, recordings were from three neighboring guide tube positions with 45/55 neurons from a single guide tube position (6 mm anterior). In order not to bias the data, we pooled all responsive neurons from different recording positions. Responsive neurons were searched during passive fixation with eight oriented gratings.
The preferred orientation is the orientation, out of the eight tested in the passive fixation task, with the greatest net response. The population orientation tuning curves were computed by defining the preferred orientation of each neuron using the odd or the even trials, and then the tuning curve was computed for the other half of the trials. This procedure, in which (i) the preferred orientation and (ii) the responses to the stimulus orientations used to compute the tuning curves are based on independent trials, avoids an overestimation of the peak of the tuning curve. The tuning curves were averaged across neurons, after alignment of the individual tuning curves with respect to the preferred orientation. Following Adab and Vogels (2011), choice probabilities (CPs) were computed by z scoring the responses for each SNR <40% with at least one correct and one incorrect choice for each orientation. The grand CP (Britten, Newsome, Shadlen, Celebrini, & Movshon, 1996) is the area under the receiver operating characteristic curve (AUROC) for the distributions of the z scores, pooled across SNRs and orientations, and sorted according to the animal's choice.
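The grand CP computation can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code of the study: each trial is a hypothetical tuple of (condition, choice, spike count), where a condition is one orientation at one SNR, the choice is coded as 1 or 0, and conditions in which only one choice occurred are skipped. The use of the population standard deviation for z scoring is also an assumption.

```python
from statistics import mean, pstdev

def auroc(pos, neg):
    """Probability that a value from `pos` exceeds a value from `neg` (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def grand_choice_probability(trials):
    """trials: iterable of (condition, choice, spike_count) with choice in {0, 1}."""
    by_condition = {}
    for condition, choice, count in trials:
        by_condition.setdefault(condition, []).append((choice, count))
    z_by_choice = {0: [], 1: []}
    for cond_trials in by_condition.values():
        counts = [count for _, count in cond_trials]
        sd = pstdev(counts)
        # Keep only conditions in which both choices occurred and responses varied.
        if len({choice for choice, _ in cond_trials}) < 2 or sd == 0:
            continue
        mu = mean(counts)
        for choice, count in cond_trials:
            z_by_choice[choice].append((count - mu) / sd)
    # Pool z scores across conditions and compare the two choice distributions.
    return auroc(z_by_choice[1], z_by_choice[0])

example_trials = [
    ("22.5deg/20%", 1, 10), ("22.5deg/20%", 0, 5),
    ("22.5deg/20%", 1, 9), ("22.5deg/20%", 0, 6),
    ("22.5deg/30%", 1, 4), ("22.5deg/30%", 1, 7),  # one choice only: skipped
]
example_cp = grand_choice_probability(example_trials)
```

Z scoring within each condition removes the stimulus-driven differences in response, so the pooled comparison reflects only the covariation between the response and the choice.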
We trained two monkeys extensively in a coarse orientation discrimination task (Figure 1A) at low SNRs (Adab & Vogels, 2011). Using fMRI, we first localized regions that were activated by gratings of 20% SNR. Guided by the fMRI data, we then recorded single unit activity in an inferior temporal (IT) area, PIT, which was activated by these stimuli.
fMRI Mapping of Areas Activated by Low SNR Gratings after Training
With an event-related fMRI design, we sequentially presented four differently oriented gratings (22.5°–157.5°, randomly shown in steps of 45°) of 20% SNR at the trained location and at an untrained location having the same 3° eccentricity. Spatial frequency (2 cycles/degree) and size (2°) were identical to those of the stimuli used in the training phase of the monkeys.
For monkey P, we analyzed 25 runs (700 stimulus presentations per condition) in which the monkey was fixating for at least 94% within a fixation window of 2° × 2°, whereas 65 runs (1820 presentations/condition) passed this fixation criterion in monkey M. To map potential regions that demonstrate perceptual learning-related changes, we took a conservative approach by contrasting the responses to the low signal 20% SNR gratings with the noise background. We took this approach because learning-related changes in the tuning of single neurons may not show up in fMRI activations that are based on the contrast of trained versus untrained orientations and thus can be missed. In fact, contrasting the trained and untrained orientations produced no significant activations (at p < .05; family-wise error [FWE] corrected) in this study.
Contrasting the fMRI response to the noisy gratings (pooled across the four orientations) with the response to the noise background resulted in four activated regions (p < .05 in at least one hemisphere; FWE corrected; t > 4.9): V2/V3, V4, a region in the PIT cortex and in pFC (area 46v). The presence of the V4 recording chamber on the trained hemisphere prevented close positioning of the phased-array receive coil over the trained hemisphere. Thus, activation levels between hemispheres could not be directly compared.
The PIT activation was significant in each of the four hemispheres, either at the FWE-corrected (2/4 hemispheres) or at uncorrected level (p < .001; “trained” hemisphere monkey M and “untrained” hemisphere monkey P). The location of this activation was consistent in both animals, being close to the anterior part of posterior middle temporal sulcus (PMTS; Figure 2), extending somewhat more dorsally toward the ventral bank of the STS in monkey M.
The fMRI activation to the high noise grating stimuli in this PIT region in both trained animals guided the subsequent electrophysiological recordings. We addressed two major questions: (1) to what degree do single neurons in this region show orientation selective responses to gratings and (2) are the response properties of these neurons affected by the coarse orientation discrimination learning?
Single Unit Responses in the fMRI-defined PIT Region: Orientation Selectivity
We made vertical microelectrode penetrations from the lower bank of the STS to the PMTS, covering the PIT region that was defined by the fMRI activation to the 20% SNR gratings irrespective of their orientation. We searched for responsive neurons while the animals were performing a passive fixation task in which gratings of eight different orientations (0°–157.5°, step = 22.5°) with high SNR (80%) were presented on top of the noise background at the trained location. Responsive cells were observed on the lateral convexity of PIT dorsal to and in the PMTS. The range of the depths of the responsive neurons (based on depth readings) was approximately 4.3 mm in monkey P and 5.9 mm in monkey M, with interquartile depth ranges of 1.8 mm and 2.7 mm, respectively. This is much wider than orientation columns in early visual cortex (Tanigawa, Lu, & Roe, 2010), and to the best of our knowledge, there is no evidence for orientation columns in macaque PIT (Vanduffel, Tootell, Schoups, & Orban, 2002). All responsive PIT neurons were pooled in the analyses presented below (n = 178 neurons; monkey P: 55 neurons; monkey M: 123 neurons).
The large majority of the responsive neurons (overall 87%; monkey P: 95%; monkey M: 83%) showed a significant effect of orientation (one-way ANOVA; p < .05) when tested during passive fixation. The degree of orientation selectivity was quantified by the SI (see Methods). The median SI was 0.31 (monkey P: 0.37; monkey M: 0.30; Figure 3A). The orientation tuning of a PIT neuron with an SI of 0.30, which is close to the median value of the population, is shown in Figure 3B, whereas the tuning curves of the neuron populations of each animal are presented in Figure 3C. Note that the population tuning curves were computed by defining the preferred orientation of each neuron on half of the trials and averaging the responses for the other half of the trials, which avoids an overestimation of the peak of the tuning curve. Overall, the degree of orientation selectivity in this PIT region (measured after extensive training in the coarse orientation discrimination task) was similar to that observed in V4 in the same animals at the “late” stage of the training (median SI = 0.35; Adab & Vogels, 2011).
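The split-half procedure for the tuning curves can be sketched as below. The data layout (a dictionary mapping an orientation index 0 to 7 to a list of per-trial net responses) and the function name are hypothetical, and averaging the two split directions (odd trials select, even trials test, and the reverse) is an assumption of this sketch; the text only states that the preferred orientation and the tuning curve were based on independent halves.

```python
from statistics import mean

def split_half_tuning(responses):
    """responses: dict orientation_index (0..n-1) -> list of net responses per trial.

    Returns the tuning curve aligned so that index 0 is the preferred orientation,
    where the preferred orientation is always defined on the held-out half.
    """
    n_ori = len(responses)
    curves = []
    for select, test in ((0, 1), (1, 0)):
        # Mean response per orientation in the selection half and in the test half.
        select_means = {o: mean(r[select::2]) for o, r in responses.items()}
        test_means = {o: mean(r[test::2]) for o, r in responses.items()}
        preferred = max(select_means, key=select_means.get)
        # Read out the test half, circularly shifted to the preferred orientation.
        curves.append([test_means[(preferred + k) % n_ori] for k in range(n_ori)])
    return [mean(values) for values in zip(*curves)]

example_responses = {o: ([10.0] * 4 if o == 3 else [1.0] * 4) for o in range(8)}
example_curve = split_half_tuning(example_responses)
```

If the preferred orientation were instead chosen on the same trials that enter the curve, noise alone would inflate the response at the aligned peak, which is the overestimation that the independent halves avoid.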
Single Unit Responses in the fMRI-defined PIT Region: Effects of Practicing Coarse Orientation Discrimination
First, we assessed whether the coarse orientation training affected the orientation preference of the neurons, that is, whether relatively more neurons preferred the trained orientations. Because the trained orientations differed between the two monkeys, we computed the number of neurons as a function of the smallest absolute difference between the preferred orientation and the trained orientations (|Tr-Pref|; step of 22.5°; 80% SNR gratings). The maximum |Tr-Pref| value is 45° because the two trained orientations differed by 90°. Note that when the distribution of the preferred orientations is uniform, twice as many neurons will have a |Tr-Pref| value of 22.5° compared with 0° or 45°; thus, a peak at 22.5° would be expected in the observed distribution. However, this was clearly not the case in the neural data because the distribution of the number of neurons as a function of |Tr-Pref| differed significantly from that expected from a uniform distribution of preferred orientations in each animal (Figure 4A; chi-square tests; monkey P: p = 3 × 10⁻⁶; monkey M: p = 8 × 10⁻⁶). In both animals, a higher proportion of neurons preferred (one of) the trained orientations compared with all untrained orientations (|Tr-Pref| = 0°; Figure 4A). Note that the probability that this overrepresentation of the trained orientations occurred by chance in both monkeys, which were trained with different orientation pairs, is ¼ × ¼ = 0.0625.
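The expected |Tr-Pref| distribution under uniform orientation preference, and the goodness-of-fit statistic against it, can be made explicit with a short sketch. The function names are hypothetical, the neuron counts in the example are made up for illustration and are not data from the study, and the use of 2 degrees of freedom (three |Tr-Pref| categories minus one) is an assumption about how the chi-square test was set up.

```python
import math
from collections import Counter

ORIENTATIONS = [i * 22.5 for i in range(8)]  # 0 deg to 157.5 deg

def tr_pref(preferred, trained):
    """Smallest absolute difference (orientation is 180 deg periodic) to a trained orientation."""
    def diff(a, b):
        d = abs(a - b) % 180.0
        return min(d, 180.0 - d)
    return min(diff(preferred, t) for t in trained)

def expected_proportions(trained):
    """Proportion of the eight orientations falling in each |Tr-Pref| category."""
    counts = Counter(tr_pref(o, trained) for o in ORIENTATIONS)
    return {k: v / len(ORIENTATIONS) for k, v in sorted(counts.items())}

def chi_square_stat(observed, trained):
    """observed: dict |Tr-Pref| -> number of neurons."""
    n = sum(observed.values())
    expected = expected_proportions(trained)
    return sum((observed.get(k, 0) - n * p) ** 2 / (n * p) for k, p in expected.items())

def chi_square_p_df2(stat):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)

proportions = expected_proportions([22.5, 112.5])        # trained pair of monkey M
hypothetical_counts = {0.0: 50, 22.5: 30, 45.0: 20}      # illustrative only
stat = chi_square_stat(hypothetical_counts, [22.5, 112.5])
p_value = chi_square_p_df2(stat)
```

For either trained pair, two of the eight orientations have |Tr-Pref| = 0°, four have 22.5°, and two have 45°, which gives the 1:2:1 expectation described above.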
For each neuron and orientation, we normalized the net response to that of its preferred orientation and then averaged across neurons per animal, producing population orientation tuning curves (Figure 4B). ANOVA showed a significant effect of Orientation on the mean normalized response in each animal (monkey P: p = .027; monkey M: p < 10⁻⁷). In both monkeys, these population response curves were double peaked with, across monkeys, three of the four peaks at a trained orientation and one peak 22.5° offset from a trained orientation. Because the trained orientations differed between the two monkeys by 45°, the population response curves differed significantly between the two animals (ANOVA; interaction Monkey × Orientation: p = 4 × 10⁻⁵). This demonstrates that the effect of Orientation on the population response was not because of a preference for a particular orientation but, instead, was due to whether or not an orientation was trained.
Next, we determined the discriminability of single PIT neurons for two pairs of orientations, one pair consisting of the two trained orientations (trained pair) and another pair consisting of the two orientations that differed by 45° from the trained orientations (untrained pair). The discriminability was computed as the AUROC (see Methods), which takes into account both the difference in response between the two orientations and the response variability of the neuron. The mean AUROC for the 80% SNR stimuli presented during passive fixation was significantly greater for the trained compared with the untrained pair in each animal (paired t test; monkey P: p = .0073; n = 55; monkey M: p = .0017; n = 123; Figure 5A, B, see SNR 80%). The mean AUROCs for the trained and untrained pairs were 81% and 74%, respectively, when the data of both animals were combined (paired t test; p = 3.6 × 10⁻⁵; n = 178).
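As an illustration of this discriminability measure, the sketch below computes the AUROC for the spike counts of one neuron to two orientations as the probability that a response to one orientation exceeds a response to the other. Reporting the value on a 0.5 to 1 scale regardless of which orientation drives the neuron more strongly is an assumption of this sketch, and the function names and example counts are hypothetical.

```python
def auroc(resp_a, resp_b):
    """Probability that a response in `resp_a` exceeds one in `resp_b` (ties count 0.5)."""
    wins = sum((a > b) + 0.5 * (a == b) for a in resp_a for b in resp_b)
    return wins / (len(resp_a) * len(resp_b))

def pair_discriminability(resp_a, resp_b):
    """AUROC folded onto [0.5, 1], so it does not depend on the order of the pair."""
    value = auroc(resp_a, resp_b)
    return max(value, 1.0 - value)

# Same difference in mean response (1 spike), different trial-to-trial variability.
low_variability = pair_discriminability([5, 5, 5], [4, 4, 4])
high_variability = pair_discriminability([4, 6, 8], [3, 5, 7])
```

The two example pairs differ by the same mean response, yet the less variable pair is perfectly discriminable whereas the more variable pair is not, which is why the measure depends on both response difference and variability.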
The higher discriminability for the trained orientation pair at the single neuron level resulted in a greater classification accuracy for the trained compared with the untrained orientation pairs at the PIT population level. This was demonstrated by training and testing a correlation-based classifier (Meyers, Freedman, Kreiman, Miller, & Poggio, 2008). Training and testing was performed using seven and three randomly sampled trials per orientation, respectively. Spike counts in an analysis window of 50–300 msec for the 80% SNR stimuli served as input. We trained classifiers for samples of neurons with numbers (N) varying between 5 and 20 (step size 5), drawing randomly 2000 times N neurons from the recorded population. The classification accuracy, averaged across the 2000 draws per N, was significantly greater for the trained compared with the untrained orientation pairs for each N (e.g., N = 5: monkey P: 92.4% (±0.3 (SE)) vs. 84.2% (±0.4); monkey M: 86.7% (±0.4) vs. 79.2% (±0.4); N = 20: monkey P: 99.96% (±0.02) vs. 97.8% (±0.2); monkey M: 99.5% (±0.1) vs. 96.4% (±0.2)).
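A minimal sketch of one train/test draw of such a correlation-based classifier is given below. It follows the general idea of template matching by Pearson correlation, but it is not the implementation of Meyers et al. (2008): the data layout, the function names, and the omission of any per-neuron normalization are assumptions. For each orientation, the template is the vector of mean spike counts across neurons in the training trials, and each test vector is assigned to the orientation whose template correlates best with it.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def classify_once(counts, n_train=7, n_test=3, rng=random):
    """counts: list (one dict per neuron) of label -> list of spike counts.

    Returns the fraction of correctly classified test vectors for one random split.
    """
    labels = list(counts[0].keys())
    templates = {lab: [] for lab in labels}
    tests = {lab: [[] for _ in range(n_test)] for lab in labels}
    for neuron in counts:
        for lab in labels:
            # Independent random trials per neuron: a pseudo-population.
            trials = rng.sample(neuron[lab], n_train + n_test)
            templates[lab].append(mean(trials[:n_train]))
            for i, value in enumerate(trials[n_train:]):
                tests[lab][i].append(value)
    correct = 0
    for lab in labels:
        for vector in tests[lab]:
            predicted = max(labels, key=lambda l: pearson(vector, templates[l]))
            correct += predicted == lab
    return correct / (len(labels) * n_test)

example_counts = [
    {"A": [10] * 10, "B": [1] * 10},
    {"A": [1] * 10, "B": [10] * 10},
    {"A": [5] * 10, "B": [2] * 10},
]
example_accuracy = classify_once(example_counts, rng=random.Random(0))
```

To approximate the procedure described above, one would call such a function on N randomly drawn neurons, repeat the draw 2000 times for each N, and average the accuracies separately for the trained and untrained orientation pairs.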
To determine whether the enhanced discriminability for the trained orientations may reflect an attentional effect, we measured the responses of a subset of the neurons to the two trained and untrained orientations for SNRs of 10%, 15%, 20%, and 40% while the animals were performing an orthogonal color discrimination task. This subset of neurons (n = 103) was also tested during passive fixation. All of these neurons, except one, showed a significant response to the 80% SNR gratings during passive fixation. The mean AUROC of this subset of neurons was significantly larger for the trained than for the untrained orientations during passive fixation (mean AUROC: trained orientation pair: 82%; untrained orientation pair: 75%; paired t test; p = .0009; n = 102). In the color discrimination task, we presented a small colored spot in the upper ipsilateral visual field, simultaneously with the grating at the trained location. The animals had to indicate, by means of a saccade to one of two subsequent targets, which of the two colors had been presented (Figure 1B). The color difference was titrated for each individual monkey so that the average color discrimination performance was well below ceiling during the recordings. The average color discrimination performance was 85% correct in each monkey (monkey P: n = 41; monkey M: n = 62). Importantly, there was neither a significant effect of the grating orientation nor a significant interaction between the grating orientation and the SNR on the color discrimination performance in either monkey (ANOVA: main effect of Orientation and interaction of Orientation and SNR in each monkey; p > .60).
In fact, the color discrimination performance was highly similar for trials during which trained (monkey P: 85% correct; monkey M: 85.2%) and untrained orientations (monkey P: 85.3%; monkey M: 85.5%) were presented, indicating that potential differences between the neural responses to the trained and untrained orientations cannot result from orientation-dependent attentional factors.
ROC analysis showed that the PIT neurons discriminated the trained orientation pairs with higher accuracy than the untrained orientation pairs (Figure 5A, B, see SNRs 10–40%) when these were presented during the orthogonal color discrimination task. In each animal, the mean AUROC was significantly larger for the trained compared with the untrained orientations (ANOVA; main effect of Orientation pair: monkey P: p = .02, n = 41 neurons; monkey M: p = .0016, n = 62; both animals combined: p = .0001, n = 103). When combining the two animals, significantly greater discriminability for trained compared with untrained orientation pairs was present at 20% (61% vs. 56%; paired t test: p = .00024) and 40% SNR (76% vs. 66%; p = 7.1 × 10⁻⁷). These data demonstrate that the improved AUROC for trained compared with untrained orientation pairs is also evident when subjects are performing a difficult attention-demanding orthogonal task.
Further analyses showed that this orientation-dependent training effect on the AUROC was primarily driven by the increased difference between the mean responses for the trained orientations because the Fano factor (response variance/mean response) was unaffected by stimulus orientation (mean Fano factor in passive fixation task, pooled across monkeys, for trained orientations: 1.18 and for untrained orientations: 1.19 (paired t test; p = .80)). Because a larger number of neurons preferred one of the trained orientations compared with the untrained orientations (Figure 4A), one expects that on average the differences in response between the two trained orientations (that differ by 90°) will be larger than for the untrained orientations. In other words, the improved discriminability for the trained compared with untrained orientation pairs (Figure 5A, B) could result from the training-induced bias in orientation preference. On the other hand, the mean difference in AUROC between the trained and untrained orientation pairs may not entirely be because of the shift in orientation preference. To examine this, we computed the AUROC for the trained and untrained orientation pairs for two groups of neurons, those preferring a trained orientation (|Tr-Pref| = 0°) and those preferring an untrained orientation (|Tr-Pref| = 45°). If the effect of training on AUROC depends only on the orientation preference shift, then one would expect that the mean AUROCs for the trained and untrained orientation pairs do not differ between these two groups of neurons. Figure 5C shows the mean AUROCs for both groups of neurons recorded during passive fixation for two orientation pairs: preferred orientation versus preferred + 90° (Pref, Pref+90) and preferred − 45° versus preferred + 45° (Pref-45, Pref+45). Note that the (Pref, Pref+90) pair are trained orientations for neurons with |Tr-Pref| equal to 0° and untrained orientations for the other group of neurons. 
The opposite holds for the other orientation pair (see Figure 5C). As expected, the mean AUROC is larger for the (Pref, Pref+90) compared with the (Pref-45, Pref+45) pair (ANOVA; main effect of Orientation pair: p < 10⁻⁷) because the former includes the preferred orientation. If the training effect seen in Figure 5A, B merely resulted from the higher proportion of neurons tuned to trained orientations, then one would expect that the mean AUROCs for the two pairs would be similar for the two groups of neurons. However, the interaction between orientation pair and neuron group was close to significance (ANOVA with factors Monkey, Orientation pair, and Neuron group; p = .09) with a higher AUROC for the trained orientation pair for the neurons preferring an untrained orientation (n = 31 neurons) compared with the untrained orientation pair for neurons preferring a trained orientation (n = 84). The mean AUROC for an orientation pair that included the preferred orientation (Pref, Pref+90) did not differ between the two groups of neurons.
The interaction between Orientation pair and Neuron group was stronger and highly significant for the data obtained in the color discrimination task (ANOVA; p = .00056; Figure 5D). For this analysis, the AUROCs of both monkeys for the 20% and 40% SNR conditions were combined because a significant effect of Training was only present for these SNRs (see above). For the neurons preferring the trained orientations (n = 52; blue curve), the mean AUROC was, as expected, greater when the orientation pair included the preferred orientation (Pref, Pref+90) compared with the other pair (orientations 45° offset from preferred orientation). However, this was not the case for the neurons preferring orientations 45° offset from the trained orientations (n = 15; red curve). For these neurons, the mean AUROC for the trained orientations was similar to that for the untrained orientations, although the trained orientations in these neurons were 45° offset from their preferred orientation. This interaction between orientation pair and neuron group was stronger at the lower SNRs employed in the color task than for the 80% SNR stimulus used in the passive fixation task (compare Figure 5C and D). This difference in the strength of the interaction may reflect the difference in SNR, that is, additional learning effects for the highly trained low SNR stimuli. Overall, the data suggest that the training enhanced the discriminability for the trained orientations even for neurons not preferring the trained orientations. On the basis of these data, the increase in AUROC for the trained compared with the untrained orientations does not solely reflect the larger proportion of neurons tuned to the trained orientation. Indeed, in addition to the shift in preferred orientation, there was an enhanced discriminability for the trained orientations, especially at lower SNRs and for neurons preferring the untrained orientations.
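For readers who want to reproduce the two quantities used in this analysis, the following is a minimal sketch rather than the analysis code used in this study. It assumes that the AUROC for an orientation pair is the probability that a spike count drawn from one orientation exceeds a count drawn from the other (ties counted as one half), and that the Fano factor uses the unbiased sample variance. The function names and the toy spike counts are hypothetical.

```python
import numpy as np


def auroc(counts_a, counts_b):
    """Area under the ROC curve for two sets of single-trial spike counts.

    Computed as P(a > b) + 0.5 * P(a == b) over all trial pairs, which is
    equivalent to the normalized Mann-Whitney U statistic.
    """
    a = np.asarray(counts_a, dtype=float)
    b = np.asarray(counts_b, dtype=float)
    if a.size == 0 or b.size == 0:
        return float("nan")
    # Pairwise differences between every trial in a and every trial in b.
    diff = a[:, None] - b[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


def fano_factor(counts):
    """Response variance divided by mean response (sample variance, ddof=1)."""
    c = np.asarray(counts, dtype=float)
    return float(np.var(c, ddof=1) / np.mean(c))


# Toy spike counts (hypothetical, for illustration only).
trained_pref = [5, 6, 7, 8]
trained_orth = [1, 2, 3, 4]
separation = auroc(trained_pref, trained_orth)
variability = fano_factor([2, 4, 6])
```

With these definitions, a larger difference between the mean responses raises the AUROC when the Fano factor stays constant, which is the reasoning applied to the trained versus untrained orientation pairs above.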
We also recorded from single neurons of the fMRI-activated PIT region of the “untrained” hemisphere of monkey M. The median SI of the 66 responsive PIT neurons in the “untrained” hemisphere was 0.14, which was significantly lower than the median SI of 0.30 for the trained hemisphere in the same animal (n = 123; Mann–Whitney U test; p = 1.02 × 10⁻⁷). Thus, although responses to the preferred orientations had similar strengths in the “trained” and “untrained” hemisphere, the average orientation tuning was broader in the untrained compared with the trained hemisphere (Figure 6A). The mean normalized response was similar across orientations in the “untrained” hemisphere (one-way ANOVA: p = .14) and differed significantly from that obtained in the “trained” hemisphere of the same animal (ANOVA: interaction Orientation and Hemisphere: p = .0078; Figure 6B). Whereas 44% of the PIT neurons of the “trained” hemisphere of monkey M preferred the trained orientations, only 25% showed such preference in the “untrained” hemisphere. Neither during passive fixation nor when performing the orthogonal color discrimination task (mean behavioral performance of 87% correct) did the mean AUROC for the “untrained” hemisphere neurons differ between the trained and untrained orientation pairs (passive fixation, 80% SNR; paired t test: p = .80, n = 66; color discrimination, ANOVA: main effect of Orientation pair: p = .50; interaction Orientation and SNR: p = .66, n = 31; Figure 6D). However, ANOVA showed a significant interaction between the Hemisphere and Orientation pair during passive fixation (p = .027, n = 189) and a significant interaction between Hemisphere, Orientation pair, and SNR during the color discrimination (p = .00073, n = 93; Figure 6C, D). In summary, although orientation-sensitive neurons were present in the “untrained” hemisphere, their responses were similar for trained and untrained orientations, unlike in the “trained” hemisphere.
PIT Responses in the Coarse Orientation Discrimination Task
We recorded the responses of a subsample of 81 PIT neurons (monkey P: 30 responsive neurons; monkey M: 51 neurons; only “trained” hemisphere) while the animals were performing the coarse orientation discrimination task for randomly interleaved SNRs that varied between 0 and 40% (trained orientations at the trained location). The psychometric curves, averaged across the behavioral data of those recordings, were highly similar for the two animals (Figure 7A, red and blue solid lines, ANOVA: main effect of Monkey: p = .09, interaction Monkey and SNR: p = .49), with a 75% correct threshold of 12% SNR. Also, the neurometric curves did not differ significantly between the two animals (Figure 7A, red and blue stippled lines, ANOVA: main effect of Monkey: p = .24, interaction Monkey and SNR: p = .58). The average neurometric curve was substantially lower than the psychometric curve, with a 75% correct neural threshold between 20% and 40% SNR in each monkey. Nonetheless, the average neural performance at 10% SNR (mean AUROC = 56%; n = 81 neurons) was significantly above chance (t test; p = 2.2 × 10⁻⁷). Thus, the population of PIT neurons can signal the orientation of trained stimuli with high levels of noise. Figure 7A also compares the neurometric performance of the PIT neurons with the average neurometric performance obtained in V4 (n = 59) of the same animals at the “late” stage of training (Adab & Vogels, 2011). ANOVA on the pooled data of the two animals with factors SNR and Area showed a significant interaction between the two factors (p = .024), which was due to a significantly higher neurometric performance in area V4 (83%) compared with PIT (77%; unpaired t test; p = .009) at 40% SNR. No significant differences between the two areas were present at lower SNRs.
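A 75% correct threshold can be read off a psychometric or neurometric curve in several ways. The sketch below uses linear interpolation between the two sampled SNRs that bracket the criterion. This is an assumption made for illustration and not necessarily the procedure used in this study (a fitted function is a common alternative). The function name and the curve values are hypothetical and are not data from this study.

```python
import numpy as np


def threshold_snr(snr, prop_correct, criterion=0.75):
    """SNR at which a curve first reaches the criterion, by linear interpolation.

    snr          : increasing sequence of tested SNR levels (in %)
    prop_correct : proportion correct (or AUROC) at each SNR level
    Returns None when the curve never reaches the criterion.
    """
    x = np.asarray(snr, dtype=float)
    y = np.asarray(prop_correct, dtype=float)
    reached = np.nonzero(y >= criterion)[0]
    if reached.size == 0:
        return None
    i = int(reached[0])
    if i == 0:
        # Criterion already met at the lowest tested SNR.
        return float(x[0])
    # Here y[i - 1] < criterion <= y[i], so the denominator is positive.
    slope = (x[i] - x[i - 1]) / (y[i] - y[i - 1])
    return float(x[i - 1] + (criterion - y[i - 1]) * slope)


# Hypothetical curve, for illustration only.
example_snr = [0, 10, 20, 40]
example_curve = [0.50, 0.60, 0.70, 0.80]
example_threshold = threshold_snr(example_snr, example_curve)
```

In this hypothetical example the criterion is crossed between 20% and 40% SNR, so the function returns a value inside that interval. A curve that stays below 75% at every tested SNR has no threshold within the tested range, and the function returns `None`.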
To assess whether the PIT responses covaried with the behavioral responses of the animals, we computed for each neuron grand CPs, combining data for SNRs smaller than 40% (see Methods). The mean CP was 0.53, which is significantly larger than chance (t test; p = .0013; n = 81; Figure 7B). The mean CP rose to 0.56 (p = .001; n = 20) when considering only neurons for which the AUROC was greater than 90% at 40% SNR in the coarse orientation discrimination task. The mean CP in PIT tended to be larger than that observed in V4 in the same animals (mean: 0.52; neurons with AUROC > 90% at 40% SNR: 0.53; Adab & Vogels, 2011), a trend that failed to reach statistical significance (Mann–Whitney U test, p = .2). Applying the correction procedure proposed by Kang and Maunsell (2012) for unbalanced ratios of behavioral choices across stimulus conditions yielded somewhat larger grand CPs for PIT (mean = 0.56; n = 81).
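The logic of a grand CP can be sketched as follows. This is a minimal illustration and not the procedure described in the Methods: it assumes the widely used approach of z-scoring spike counts within each stimulus condition, pooling the z-scores across conditions, and computing the area under the ROC curve between trials ending in a choice for the preferred orientation and trials ending in the other choice. It does not implement the Kang and Maunsell (2012) correction. All function names, argument names, and toy values are hypothetical.

```python
import numpy as np


def roc_area(a, b):
    """P(a > b) + 0.5 * P(a == b) over all pairs; NaN if either group is empty."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.size == 0 or b.size == 0:
        return float("nan")
    diff = a[:, None] - b[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


def grand_cp(counts, choices, conditions):
    """Grand choice probability pooled across stimulus conditions.

    counts     : spike count on each trial
    choices    : 1 = choice for the neuron's preferred orientation, 0 = other
    conditions : stimulus condition label of each trial (e.g., signed SNR)
    Conditions with a single choice type or zero variance are skipped.
    """
    counts = np.asarray(counts, dtype=float)
    choices = np.asarray(choices)
    conditions = np.asarray(conditions)
    z = np.full(counts.shape, np.nan)
    for cond in np.unique(conditions):
        mask = conditions == cond
        if np.unique(choices[mask]).size < 2:
            continue  # both choices are needed within a condition
        sd = counts[mask].std(ddof=1)
        if sd == 0:
            continue
        # z-scoring removes the stimulus-driven difference in mean rate.
        z[mask] = (counts[mask] - counts[mask].mean()) / sd
    valid = ~np.isnan(z)
    return roc_area(z[valid & (choices == 1)], z[valid & (choices == 0)])


# Toy trials (hypothetical): two conditions with very different mean rates.
toy_counts = [1, 2, 3, 4, 11, 12, 13, 14]
toy_choices = [0, 0, 1, 1, 0, 0, 1, 1]
toy_conditions = [0, 0, 0, 0, 1, 1, 1, 1]
toy_cp = grand_cp(toy_counts, toy_choices, toy_conditions)
```

The within-condition z-scoring is the design choice that matters here. Pooling raw counts would mix stimulus-driven and choice-related variability, whereas z-scoring first isolates the covariation between response and choice for a fixed stimulus.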
Functional imaging in monkeys that were trained to discriminate two orthogonal gratings, with low SNR, revealed activations induced by noisy gratings in a posterior IT region. Subsequent single unit recordings in PIT showed orientation selective responses that were comparable to V4 neurons in the same animals. More importantly, more PIT neurons preferred the trained compared with untrained orientations and showed a greater discriminability of trained versus untrained orientations. The effects of training on the responses to grating stimuli were also present when the animals were performing a difficult orthogonal task in which the grating stimuli were task-irrelevant, suggesting that the training effect does not need attention to be expressed. This is in contrast to some perceptual learning effects in V1 with more complex stimuli, which were only present during task performance (Li, Piech, & Gilbert, 2004, 2008). A comparison of the responses between trained and untrained hemispheres suggested that the orientation selectivity in this region was enhanced by training. These data show that the responses of a late cortical area, PIT, change during the perceptual learning of a simple discrimination task using simple oriented gratings.
The PIT region we recorded from appears to correspond to the lower visual field representation of PIT/TEO with low eccentricity receptive fields (Boussaoud, Desimone, & Ungerleider, 1991). This region is anterior to V4A (Roe et al., 2012). Tracer studies suggest that this PIT/TEO region receives direct input from dorsal V4 (Ungerleider, Galkin, Desimone, & Gattass, 2008) as well as from other extrastriate areas (Markov et al., 2014). Single unit recordings have shown that neurons in PIT are form selective (Yasuda, Banno, & Komatsu, 2010; Connor, Brincat, & Pasupathy, 2007; Hikosaka, 1999; Kobatake & Tanaka, 1994; Desimone & Gross, 1979), but despite their form selectivity, many still respond to simple features (Kobatake & Tanaka, 1994). In an older study, Vogels and Orban (1994) reported a mean orientation SI of 0.36 in a region that likely overlapped with PIT, which is comparable to that observed here in PIT of the trained hemisphere. These previous recordings were performed after extensive training in a successive orientation discrimination task. This observation, together with the present finding that orientation selectivity was lower in the untrained compared with the trained hemisphere, suggests that high orientation selectivity in PIT may require extensive training in orientation discrimination.
To the best of our knowledge, no other studies on orientation selectivity in PIT exist. Previous single cell studies demonstrated learning effects for shapes and objects in macaque IT cortex (Woloszyn & Sheinberg, 2012; Li & DiCarlo, 2008, 2010; Cox & DiCarlo, 2008; De Baene, Ons, Wagemans, & Vogels, 2008; Op de Beeck, Wagemans, & Vogels, 2007; Freedman, Riesenhuber, Poggio, & Miller, 2006; Baker, Behrmann, & Olson, 2002; Kobatake, Wang, & Tanaka, 1998), but without exception, these studies recorded more anteriorly in IT. Srihasam, Mandeville, Morocz, Sullivan, and Livingstone (2012) reported fMRI activations in juvenile but not adult monkeys after symbol training in the occipitotemporal sulcus, more medial than where we recorded. Thus, our study is the first to demonstrate learning-related changes in lateral, posterior IT. Human fMRI studies have revealed that learning changes the representation of low saliency shapes in higher-order visual cortical areas such as LOC and the posterior fusiform region (Kourtzi, Betts, Sarkheil, & Welchman, 2005), although in this case the stimuli were more complex than those employed in our study.
We show here that for simple stimuli, gratings, a higher-order visual area undergoes changes in its responses following training. Thus, training to discriminate simple visual stimuli modifies the responses of neurons in multiple visual areas at different levels of processing, and there is no reason to assume that only those in early visual cortex contribute to the behavioral performance improvement. In fact, CPs were similar in PIT and V4, which may indicate that both areas contribute to the behavioral decisions. Because V4 and PIT neurons project mainly to the same regions (Webster, Bachevalier, & Ungerleider, 1994; Distler, Boussaoud, Desimone, & Ungerleider, 1993), except for projections to areas located more anterior in the temporal lobe, they could be read out in parallel in the present task. Alternatively, because PIT is hierarchically a higher region than V4, it is possible that PIT neurons contribute more than V4 to the decisions in this task, with the similar CPs in both areas reflecting noise correlations between these areas. Causal methods, artificially interfering with neural activity, will be needed to distinguish these possibilities.
Unlike what we show here for PIT, previous single unit studies of perceptual learning in areas V1 (Ghose et al., 2002; Crist, Li, & Gilbert, 2001; Schoups et al., 2001), V2 (Ghose et al., 2002), and V4 (Adab & Vogels, 2011; Raiguel et al., 2006; Yang & Maunsell, 2004) found no increase in the proportion of neurons tuned to trained stimuli. The findings in V1, V2, and V4 contrast with perceptual learning effects in monkey primary somatosensory (Recanzone, Merzenich, Jenkins, Grajski, & Dinse, 1992) and auditory (Recanzone, Schreiner, & Merzenich, 1993) cortices, which did show enhanced representations of the trained stimuli. These findings combined with ours suggest that perceptual learning-related plasticity in late (such as PIT) but not early visual cortical areas resembles the plasticity present in primary auditory and somatosensory cortices.
It is unclear to what degree the learning-related changes we observed in PIT reflect changes in the areas that provide input to PIT (Markov et al., 2014). We prefer the hypothesis that the effects we observed in PIT at least partially reflect plasticity in PIT and do not merely result from a pooling of perceptual learning-related changes in the areas that provide input to PIT. The increased responses to the trained orientations and the larger number of neurons preferring the trained orientations in PIT might reflect the fine tuning of a trained orientation template (Dosher & Lu, 1998), which enhances the signal for the trained orientations. This fine tuning can be viewed as a reweighting of the input to PIT neurons, favoring the trained orientations. Thus, perceptual learning may modify the response of multiple visual areas in a cascade, which is more complex than posited in current formal models of perceptual learning that considered changes in only early visual cortex (Bejjanki, Beck, Lu, & Pouget, 2011; Roelfsema & van Ooyen, 2005) or of the link between a stable early visual cortical signal and the decision (Sotiropoulos, Seitz, & Series, 2011; Law & Gold, 2009; Petrov, Dosher, & Lu, 2005).
We thank I. Puttemans, P. Kayenbergh, G. Meulemans, S. Verstraeten, M. Docx, W. Depuydt, D. Mantini, R. Peeters, T. Janssens, and M. De Paep for assistance and Dr. J. Taubert for critical reading of a draft. This study was supported by GOA, IUAP, PF, FWO, and the People Programme (Marie Curie Actions) of the EU 7th Framework Programme FP7/2007-2013/ (REA Grant PITN-GA-2011-290011).
Reprint requests should be sent to Rufin Vogels, Neurofysiologie, Campus Gasthuisberg, Herestraat, 3000, Leuven, Belgium, or via e-mail: Rufin.firstname.lastname@example.org.