There is substantial evidence that object representations in adults are dynamically updated by learning. However, it is not clear to what extent these effects are induced by active processing of visual objects in a particular task context on top of the effects of mere exposure to the same objects. Here we show that the task does matter. We performed an event-related fMRI adaptation study in which we derived neural selectivity from a release of adaptation. We had two training conditions: “categorized objects” were categorized at a subordinate level based on fine shape differences (Which type of fish is this?), whereas “control objects” were seen equally often in a task context requiring no subordinate categorization (Is this a vase or not?). After training, the object-selective cortex was more selective for differences among categorized objects than for differences among control objects. This result indicates that the task context during training modulates the extent to which object selectivity is enhanced as a result of training.
We know from previous studies that the representations underlying human object recognition and categorization are adaptive, even in the adult brain (Kourtzi & DiCarlo, 2006; Moore, Cohen, & Ranganath, 2006; Op de Beeck, Baker, DiCarlo, & Kanwisher, 2006; Tanaka, Curran, & Sheinberg, 2005; Op de Beeck, Wagemans, & Vogels, 2003; Grill-Spector, Kushnir, Hendler, & Malach, 2000; Gauthier, Tarr, Anderson, Skudlarski, & Gore, 1999; Gold, Bennett, & Sekuler, 1999; Goldstone, 1998; Schyns, Goldstone, & Thibaut, 1998). Many studies have compared objects trained during a training phase with objects that were never seen during training. Hence, any difference between these conditions might be caused by the explicit training regime, for instance, object discrimination or subordinate categorization, as much as by mere exposure to the stimuli presented during the training. Here we present a human functional magnetic resonance imaging (fMRI) study, in which the effect of training task was investigated while controlling the amount of exposure across conditions.
The mechanisms underlying human object recognition can be adapted to the current needs of the perceiver. At a behavioral level, perceptual discrimination training improves the ability to recognize and categorize objects (Op de Beeck et al., 2003; Grill-Spector et al., 2000; Gold et al., 1999; Goldstone, 1998; Schyns et al., 1998). Perceptual expertise is characterized by a downward shift in the level at which objects are first identified. Although novices are faster in making judgments about objects at a basic level (e.g., bird) than at a subordinate level (e.g., woodpecker), experts are equally fast in making both types of judgments (Tanaka & Taylor, 1991). Psychophysical findings show that perceptual categorization, and not perceptual exposure per se, is necessary for the development of perceptual expertise (Tanaka et al., 2005).
Human fMRI studies have suggested several neural correlates of these behavioral changes in the object-selective cortex: increased blood oxygenation level-dependent (BOLD) activation (Moore et al., 2006; Op de Beeck et al., 2006; Grill-Spector et al., 2000; Gauthier et al., 1999), altered spatial distribution of activation across the extrastriate cortex (Op de Beeck et al., 2006), and increased neural selectivity of the object-selective lateral occipital complex (LOC) as derived from the release of adaptation (Jiang et al., 2007). Similar effects of training on neural processing have been noted using single-cell recordings in nonhuman primates (Op de Beeck, Wagemans, & Vogels, 2007; Freedman, Riesenhuber, Poggio, & Miller, 2003, 2006; Rainer, Lee, & Logothetis, 2004; Baker, Behrmann, & Olson, 2002; Sigala & Logothetis, 2002).
All human fMRI studies included a comparison between trained objects and objects that had never been seen during the training. Especially noteworthy is the study by Jiang et al. (2007) that investigated the effect of category learning with an event-related adaptation design. The results suggested that training increases the neural selectivity in the object-selective cortex for differences among trained objects compared to the selectivity for control objects. This result was found for differences among any trained objects, thus even between objects that were grouped in the same category during the training. Thus, the categorization task might actually be totally irrelevant for the outcome of the experiment, and mere exposure to the stimuli might be enough to induce these effects.
Here we present a study that is very similar to the study by Jiang et al. (2007): We scanned subjects with an event-related adaptation design after a training phase in which they had learned to categorize a set of shapes, all from the same basic-level category, into two subordinate categories. In fMRI adaptation, responses are compared between a condition with an exact stimulus repetition (“same” trials) and a condition with two different stimuli (“different” trials). Higher responses in “different” trials compared to “same” trials suggest a release of adaptation, and this release of adaptation is used as an index for neural selectivity (Krekelberg, Boynton, & van Wezel, 2006; Grill-Spector & Malach, 2001).
In our study, we compared the overall activity and the neural selectivity for the trained objects (“categorized”) in the object-selective cortex with the activity and the selectivity for objects that had been seen equally often in a task that did not require subordinate categorization (“control”). Using a within-subject design, we investigated whether the plasticity of the human object recognition mechanisms must be attributed to perceptual exposure or to perceptual categorization at a finer level of discrimination (see Tanaka et al., 2005 for evidence at a behavioral level). We found more release of adaptation for the categorized compared to the control objects, suggesting more neural selectivity for categorized objects. Thus, it matters what subjects do with the stimuli during the training phase.
Nineteen subjects (4 men, aged 20–26 years) participated in this experiment. The data of one subject were removed due to excessive head motion. The excessive motion was already clear from a visual inspection during scanning, and it was confirmed by the motion correction parameters (translation and rotation) needed to align all functional images: the amount of variability in these parameters between successive time points was 5.3 times higher in this subject compared to the average across subjects, and 3.0 times higher compared to the subject with the second largest head motion. Informed consent was obtained from all subjects prior to the experiment. All procedures were approved by the relevant ethical boards, that is, the ethical committee of the Faculty of Psychology and Educational Sciences (K.U. Leuven) and the committee for medical ethics of the University Hospital.
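The motion criterion described above can be sketched as follows. This is only an illustration: the text does not give the exact variability measure, so the mean absolute change between successive time points is assumed here, and the function name and the toy parameter values are hypothetical.

```python
def motion_variability(params):
    """Mean absolute change between successive time points.

    `params` is a list of time points, each holding the six realignment
    parameters (three translations, three rotations). The choice of
    "mean absolute difference" is an assumption, not the paper's formula.
    """
    diffs = []
    for prev, curr in zip(params, params[1:]):
        diffs.extend(abs(c - p) for p, c in zip(prev, curr))
    return sum(diffs) / len(diffs)

# Toy parameter series (made up, not data from the study).
steady = [[0.0] * 6, [0.1] * 6, [0.2] * 6]
jumpy = [[0.0] * 6, [0.5] * 6, [0.0] * 6]

# Ratio of one subject's variability to a reference value.
ratio = motion_variability(jumpy) / motion_variability(steady)
```

A ratio of this kind, computed against the group average, is the sort of quantity the exclusion criterion refers to.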
We created three 2-dimensional shape spaces (fish, birds, and cars), consisting of 20 exemplars each (see Figure 1). In addition, we created a continuous set of vases. Each shape space was created by morphing between four stimuli, selected from a larger database (Op de Beeck & Wagemans, 2001), that were rated by subjects as being relatively different in shape compared to other objects from the same basic-level category (Panis, Vangeneugden, & Wagemans, in press-b). The images were generated using a commercially available morphing algorithm, Magic Morph (www.effectmatrix.com/morphing/index.htm). This algorithm generates a continuous set of stimuli by weighting corresponding reference points on a source and a target stimulus. The stimuli were black contour lines (300 × 225 pixels) presented on a white background. They were briefly masked by squares (380 × 280 pixels) containing noise (see Figure 2). Stimulus presentation and response registration were controlled by a PC running E-Prime software (www.pstnet.com/products/e-prime/). Stimuli were shown on a CRT monitor during training (resolution 1024 × 768 pixels, refresh rate 75 Hz), and projected on a mirror in front of the head during scanning by means of a liquid crystal display projector (1280 × 1024 pixels; Barco 6300; Barco, Kortrijk, Belgium). Stimuli, approximately 9° visual angle in width, were presented in the center of the screen, but their exact position could vary at random within 1° visual angle.
Training for the fMRI Study: Subordinate Categorization and Perceptual Experience
Subjects were trained with three shape spaces (birds, cars, and fish) during two sessions on two consecutive days prior to the day on which the scanning occurred. Each session lasted 50 min. The training was rehearsed on a laptop for 10 min before the scanning session. Each subject learned to categorize two out of the three shape spaces and saw the shapes of the third shape space with the same frequency in a control task. This procedure was used to investigate the effect of subordinate categorization on top of the effect of perceptual experience per se. For the perceptual categorization task (PCT), we divided each shape space in two categories according to a vertical or a horizontal category boundary. Shapes could then be grouped for analyses according to their distance to the category boundary (from 1 to 4). This is illustrated in Figure 1. The assignment of shape spaces and category boundaries to conditions was counterbalanced across the 18 subjects included in the analyses.
In each trial of the PCT, one shape was shown (stimulus duration = 200 msec; mask = 50 msec; response window = 1500 msec; intertrial interval = 500 msec) and subjects had to determine whether the shape belonged to Category A (left keypress) or to Category B (right keypress). An auditory tone was given to signal the correctness of the answer. A high tone indicated a correct answer, a low tone an incorrect answer.
The same trial sequence was used for the control task, which was an odd-man-out task (OMOT). In contrast to the PCT, subjects did not have to categorize the shapes at a subordinate level. Most trials contained a shape from the same basic-level category (birds, cars, or fish), and an oddball (always a vase) was presented in 10 out of the 110 trials. Participants were asked to press a button whenever they noticed the oddball.
Each training session consisted of eight PCT blocks (four blocks for the first and four blocks for the second shape space) and four OMOT blocks (for the third shape space). The order of the blocks was randomized before each session. One PCT block consisted of 100 trials. Each of the 20 shapes within the shape spaces was categorized five times in each block, and 20 times in each session. After each block, subjects received feedback regarding their overall percentage correct in that block. An OMOT block consisted of 110 trials. Apart from 10 vases (the oddballs), each shape from the third shape space was presented five times within each block, and 20 times in each session. Hence, shapes of all three shape spaces were seen equally often during the training, but only some of them were categorized at the subordinate level.
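The exposure bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the constant names are hypothetical.

```python
# Counts taken from the training description; names are illustrative.
N_SHAPES = 20                # exemplars per shape space
REPS_PER_BLOCK = 5           # presentations of each shape within a block
PCT_BLOCKS_PER_SPACE = 4     # per session, for each categorized shape space
OMOT_BLOCKS = 4              # per session, for the control shape space
ODDBALLS_PER_OMOT_BLOCK = 10 # vases added to each OMOT block

pct_trials_per_block = N_SHAPES * REPS_PER_BLOCK
omot_trials_per_block = N_SHAPES * REPS_PER_BLOCK + ODDBALLS_PER_OMOT_BLOCK

# Exposures of a single shape within one session.
categorized_exposures = REPS_PER_BLOCK * PCT_BLOCKS_PER_SPACE
control_exposures = REPS_PER_BLOCK * OMOT_BLOCKS
```

Both exposure counts come out at 20 per shape per session, which is the basis for the claim that categorized and control shapes were seen equally often.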
Control Experiment: The Influence of Subordinate Categorization on Discrimination Performance
To examine the influence of this training regime on the behavioral performance, we trained 16 different subjects during three sessions on consecutive days. In contrast to the training regime described above, we also used the shape space of the vases. Each subject learned to categorize two out of the four shape spaces in a PCT and saw the shapes of a third shape space in an OMOT. Shapes of the fourth shape space were not seen during this control experiment. The exact condition in which each shape space appeared was counterbalanced across subjects. The training phase was followed by a test phase, in which subjects performed a same–different task on shapes selected from the two categorized shape spaces and the control shape space. The test phase was only performed posttraining. This allowed us to directly investigate the influence of perceptual categorization on top of the influence of perceptual exposure. Accuracy rates on “same” and “different” trials were converted to d′, a bias-independent measure of perceptual sensitivity for shape differences, which we normalized to be the same (d′ = 1) for each shape space across all subjects. Without this normalization, there was a significant difference between shape spaces [F(3, 44) = 4.15, p = .01], with higher discriminability of vases (M = 1.39) than of birds (M = .95), cars (M = .87), and fish (M = 1.14). Using a paired t test across subjects on normalized d′, we found that categorization training improved the perceptual discrimination among “categorized” (M = 1.06) compared to “control” shapes (M = .87) [t(15) = 2.79, p = .01] (Figure 3). Hence, subordinate categorization increased discriminability to a larger extent than mere perceptual exposure. This conclusion is consistent with Tanaka et al. (2005).
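The conversion to d′ and the per-shape-space normalization can be illustrated as below. This is a minimal sketch under stated assumptions: the text does not say which signal detection model was used for the same–different task, so the simple yes/no formula z(H) − z(F) is assumed, with a "hit" being a "different" response on a different trial; the normalization is assumed to be a division by the mean d′ of each shape space. Function names and example values are hypothetical.

```python
from statistics import NormalDist, mean

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index under the simple yes/no model: z(H) - z(F).

    Rates of exactly 0 or 1 are not handled here (inv_cdf raises);
    a correction would be needed for perfect performance.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)

def normalize_by_space(d_by_space):
    """Rescale each shape space so its mean d' across subjects equals 1."""
    return {space: [d / mean(values) for d in values]
            for space, values in d_by_space.items()}

# Made-up values for illustration only (not data from the study).
raw = {"vases": [1.2, 1.6], "cars": [0.8, 1.0]}
normalized = normalize_by_space(raw)
```

After this rescaling, intrinsic differences in discriminability between shape spaces no longer contribute to the comparison between categorized and control shapes.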
Scanning was performed the day after the last training session at the Department of Radiology of the University Hospital in a 3-T Philips Intera magnet with an eight-channel SENSE head coil. The functional runs consisted of nine experimental runs and two localizer runs. For one subject, we collected data from eight experimental runs and two localizer runs. The localizer runs were acquired after the experimental runs and were followed by the acquisition of an anatomical scan. For the experimental runs, functional images were acquired with an ascending echo-planar imaging (EPI) sequence (152 time points per time series; 2000 msec repetition time [TR], 30 msec echo time [TE], 90° flip angle, 80 × 80 acquisition matrix, 2.75 × 2.75 mm in-plane resolution, 3 mm slice thickness, no interslice gap, 36 axial slices including most of the cortex except the most superior parts of frontal and parietal cortex). The protocol was slightly different for the localizer runs (164 time points per time series; 3000 msec TR). A T1-weighted anatomical image was acquired at the end of each session (9.6 msec TR, 4.6 msec TE, 256 × 256 acquisition matrix, 1 × 1 mm in-plane resolution, 1.2 mm slice thickness, 182 coronal slices).
The EPI images from the two localizer runs were acquired to define the object-selective LOC. Localizer runs consisted of blocks of fixation spot, intact familiar, intact new, scrambled familiar, and scrambled new objects (Kourtzi & Kanwisher, 2000). Each block lasted 30 sec and was repeated thrice within each localizer run. This sequence was followed by an 18-sec fixation period. The sequence of the blocks was randomized prior to the experiment and was different for both runs. Subjects performed a color-change detection task. They were asked to press a button each time the fixation cross changed in color. This occurred approximately every 3 sec.
Each trial in the experimental runs lasted 2000 msec. The first image was presented for 350 msec, followed by a mask of 150 msec, the second shape (150 msec) and a second mask (1050 msec) (see Figure 2). For half of the subjects (n = 9), the mask was replaced by a fixation spot. In each block, we presented 144 trials, preceded by 4 and followed by 4 fixation trials. One block contained 10 “catch” trials, 14 fixation trials, and 120 event trials. The catch trials consisted of a bird, car, or fish, followed by a vase. The event trials consisted of two birds, two cars, or two fish. Both shapes within each event trial were selected from the same shape space and were either the same or different. In different trials, we used the smallest difference on the morphing line. Only the 16 morph stimuli were used during scanning, and not the original four shapes (the “corner stimuli” in Figure 1). The conditions were randomized a priori using a genetic algorithm (Wager & Nichols, 2003).
In the scanner, subjects performed a vase detection task. They were asked to press one button whenever they saw two birds, two cars, or two fish, and another button whenever they saw a vase. Analyses on the behavioral data were performed using the data of 16 subjects, as responses were not properly encoded due to technical problems for the remaining two subjects. Responses were coded as correct or incorrect and reaction times (RTs) were registered after the offset of the second stimulus. Subjects detected 88% of the vases (SEM = 0.02) and made a false alarm in 12% of the trials in which two birds, two fish, or two cars were presented (SEM = 0.04). They responded significantly slower on catch trials (M = 396 msec, SD = 20) than on the other trials (M = 296 msec post stimulus offset, SD = 15) [t(15) = 7.63, p < .0001]. There was no significant difference between the categorized and control objects in the proportion of hits [t(15) = 1.12, p = .28], false alarms [t(15) = 0.43, p = .67], or in the RTs [t(15) = −.70, p = .49].
Analysis of Imaging Data
Data were analyzed using the Statistical Parametric Map software package (SPM5, Wellcome Department of Cognitive Neurology, London), as well as custom Matlab code.
Images were first corrected for differences in acquisition time (slice timing). They were then realigned to correct for head movements. The functional images of each subject were coregistered with his or her anatomical image. The coregistered anatomical image was segmented and the resulting parameters were used to spatially normalize the functional images into the standard Montreal Neurological Institute (MNI) template. During the normalization, the images were resampled to a voxel size of 3 × 3 × 3 mm. Finally, the images were spatially smoothed with a 6-mm full-width half-maximum kernel before statistical analysis.
Regions of Interest (ROIs)
The object-selective voxels in the lateral occipital cortex were identified using the data of the localizer runs. For comparison purposes, we defined two additional ROIs: (1) the early visual cortex and (2) the left inferior frontal cortex (see also Vuilleumier, Henson, Driver, & Dolan, 2002).
Lateral occipital complex
We used SPM5 to identify LOC using the data of the localizer runs. The BOLD signal was modeled using a boxcar response model convolved with a hemodynamic response function. The general linear model contained four independent variables (intact vs. scrambled, new vs. familiar) and six regressors (the translation and rotation parameters needed for the realignment). Object-selective areas were defined as those areas in the lateral occipital and ventral occipito-temporal cortex with a larger BOLD activity while subjects viewed intact versus scrambled objects. The t map corresponding to that contrast (uncorrected p value of .0001) was overlaid on the individual's coregistered anatomical image to select the voxels in the areas of interest. The LOC contained, on average, 166 voxels in the left hemisphere (mean MNI coordinates [−42 −75 −6]) and 196 voxels in the right hemisphere (mean MNI coordinates [45 −72 −6]). Figure 4 illustrates the LOC for the subject with the largest, the median, and the smallest LOC.
Early visual cortex
In all the subjects, we defined an ROI around the calcarine sulcus. We selected those voxels that were significantly activated by visual stimulation during the localizer runs (intact and scrambled objects minus rest, uncorrected p value < .0001). The ROI in the early visual cortex contained, on average, 102 voxels in the left hemisphere (mean MNI coordinates [−18 −80 −4]) and 109 voxels in the right hemisphere (mean MNI coordinates [15 −81 13]).
Left inferior frontal gyrus
In 15 subjects, we defined a region in the left inferior frontal gyrus using the event-related trials in which we presented a vase (vases minus rest). The ROI was delineated in a similar way as the LOC, using an uncorrected p value of .0001 for 12 subjects, .001 for 2 subjects, and .005 for 1 subject. In the remaining three subjects, we were not able to find significant activation in the left inferior frontal gyrus. This ROI contained, on average, 33 voxels (mean MNI coordinates [−33 27 0]).
fMRI Data Analysis: ROI Analysis
The time course of the BOLD-signal intensity was extracted by averaging the data from all the voxels within the independently defined ROIs. We averaged the signal intensity across the trials in each condition from 0 to 12 sec posttrial onset (7 time points). These event-related time courses of BOLD-signal intensity were then converted to percent signal change by subtracting the corresponding value for the fixation condition and dividing by that value. We then computed the peak for the time courses across conditions for each subject (occurring 4 or 6 sec after trial onset). The percent signal change at the peak served as the measured response for each condition and was used in a repeated measures ANOVA.
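The steps of this ROI analysis can be sketched as follows. The code assumes a 2-sec sampling interval (seven time points from 0 to 12 sec), expresses the change as a percentage of the fixation value, and locates the peak on the time course averaged across conditions; the function names and the toy signal values are hypothetical.

```python
TR_SEC = 2.0  # sampling interval of the event-related time course

def percent_signal_change(condition, fixation):
    """(condition - fixation) / fixation, in percent, per time point."""
    return [100.0 * (c - f) / f for c, f in zip(condition, fixation)]

def peak_index(time_courses):
    """Index of the maximum of the time course averaged across conditions."""
    courses = list(time_courses.values())
    n_points = len(courses[0])
    mean_course = [sum(tc[i] for tc in courses) / len(courses)
                   for i in range(n_points)]
    return max(range(n_points), key=mean_course.__getitem__)

# Toy ROI-averaged signal intensities (made up, not data from the study).
fixation = [100.0] * 7
raw = {
    "same": [100.0, 100.2, 100.6, 100.5, 100.2, 100.1, 100.0],
    "different": [100.0, 100.3, 100.9, 100.8, 100.4, 100.1, 100.0],
}
psc = {name: percent_signal_change(tc, fixation) for name, tc in raw.items()}

peak = peak_index(psc)                       # shared peak for this subject
peak_time_sec = peak * TR_SEC                # time of the peak after onset
responses = {name: tc[peak] for name, tc in psc.items()}
```

Using one peak time per subject, shared across conditions, avoids selecting the maximum separately for each condition, which would bias the comparison between conditions.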
fMRI Data Analysis: Whole-brain Analysis
We used SPM5 for the random effects analysis. This analysis takes into account the variability between subjects. First, we calculated three contrast images for each subject, corresponding to the main effect of adaptation (different minus same), the main effect of training task (categorized objects minus control objects), and the interaction between adaptation and training task ([different categorized minus same categorized] minus [different control minus same control]). Second, the contrast images of the 18 subjects were taken together and entered in a one-way ANOVA. The level of significance was set at p < .05, corrected for multiple comparisons at the whole brain level.
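The three contrasts can be written as weight vectors over the four event conditions. The sketch below is illustrative only: the condition order, the names, and the example responses are assumptions, and SPM may scale contrast weights differently.

```python
# Assumed condition order for the weight vectors below.
CONDITIONS = (
    "different_categorized",
    "same_categorized",
    "different_control",
    "same_control",
)

CONTRASTS = {
    # different minus same
    "adaptation": (1, -1, 1, -1),
    # categorized minus control
    "training_task": (1, 1, -1, -1),
    # (different - same | categorized) minus (different - same | control)
    "interaction": (1, -1, -1, 1),
}

def apply_contrast(weights, responses):
    """Weighted sum of the per-condition responses."""
    return sum(w * responses[c] for w, c in zip(weights, CONDITIONS))

# Hypothetical per-condition responses (not data from the study).
example = {
    "different_categorized": 0.5,
    "same_categorized": 0.25,
    "different_control": 0.375,
    "same_control": 0.25,
}
effects = {name: apply_contrast(w, example) for name, w in CONTRASTS.items()}
```

A positive interaction value means that the release of adaptation is larger for categorized than for control objects.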
Participants were trained to categorize two out of the three shape spaces according to a vertical or a horizontal category boundary. Accuracy and RT data were submitted to a repeated measures ANOVA with two factors (session, distance). Accuracy increased from Session 1 to Session 2 [F(1, 17) = 72.03, p < .0001] and was modulated by the distance of the stimulus to the category boundary [F(3, 15) = 94.16, p < .0001] (Figure 5A). RTs decreased from Session 1 to Session 2 [F(1, 17) = 27.14, p = .0001], and participants tended to be slower for stimuli near the category boundary [F(3, 15) = 16.46, p = .0001] (Figure 5B). Participants saw shapes of the third shape space in an OMOT. In this task, they could detect 99% of the oddballs (SEM = 0.003).
Neural Selectivity in the LOC
The LOC was identified for each subject using independent localizer scans. The event-related response in the LOC for each condition, averaged across subjects, is shown in Figure 6.
For statistical analyses, we defined the timing of the peak of the response per subject (which occurred 4 or 6 sec after trial onset). Using these peak responses as dependent variable, we performed a repeated measures ANOVA with two within-subject factors (same vs. different and categorized vs. control) and one between-subject factor (mask vs. no mask). There was no main effect of mask, nor did the effect of mask interact with any other variable (p > .05). We observed a significant difference between same and different trials [same < different: F(1, 16) = 15.32, p = .001], an effect that is traditionally interpreted as reflecting a release of adaptation related to the neural selectivity of the underlying neuronal population (Grill-Spector & Malach, 2001). There was no significant difference between categorized and control objects, with a small trend toward a greater signal for categorized objects than for control objects [categorized > control: F(1, 16) = 3.69, p = .07]. However, using paired t tests, we found an effect of categorization for different trials [t(17) = 2.57, p = .02], but not for same trials [t(17) = 0.79, p = .44].
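For readers less familiar with the follow-up tests, a paired t statistic of the kind reported above can be computed as in this minimal sketch. The function name and the example values are hypothetical, and the p value (which requires the t distribution) is left out.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d holds the per-subject
    differences x - y and sd is the sample standard deviation.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Made-up peak responses for three subjects (not data from the study).
categorized = [2.0, 4.0, 6.0]
control = [1.0, 2.0, 3.0]
t_value, df = paired_t(categorized, control)
```

With 18 subjects, the same computation yields 17 degrees of freedom, matching the t(17) values in the text.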
Most importantly, we found a significant interaction between the adaptation effect and the effect of categorization [F(1, 16) = 7.29, p = .02]. As can be seen in Figure 6, the adaptation effect was larger for categorized objects than for objects that were seen equally often in a control task. This result suggests higher neural selectivity for categorized compared to control objects. To control for possible shifts in the baseline signal strength (signal at time zero relative to trial onset, see Figure 6), we performed an additional analysis in which we normalized the signal at all time points by subtracting the signal at trial onset. We still observed an overall adaptation effect [F(1, 16) = 89.9, p < .0001] and, most importantly, a significant interaction between the adaptation effect and the effect of categorization with a stronger adaptation effect for categorized compared to control objects [F(1, 16) = 5.40, p = .03] (see Figure 7).
We analyzed further subdivisions of LOC (left vs. right hemisphere, lateral occipital vs. posterior fusiform regions) in 12 of the subjects (the LOC in the other subjects did not contain all subregions), and the enhanced neural selectivity for categorized objects did not interact significantly with region of interest [F(3, 33) = 0.94, p > .4; after normalization to baseline: F(3, 33) = 0.29, p > .8].
Finally, we checked whether the training induced any effect of the position of the categorized shapes in each shape space relative to the learned category boundary. As illustrated in Figure 8, shapes could be at a different side of this boundary (between-category), or at the same side of the boundary (within-category). We observed no significant differences in release of adaptation between these conditions [t(17) = 1.03, p = .32], suggesting that the enhanced neural selectivity for categorized objects transferred to differences among categorized objects that were irrelevant for the learned category structure.
Other Regions of Interest
Early Visual Cortex
We investigated whether the enhanced selectivity for categorized objects occurred also in the anatomical location of the primary visual cortex. We analyzed the event-related responses in visually active voxels around the calcarine sulcus, and this region did not show a stronger selectivity for categorized compared to control objects [F(1, 17) = .05, p > .8; after normalization to baseline: F(1, 17) = .54, p > .4].
Left Inferior Frontal Gyrus
We analyzed the event-related responses in a control region in the left inferior frontal cortex. This region did not show a significant release of adaptation [F(1, 14) = 0.21, p = .66]. There was neither a main effect of training task [F(1, 14) = 1.60, p = .23] nor an interaction between the release of adaptation and the task [F(1, 14) = 0.90, p = .36].
We additionally performed a random effects analysis. This analysis allowed us to search for regions other than the LOC affected by the training task. No voxels survived at a threshold of .05, corrected for multiple comparisons at the whole brain level.
We described an event-related fMRI study in which we derived neural selectivity from the release of adaptation in trials with two different stimuli compared to trials with twice the same stimulus. An earlier fMRI study used outline stimuli similar to those in the present study, and it revealed that the release of adaptation for pairs of images was a function of the parametrically varied difference between the images: More different images were associated with more release of adaptation, indicating higher selectivity (Panis, Vangeneugden, Op de Beeck, & Wagemans, in press-a). We compared neural selectivity for objects that were categorized at a subordinate level during a training phase with control objects that were seen equally often, but without subordinate categorization. We found higher selectivity for the categorized objects compared to the control objects. Among the trained objects, there was no difference in selectivity depending on whether the shape differences were relevant or irrelevant for the learned category rule. These results indicate that subordinate categorization per se, independent of visual exposure, results in a higher selectivity for shape differences in the LOC.
Studies of object learning typically do not control for visual exposure. A few studies have used an internal control by designing the training task so that some object features are irrelevant and some are relevant for the task, and relevance is counterbalanced across subjects. If a difference is seen in performance or selectivity between the irrelevant and the relevant features, then these effects indicate that the task matters. However, although this procedure is possible for a task such as categorization, it is not applicable to a task such as discrimination, where all objects need to be discriminated. Furthermore, the application of this procedure in studies of category learning has given mixed results. At a behavioral level, strong effects of feature relevance are only found for very distinct features (e.g., size and color; see Goldstone, 1994), relatively simple shape properties (e.g., aspect ratio and curvature, see Op de Beeck et al., 2003), and features that are located in different parts of the objects (e.g., Sigala & Logothetis, 2002). In other cases, including everyday objects with a complex shape, category learning seems to induce no difference in sensitivity for relevant and irrelevant dimensions (Jiang et al., 2007; Op de Beeck et al., 2003; Op de Beeck, Wagemans, & Vogels, 2001). As a result, even studies with this theoretically ideal control provide results that do not tell us whether it is the active training task that matters, or just the mere exposure to the objects. Within the literature on perceptual expertise, some behavioral between-subjects studies showed that the level of categorization during training profoundly influenced the discriminability of novel exemplars or novel categories (Tanaka et al., 2005).
Most studies, including all published fMRI object learning studies that we know of, had no internal control at all, and compared between trained objects and control objects that were never seen during training. Studies of perceptual learning of simpler visual properties, such as orientation discrimination, have included tighter controls. For example, Schoups, Vogels, Qian, and Orban (2001) presented oriented gratings at peripheral locations, and they compared the selectivity for gratings at a task-relevant location with selectivity for gratings at a task-irrelevant location in which stimuli had also been shown during training. In that case, the trained and control gratings were seen equally often, but only the trained gratings were attended to.
Thus, what is the defining difference between the two tasks that we used during training? Operationally, the most obvious difference is the level of categorization. For the OMOT used for the control objects, shapes had to be categorized at a basic level (Is it a vase or not?). For the PCT used for the categorized objects, shapes had to be categorized at a subordinate level (Which type of fish is it?). Of course, this simple operational difference will inevitably influence many cognitive processes.
First, subordinate categorization (e.g., a PCT) might facilitate attentional processes more than categorization at a higher level of abstraction (e.g., an OMOT). Murray and Wojciulik (2004) showed that attention not only increases the BOLD response in the LOC but also increases the neural selectivity of this region, as measured by the release of adaptation. Hence, one could argue that subjects pay more attention during the fMRI task to objects that were previously categorized compared to objects that were seen equally often in the control task. Several aspects of our data suggest that this argument cannot fully explain our results. First, although the data suggest a difference in task difficulty during the training phase, no differences in task performance were present during the vase detection task in the fMRI session. Second, when asked to explicitly pay attention to the difference between shapes (control experiment), subjects showed a higher discriminability for shapes that were previously categorized compared to shapes that were seen equally often in a control task. d′ is a bias-independent measure of perceptual performance. Third, although the (nonnormalized) data of Murray and Wojciulik (2004) showed a main effect of attention, even in their “same” condition, we failed to find evidence for an effect of categorization in the “same” condition. We observed a trend toward a main effect of training task when including all trials, but subsequent paired t tests showed that this effect could be accounted for by the effect on “different” trials. The dissociation between “same” and “different” trials is a considerable advantage of our event-related design, compared to previously used blocked designs. Hence, we suggest that attention might explain part of the effect in our study, as well as in previous studies that investigated the neural underpinnings of perceptual learning, but that attentional mechanisms cannot fully account for our results.
Furthermore, previous behavioral studies suggest that human perceivers use different object features when judging an object, depending on the nature of the categorization task (e.g., Tanaka et al., 2005; Schyns, Bonnar, & Gosselin, 2002; Schyns et al., 1998). These studies lend support to the hypothesis that the crucial difference might be the level of detail with which the stimuli had to be processed during the training phase (see also Op de Beeck, Beatse, Wagemans, Sunaert, & Van Hecke, 2000). Thus, our results suggest that the effect of categorization training on neural selectivity depends on the level of categorization and the differences in cognitive processing associated with it.
Our study did not provide evidence for higher neural selectivity for “between-category” than for “within-category” differences. These data are in line with other human fMRI (Jiang et al., 2007) and single-cell recording (Op de Beeck et al., 2001) studies showing no task-independent category effects after categorization training. In contrast, Sigala and Logothetis (2002) did report higher selectivity for diagnostic than for nondiagnostic features (but see Tanaka, 2004; Baker et al., 2002). It should be noted in this context that we did not ask our subjects to categorize objects during the test phase. Thus, it might still be possible to observe category-selective responses in the object-selective and frontal cortex when subjects make explicit judgments at a subordinate level (Jiang et al., 2007).
Our results are also relevant for a central topic in studies of visual learning: How much does learning depend on the bottom–up statistical properties of the input (unsupervised learning) relative to feedback information about which stimulus properties are important to look for (supervised learning)? There is a substantial amount of data indicating that unsupervised learning occurs in adults (e.g., Fiser & Aslin, 2001; Purves, Lotto, Williams, Nundy, & Yang, 2001; Rosenthal, Fusi, & Hochstein, 2001). Unsupervised learning as a consequence of mere exposure might also have occurred in our study without our detecting it, because our design did not include a no-exposure baseline condition with which to assess the effect of mere exposure per se. Nevertheless, our study provides a clear illustration of the role that the task context in which objects are encountered plays in object learning. More studies will be needed to determine the relative contributions of unsupervised and supervised learning.
In sum, our study provides the first functional neuroimaging evidence that the level of categorization required during category learning influences the subsequent neural selectivity of the object-selective cortex for fine shape differences between the categorized objects.
This work was supported by research grants from the Fund for Scientific Research (FWO Flanders, G.0218.06) and from the University Research Council (GOA/2005/03-TBA, IDO/02/004, and IMPH/06/GHW). We thank Bart Ons for helpful discussions. We are grateful for the valuable comments made by the two Reviewers, which helped us to strengthen this article.
Reprint requests should be sent to Johan Wagemans, Laboratory of Experimental Psychology, University of Leuven (K.U. Leuven), Tiensestraat 102, B-3000 Leuven, Belgium, or via e-mail: firstname.lastname@example.org.