Some of the brain areas in the ventral temporal lobe, such as the fusiform face area (FFA), are critical for face perception in humans, but what determines this specialization is a matter of debate. The face specificity hypothesis claims that faces are processed in a domain-specific way. Alternatively, the expertise hypothesis states that the FFA is specialized in processing objects of expertise. To disentangle these views, some previous experiments used an artificial class of novel objects called Greebles. These experiments combined a learning and fMRI paradigm. Given the high impact of the results in the literature, we replicated and further investigated this paradigm. In our experiment, eight participants were trained for ten 1-hr sessions at identifying Greebles. We scanned participants before and after training and examined responses in FFA and lateral occipital complex. Most importantly and in contrast to previous reports, we found a neural inversion effect for Greebles before training. This result suggests that people process the “novel” Greebles as faces, even before training. This prediction was confirmed in a postexperimental debriefing. In addition, we did not find an increase of the inversion effect for Greebles in the FFA after training. This indicates that the activity in the FFA for Greebles does not depend on the degree of expertise acquired with the objects but on the interpretation of the stimuli as face-related.
Substantial evidence supports a dissociation between face perception and general object recognition, but what determines this dissociation is a matter of debate. Domain-specific hypotheses attribute this dissociation to the special status of faces per se (Kanwisher, 2000), whereas process-specific hypotheses attribute this dissociation rather to the differential involvement of perceptual or cognitive processes that are not necessarily specific to faces (Tarr & Gauthier, 2000). Here, we implement one of the major paradigms used to argue in favor of process-specific hypotheses: laboratory training with Greebles. We show that the neural signature of this training is actually most consistent with domain-specific hypotheses.
Evidence for dissociations between face perception and generic object recognition comes from multiple paradigms. First, results from behavioral experiments indicate a different style of cognitive processing for faces (holistic processing) than for other objects (part-based processing). For example, effects of orientation inversion are more pronounced for faces than for other objects (Yin, 1969) and parts are particularly integrated into a whole in upright faces (Maurer, Le Grand, & Mondloch, 2002; Tanaka & Sengco, 1997; Tanaka & Farah, 1993). Second, brain injury can produce selective deficits in face recognition while leaving object discrimination intact (Dricot, Sorger, Schiltz, Goebel, & Rossion, 2008; Wada & Yamamoto, 2001) and vice versa (Moscovitch, Winocur, & Behrmann, 1997). Third, brain imaging experiments have found neural responses that are highly specific to faces (Kanwisher & Yovel, 2006; Kanwisher, McDermott, & Chun, 1997). The fusiform face area (FFA), a cortical region in the ventral temporal lobe, is activated at least twice as strongly when subjects view faces than when they view any other class of visual stimuli yet tested (Kanwisher et al., 1997). Several researchers also found a weak inversion effect in the FFA: a stronger activation for upright than for inverted faces (Yovel & Kanwisher, 2004, 2005). Fourth, single unit recordings in monkeys have revealed a clustered population of cells in the temporal cortex that respond selectively to faces (see Tsao & Livingstone, 2008, for an overview), suggesting a role for these cells in face recognition (Afraz, Kiani, & Esteky, 2006).
There are at least two alternative hypotheses that try to account for this dissociation between face and object recognition. The first hypothesis is the face specificity hypothesis, which states that cognitive and neural mechanisms underlying face perception are selectively engaged in perceptually processing faces and play little if any role in the perceptual analysis of nonface stimuli (Robbins & McKone, 2007; Duchaine & Nakayama, 2006; Duchaine, Yovel, Butterworth, & Nakayama, 2006; McKone, Kanwisher, & Duchaine, 2006; McKone & Robbins, 2006; Moore, Cohen, & Ranganath, 2006; Yue, Tjan, & Biederman, 2006; McKone & Kanwisher, 2005; Xu, Liu, & Kanwisher, 2005; Yovel & Kanwisher, 2005; Grill-Spector, Knouf, & Kanwisher, 2004; Rhodes, Byatt, Michie, & Puce, 2004; Carmel & Bentin, 2002; Kanwisher, 2000).
The second hypothesis, the expertise hypothesis, claims that mechanisms for face processing are not engaged only by faces but are also applied in expert within-class discrimination of nonfaces (McGugin & Gauthier, 2010; Curby, Glazek, & Gauthier, 2009; Harley et al., 2009; Wong, Palmeri, & Gauthier, 2009; Gauthier & Bukach, 2006; Gauthier, Curby, Skudlarski, & Epstein, 2005; Xu, 2005; Rossion, Curran, & Gauthier, 2002; Gauthier, Skudlarski, Gore, & Anderson, 2000; Tarr & Gauthier, 2000; Gauthier, Behrmann, & Tarr, 1999; Gauthier & Tarr, 1997). According to this hypothesis, face perception is special because we have a lifetime of experience discriminating individual faces but almost no experience making similar within-class discriminations with other objects. Thus, faces are indeed processed in a special manner, but the mechanisms carrying out that processing can work on other object classes given enough experience. This expertise hypothesis predicts that, when we develop enough expertise in within-class object discrimination, the same mechanisms will be engaged in the processing of faces and of objects-of-expertise. Thereby, experts with nonface objects should also show the hallmarks that usually differentiate face processing from object processing: objects of expertise should also be processed holistically and engage face-specific neural mechanisms.
To test the predictions of the expertise hypothesis, one important approach has been to test laboratory-trained subjects using an artificial class of novel objects called Greebles (see Figure 1). Participants were trained to identify different Greebles over several sessions (8–10 hr of training). Gauthier, Tarr, Anderson, Skudlarski, and Gore (1999) have examined brain activations before, during, and after Greeble training to monitor changes in the neural substrates thought to mediate perceptual expertise, more specifically the FFA. Their results showed that activation in the right FFA for upright minus inverted Greebles increased throughout Greeble training. This indicates that the inversion effect (a hallmark of holistic processing) in the right FFA increases after training. Furthermore, Gauthier and Tarr (2002) also found a positive correlation between FFA activity and holistic processing (behaviorally) across the five subjects.
Although Gauthier, Tarr, et al. (1999) interpreted their data as evidence in favor of the expertise hypothesis, there are some problems with this conclusion. First, they mostly examined changes of brain activation in face-selective regions. Therefore, this study lacks a critical test for the expertise hypothesis: Are the training effects specific to the FFA? Second, Greebles have a high degree of structural similarity to faces and body combinations. Therefore, a small expertise effect (including a positive correlation between FFA activity and holistic processing) for Greebles could reflect a specialized face-processing mechanism learning to stretch its definition of a face rather than a generic expertise effect that could occur for any object class. Third, there are some methodological issues with this study (e.g., the use of only five participants in their fMRI experiment; for an overview, see McKone & Kanwisher, 2005).
Given these concerns, combined with the prominent status and high impact of this earlier report in the literature, we conducted a replication study of the fMRI Greeble training experiment (Gauthier & Tarr, 2002; Gauthier, Tarr, et al., 1999) with some important adjustments. First, we also included the lateral occipital complex (LOC; the area that mainly processes objects) in our analyses to examine if there are any or even stronger changes in brain activation due to training beside the training effects found in FFA. We chose this region because other researchers (Op de Beeck, Baker, DiCarlo, & Kanwisher, 2006; Yue et al., 2006) already found training effects in LOC, but not in FFA. Second, we manipulated the context in which the stimuli were introduced during training. With this manipulation, we tried to influence the interpretation of the stimuli as living things (with faces) or artifacts/tools. A potential role of stimulus interpretation was suggested post hoc by Op de Beeck et al. (2006) because they observed one subject in their study that interpreted the trained stimuli (artificially created stimuli referred to as “smoothies”) as faces (reported in the postexperimental debriefing), and only this subject showed a training-related increase in FFA activity. Here, we manipulated stimulus interpretation experimentally. This manipulation was implemented at the beginning of the behavioral training, after the first scan session (therefore, the data of the first scan session cannot be affected by this manipulation).
The expertise hypothesis predicts a training-induced inversion effect in FFA, but no effect of interpretation (as expertise effects should be found with nonface objects) and does not make clear predictions about other regions, such as LOC. In contrast, the face specificity hypothesis predicts no training effect in FFA (but possible training effects elsewhere in the brain, e.g., LOC) and a possible effect of the interpretation of the stimuli (only inversion effect for stimuli that are interpreted as faces). Unless otherwise mentioned, our design was identical to the design used by Gauthier and colleagues (for the training experiment, Gauthier, Williams, Tarr, & Tanaka, 1997; for the fMRI experiment, Gauthier, Tarr, et al., 1999).
Our results show an inversion effect for Greebles before training, and this inversion effect did not increase after training. Furthermore, all our subjects interpreted the stimuli as living things, independent of the manipulation of interpretation. These findings indicate that the inversion effect for Greebles, before as well as after learning, is related to the interpretation of these novel objects as living things with faces rather than to expertise per se. Therefore, we also conclude that the Greeble set is not a good stimulus set to dissociate the two alternative hypotheses.
Eight students from K.U. Leuven participated in the experiment in return for payment. All experiments were approved by the relevant ethical boards, the ethical committee of the Faculty of Psychology and Educational Sciences (K.U. Leuven), and the committee for medical ethics of the K.U. Leuven.
The stimuli were 30 Greebles (Gauthier & Tarr, 1997), photorealistically rendered 3-D objects that share similar parts in a common spatial configuration. As shown in Figure 1, Greebles can be categorized into five different classes (which we will refer to as “families”) on the basis of the shape of the main body. A Greeble is distinguishable from other members of its own family by the shape of its appendages. The five families and the 30 individual Greebles (6 of each family) were given nonsense word labels (e.g., “Akst,” “Forn”). The Greebles were all rendered with the same gray shade, stippled texture, and overhead lighting direction. Images were about 6.5 cm high × 3.25 cm wide and, when viewed from about 60 cm from the screen, yielded a display area of approximately 6.2° × 3.1° of visual angle. The experiments were run on a Dell computer and a 17-in. CRT screen.
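The reported visual angles follow from the image size and the viewing distance. As a quick check, the following Python sketch applies the standard formula θ = 2·arctan(s / 2d); the function name is ours and is only illustrative.

```python
import math

def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (in degrees) subtended by an object of the given size
    viewed from the given distance: theta = 2 * atan(size / (2 * distance))."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

# Image size and viewing distance as reported in the text.
height_deg = visual_angle_deg(6.5, 60.0)
width_deg = visual_angle_deg(3.25, 60.0)
print(f"{height_deg:.1f} x {width_deg:.1f} degrees")  # prints "6.2 x 3.1 degrees"
```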
Participants were trained following the procedure used by Gauthier et al. (1997). The training procedure required participants to learn and recognize Greebles at both the family and the individual level. Each participant was trained for approximately 9 hr (during ten 1-hr sessions spread out over 2 weeks). Each session included a combination of seven different tasks: family inspection, family categorization, individual inspection, naming with response, naming with feedback, naming, and verification (for a more detailed description, see Gauthier et al., 1997). The participants learned the family labels and individual names for five Greebles in the first session and then learned five more Greebles in each of the following three sessions. Ten Greebles remained unknown throughout the training, which made the tasks more difficult. In the family inspection, individual inspection, naming with response, and naming with feedback tasks, participants only saw Greebles for which they had already learned a name. In the naming and verification tasks, however, all 30 Greebles were shown, even if participants did not know them by name. The correct response for unnamed Greebles in the naming task was to press the space bar. Each Greeble was shown twice in this task, for a total of 60 trials. When unnamed Greebles were seen in the verification task, participants were to respond “same” if they were preceded with a label that was not included during the training phase or “different” if they were preceded by the name of another learned Greeble.
All aspects of the procedure were modeled after Gauthier et al. (1997). In addition, we manipulated the interpretation of the stimuli after the first scan session but before training started. Half of the participants were informed that the stimuli were called Greebles and could be differentiated by family, gender, and individual name. This information, which is similar to the context and labels used by Gauthier et al. (1997), might increase the odds of interpreting the stimuli as living creatures (although the instructions never mentioned explicitly that the stimuli could be living creatures). In contrast, the other half of the participants received different information, indicating that the stimuli were nonliving machines or machine parts. In this condition, we never mentioned the name “Greebles” or the labels “family” and “gender,” because they might induce the interpretation of a living creature. All participants were asked to familiarize themselves with the general shape as well as with the different parts of the objects, so that they were able to differentiate between families as well as individual objects.
After the entire experiment, we conducted a short debriefing and asked the participants how they interpreted the stimuli, whether this interpretation had changed throughout the experiment and how they had solved the training task (how had they discriminated between the different stimuli?).
Data from the verification task were analyzed for trends over the course of training. We combined groups of tests into “bins.” Bin 1 included Verification Tests 1–3 (5 Greebles known), Bin 2 included Verification Tests 4–7 (10 Greebles known), Bin 3 included Tests 8–11 (15 Greebles known), and subsequent bins always included three verification tests. Accuracy and response times, averaged across verification tests in each bin, were analyzed. All response times reported in this article are geometric means calculated on correct trials only.
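The binning and the geometric-mean response times described above can be sketched as follows. This is a minimal illustration, not the original analysis code: the trial format `(test_number, correct, rt_ms)` and all function names are assumptions, and for brevity trials are pooled within a bin rather than first averaged per verification test.

```python
import math
from collections import defaultdict

def bin_for_test(test_number: int) -> int:
    """Map a verification test to its bin: tests 1-3, 4-7, 8-11,
    then three tests per bin (12-14, 15-17, ...)."""
    if test_number <= 3:
        return 1
    if test_number <= 7:
        return 2
    if test_number <= 11:
        return 3
    return 4 + (test_number - 12) // 3

def geometric_mean(values):
    """Geometric mean computed as exp(mean(log(x)))."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def summarize(trials):
    """Return {bin: (accuracy, geometric mean RT on correct trials)}.
    `trials` is an iterable of (test_number, correct, rt_ms); hypothetical format."""
    by_bin = defaultdict(list)
    for test_number, correct, rt_ms in trials:
        by_bin[bin_for_test(test_number)].append((correct, rt_ms))
    summary = {}
    for b, rows in sorted(by_bin.items()):
        correct_rts = [rt for ok, rt in rows if ok]
        accuracy = len(correct_rts) / len(rows)
        rt = geometric_mean(correct_rts) if correct_rts else float("nan")
        summary[b] = (accuracy, rt)
    return summary
```

The geometric mean is less sensitive than the arithmetic mean to the long right tail that is typical of response time distributions.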
The eight participants of the training experiment also participated in two fMRI-scan sessions (one pretraining and one posttraining). Informed consent was obtained from all subjects.
Stimuli and fMRI Task
Participants were scanned once before they had any experience with Greebles and once after they completed the training procedure. Each scan session was very similar to the scan sessions conducted by Gauthier, Tarr, et al. (1999). In short, participants performed a sequential matching task with faces and Greebles in upright or inverted orientation (see Figure 2). There were six sequential matching runs per session. Five stimulus sets, each including eight grayscale faces and eight Greebles of the same family (not used during training), were used (with the order of sets counterbalanced across subjects). Greebles within a block were always from the same family, making family names nondiagnostic. The faces (obtained from Niko Troje and Heinrich Bülthoff, Max Planck Institute, Tübingen, Germany) were scanned in a three-dimensional laser scanner. Each run included two repetitions of a CACBCDCFC cycle where A, B, D, and F were sequential matching epochs of 24 sec showing upright or inverted faces or Greebles (order counterbalanced in different runs) and C was a fixation block of 12 sec. Each epoch included eight trials and showed pairs of stimuli all upright or inverted. Subjects performed same/different identity judgments by pressing one of two buttons. We conducted two localizer runs per session, and each run consisted of four CACBCD cycles where A, B, and D were passive viewing epochs of 20 sec showing faces, everyday objects, or scrambled pictures (order counterbalanced in different runs) and C was a fixation block of 6 sec. Each epoch showed 20 different stimuli.
fMRI Imaging Parameters and Analyses
Images were acquired in a Siemens TRIO TIM Scanner (Department of Radiology of K.U. Leuven) with a 12-channel head coil with an EPI sequence (100 time points per time series or “run,” repetition time = 3 sec, echo time = 30 msec, acquisition matrix = 80 × 80), resulting in a 2.5 × 2.5 mm in-plane voxel size, 50 slices oriented roughly halfway between a coronal and a horizontal orientation and including most of the cortex except the most anterior/superior parts of the frontal and parietal cortex (slice thickness = 2.5 mm, flip angle = 90°, field of view = 200 mm). We also acquired a T1-weighted anatomical image. We used a Barco RLM R6+ projector to present the stimuli. We made sure that head position in the posttraining session was very similar to the position in the pretraining session. Furthermore, posttraining slices were positioned manually to be as close as possible to the slices in the first session by visual comparison of pretraining and posttraining overlays of the slice outlines on the anatomy.
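As a consistency check on the acquisition parameters, the short Python sketch below relates the duration of a sequential matching run (eight 24-sec epochs plus 12-sec fixation blocks) to the number of volumes at a repetition time of 3 sec. Whether the two repetitions of the cycle share the fixation block at their boundary is not stated in the text, so both readings are computed; this is an assumption on our part, and the function name is illustrative.

```python
TR_S = 3.0  # repetition time in seconds, as reported

def run_duration_s(n_task: int, task_s: float, n_fix: int, fix_s: float) -> float:
    """Total run duration for a block design with task and fixation blocks."""
    return n_task * task_s + n_fix * fix_s

# Reading 1 (assumption): the fixation block between the two cycles is shared,
# giving 8 task epochs and 9 fixation blocks.
shared = run_duration_s(8, 24.0, 9, 12.0)
# Reading 2 (assumption): each cycle has its own five fixation blocks,
# giving 8 task epochs and 10 fixation blocks.
separate = run_duration_s(8, 24.0, 10, 12.0)

print(shared / TR_S, separate / TR_S)  # prints "100.0 104.0"
```

Under the first reading, the run lasts 300 sec, which corresponds to the 100 time points per run reported above.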
Data were analyzed using the Statistical Parametric Map software package (SPM5, Wellcome Department of Neuroimaging, London, UK) as well as custom Matlab code. Preprocessing involved realignment to correct for motion, coregistration of functional and anatomical images, segmentation, and spatial normalization. During spatial normalization, functional images were resampled to a voxel size of 2.5 × 2.5 × 2.5 mm. Finally, functional images were spatially smoothed (5 mm FWHM kernel). Statistical modeling of the signal in each voxel in each subject included a general linear model applied to preprocessed images, with four independent variables (one variable for each stimulus condition) and six covariates (the translation and rotation parameters needed for realignment).
After averaging the results for the four localizer runs (for each subject separately), a “face-specific” area in the right hemisphere (the rFFA: t contrast of faces minus objects) and an “object-specific” area (the LOC: t contrast of objects minus scrambled pictures) were identified as the voxels significantly activated at the threshold of p < .0001 (uncorrected for the number of voxels). Anatomically, the selected voxels were always localized around the fusiform gyrus and the lateral and ventral part of the occipito-temporal cortex, respectively. For the analysis of the functional scans (sequential matching runs), the ROIs from the localizer scans were used.
The general linear model was used to compute the response of each voxel in each condition (resulting in “beta” values). The response for each condition in each voxel was normalized by the mean value in each voxel to obtain a measure of percent signal change compared with this mean (this mean can be interpreted as being similar to a fixation baseline given that the fixation blocks were not explicitly modeled in the GLM analysis). The average response across all voxels of an ROI was computed for each individual subject, and this response was combined across subjects by averaging. We also recomputed the results by normalizing the beta values after ROI averaging. This did not yield different results.
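The two normalization orders described above can be illustrated with a minimal Python sketch. It assumes that the "mean value in each voxel" is available as one number per voxel (for example, the constant term of the GLM) and that percent signal change is expressed as 100 × beta / mean; the function names and the example values are hypothetical.

```python
def psc_voxelwise(betas, means):
    """Normalize each voxel's beta by that voxel's mean, then average over the ROI."""
    return sum(100.0 * b / m for b, m in zip(betas, means)) / len(betas)

def psc_roi_first(betas, means):
    """Average betas and means over the ROI first, then normalize."""
    mean_beta = sum(betas) / len(betas)
    mean_signal = sum(means) / len(means)
    return 100.0 * mean_beta / mean_signal

# Hypothetical two-voxel ROI for one condition in one subject.
betas = [2.0, 4.0]
means = [100.0, 400.0]
print(psc_voxelwise(betas, means))  # prints 1.5
print(psc_roi_first(betas, means))  # prints 1.2
```

The two orders can differ when voxel means vary strongly within an ROI, as in this contrived example; in our data, they did not lead to different results.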
Manipulation of Interpretation
The manipulation of the interpretation of the Greebles yielded some interesting results: all eight participants interpreted the Greebles as “faces” or “face-like stimuli.” Some quotations from participants in the “nonliving” condition were as follows (but all eight participants gave similar responses): “I saw the stimuli as different figures, little men with all their own appearances,” “I interpreted the stimuli as faces to make it easier to distinguish between them,” and “I saw the stimuli as figures with eyes, ears, noses and arms.” Clearly, our participants interpreted these stimuli as living creatures although we explicitly tried to avoid this interpretation, more so than in any previous study (in half of our participants). The participants also indicated that these interpretations had not changed since the start of the experiment (the first scan session). Therefore, we abandoned our original goal of manipulating face interpretation. We collapsed the two conditions, treating all subjects as one condition (the “face interpretation” condition). Analyses with this additional between-subject factor did not yield any significant interactions (or trends) including this factor.
Figure 3A shows the mean accuracy for the recognition of Greebles in the verification task. Participants improved during training (F(8, 126) = 39.86, p < .0001), independent of the type of recognition task involved (F(8, 126) = 1.197). This indicates that participants improved in recognizing Greebles, on the individual and on the family level, throughout training.
Group means for response times (correct trials only) in the verification task are shown in Figure 3B. For the response times, we found an improvement throughout training (F(8, 126) = 24.192, p < .001). Furthermore, participants responded significantly more slowly in the individual verification trials than in the family verification trials (F(8, 126) = 20.715, p < .001). Surprisingly, the interaction between task and session was not significant (F(8, 126) = 0.147). This finding contradicts the findings of Gauthier et al. (1997). The arrows in Figure 3B indicate the sessions where there is no significant difference in RT between the individual and family trials. Some previous studies considered their subjects experts and stopped training once a subject reached this criterion (Gauthier & Tarr, 2002). According to this criterion, all our subjects reached the expert status at some point during training (ranging from Sessions 1 to 7, mean of sessions = 4).
Participants were scanned once before they had any experience with Greebles and once after they completed the training procedure. To isolate expert processing, we compared conditions with upright and inverted images for faces and Greebles. The expertise hypothesis predicts that training with upright Greebles would lead to an increase in activation for upright Greebles minus activation for inverted Greebles in the face-specific ROI (FFA) but no comparable change for the (untrained) faces. On the other hand, the face specificity hypothesis does not expect an increase in the inversion effect in the FFA, given that subjects indicated that their interpretation of the Greebles did not change during training. Due to the fact that subjects invariably interpreted the Greebles as containing faces from the start of the experiment, we would predict a significantly larger response for upright than for inverted Greebles in FFA, both in the first and second scan session. An ANOVA was used to test these predictions.
As a control measure, we analyzed the accuracy of the eight subjects in the sequential matching task during the two scan sessions (mean values are given in Figure 4). Subjects made no more than 15% errors in the sequential matching task for any condition or at any point in the experiment. A session by stimulus-type ANOVA revealed that subjects got faster after training in all conditions (F(1, 28) = 31.754, p < .0001). There was no significant interaction effect between Stimulus Type and Session (F(1, 28) = 0.054). This is the same pattern of results as observed by Gauthier, Tarr, et al. (1999).
Activations in FFA
For faces, we found a main effect of Orientation, F(1, 7) = 12.083, p < .05, but no main effect of Session, F(1, 7) = .102, or interaction effect between Session and Orientation, F(1, 7) = 4.355, p = .075 (see Figure 5A). As predicted by both hypotheses, this indicates that there is no training effect for faces in FFA. The significant effect of Orientation indicates an inversion effect for faces in FFA (more activation for upright than for inverted faces).
For Greebles, we found a main effect of Orientation, F(1, 7) = 52.182, p < .001, but no main effect of Session, F(1, 7) < 1, and no interaction effect between Session and Orientation, F(1, 7) < 1. The results are shown in Figure 5B. This is the same pattern of results as observed for faces: There is an inversion effect for Greebles in FFA, but this inversion effect is not affected by training. However, there was already a significant inversion effect in the first scan session (t(7) = 3.117, p = .017), and it is not increasing (if anything, it tends to become smaller). These results are in contradiction with results previously found by Gauthier, Tarr, et al. (1999). The inversion effect observed for Greebles in the first scan session is probably caused by the interpretation of the Greebles as face-like.
We also explored the correlation between the signal change in FFA and the behavioral results. We calculated a behavioral index of change in expertise [(RT for individual trials in Session 1 − RT for family trials in Session 1) − (RT for individual trials in Session 9 − RT for family trials in Session 9)] and an index of change in the inversion effect for Greebles in the FFA before and after training [(activation for upright Greebles in the FFA in Scan Session 1 − activation for inverted Greebles in the FFA in Scan Session 1) − (activation for upright Greebles in the FFA in Scan Session 2 − activation for inverted Greebles in the FFA in Scan Session 2)]. The expertise hypothesis expects this correlation to be positive. Instead, we found a slightly negative correlation, r = −.240.
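The two change indices and their correlation across subjects can be written out as follows. This is a minimal Python sketch with illustrative function names; both indices follow the formulas exactly as written above, so each is the Session 1 difference minus the later-session difference. The Pearson coefficient is computed by hand to keep the example free of external dependencies.

```python
import math

def expertise_change(rt_ind_s1, rt_fam_s1, rt_ind_s9, rt_fam_s9):
    """Behavioral index: (individual - family RT in Session 1)
    minus (individual - family RT in Session 9)."""
    return (rt_ind_s1 - rt_fam_s1) - (rt_ind_s9 - rt_fam_s9)

def inversion_change(up_s1, inv_s1, up_s2, inv_s2):
    """Neural index: (upright - inverted in Scan Session 1)
    minus (upright - inverted in Scan Session 2)."""
    return (up_s1 - inv_s1) - (up_s2 - inv_s2)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

In use, each function would be applied per subject, and `pearson_r` would then be computed over the eight resulting pairs of index values.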
Activations in LOC
Figure 6 shows the activations in LOC for the four conditions in the pretraining and posttraining scan. No differences in activation levels were statistically significant. There tended to be a small drop in activation for faces in Session 2. Although this decrease was not significant, F(1, 7) = 1.030, p = .344, it might be at least part of the explanation of why we found an increase in LOC activity for Greebles relative to faces.
However, differences were noted in relative activation of trained (Greebles) versus untrained stimuli (faces). We calculated various indices comparing Greebles with faces: upright Greebles/upright faces, inverted Greebles/upright faces and inverted Greebles/inverted faces. These indices show a significant change between Session 1 and Session 2 in the relative activation for Greebles compared with faces: more activation for Greebles in the second session than in the first, F(1, 7) = 8.376, p < .05 (see Figure 7). This pattern of results was found for all three different indices. Thus, there is a trend toward a training effect in the LOC, rather than in the FFA.
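For concreteness, the three relative-activation indices and their change between sessions can be sketched in Python as follows. The function names, dictionary keys, and input values are hypothetical and serve only to make the definitions explicit.

```python
def relative_indices(up_greeble, inv_greeble, up_face, inv_face):
    """Ratios of Greeble to face activation in an ROI for one scan session."""
    return {
        "uprightG/uprightF": up_greeble / up_face,
        "invertedG/uprightF": inv_greeble / up_face,
        "invertedG/invertedF": inv_greeble / inv_face,
    }

def index_change(session1, session2):
    """Session 2 minus Session 1 for each index; positive values mean
    relatively more activation for Greebles after training."""
    return {key: session2[key] - session1[key] for key in session1}

# Hypothetical percent-signal-change values for one subject.
pre = relative_indices(1.0, 0.8, 2.0, 1.6)
post = relative_indices(1.2, 1.0, 2.0, 1.6)
change = index_change(pre, post)
```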
These findings are in agreement with Op de Beeck et al. (2006), who also found a training effect in LOC after training their participants with novel object classes (smoothies, cubies, and spikies). However, we did not find a correlation between change in performance during training (performance in Session 9 − performance in Session 1) and change in activation in the LOC (activation for upright Greebles/upright faces in LOC in Scan Session 2 − activation for upright Greebles/upright faces in LOC in Scan Session 1), r = .097. Op de Beeck et al. (2006) did observe such a correlation. This difference is possibly because of a difference in training procedure: Our training was more complex and semantic than the one used by Op de Beeck et al. (2006). Therefore, it is more difficult to find an appropriate behavioral index: There is no straightforward index for discriminability in the current paradigm.
Our results have important implications for interpreting the results of previous research conducted with the Greeble paradigm and the correct conclusions that can be drawn from experiments conducted with this paradigm.
First and most importantly, we find a large inversion effect for Greebles in the FFA in the first scan session. This result indicates that the neural inversion effect for Greebles obtained in previous studies (Gauthier, Tarr, et al., 1999) is possibly because of the interpretation of the stimuli as faces. Secondly, this inversion effect does not increase after training. This finding is most consistent with the face specificity hypothesis.
It is important to note that this conclusion is based on positive, significant effects (a significant inversion effect in the first scan session) and that the failure to confirm the predictions of the expertise hypothesis is not because of a lack of statistical power (our experiments contained eight participants, which is more than the five used in the original study by Gauthier, Tarr, et al., 1999). Importantly, the basic disagreement with the previous study is actually caused by a significant inversion effect in the FFA in the first scan session, an effect that proved to be very robust in our study. However, we would like to acknowledge that our results do not offer conclusive evidence against the expertise hypothesis. It is possible we do not see a training effect in FFA because participants interpret Greebles as faces from the beginning of the experiment, and they are already experts in recognizing faces. However, this does not explain the discrepancy between our study and the study conducted by Gauthier, Tarr, et al. (1999), one of the cornerstones of the evidence in favor of the expertise hypothesis.
In the present study, we have focused on ROI analyses, and we did not address the distribution of training effects across the extrastriate cortex, in particular how training effects relate to the spatial distribution of pretrained object selectivity and face selectivity (the present study was not designed to dissociate these factors). Op de Beeck et al. (2006) applied such a method and found more training effects in the right lateral occipital gyrus than in other regions (e.g., the FFA). In addition, they observed a general increase in activation in the LOC and not in the FFA, which is consistent with our current findings as well as with many other studies in the literature (Gillebert, Op de Beeck, Panis, & Wagemans, 2009; van der Linden, Murre, & van Turennout, 2008; Jiang et al., 2007; Op de Beeck et al., 2006; Grill-Spector, Kushnir, Hendler, & Malach, 2000). Gauthier, Tarr, et al. (1999) also found training effects in other face-selective regions than the FFA (e.g., occipital face area and bilateral FFA), but it now seems that the effects are even more distributed and might in some contrasts even be stronger in non-face-selective cortex than in face-selective cortex.
In addition, our study indicates that in fact Greeble training is not an ideal paradigm to dissociate face-specific and expertise-related processes. People interpret the Greebles as face-like independent of the manipulation of the interpretation. In a study with a patient with visual object agnosia without prosopagnosia, Gauthier, Behrmann, and Tarr (2004) showed that Greebles do not necessarily have to be interpreted as faces. The patient performed poorly on Greebles, indicating that his intact face-specific abilities do not extend to include Greebles. These results suggest that insofar as the patient is relying on face-specific visual processes, these processes do not a priori treat Greebles as faces. However, our data show that normal participants usually (and spontaneously) see these stimuli as faces or face-like.
We do not know why our fMRI results diverge from those of Gauthier, Tarr, et al. (1999). Our original intention was to replicate the inversion effect that emerges with training and then to try to manipulate the interpretation of the stimuli away from faces. As far as we know, this would have been the first replication. However, we failed in both intentions: replicating the original finding and manipulating stimulus interpretation. Both failures probably have the same cause: Subjects interpret the Greebles as living creatures with faces, even when we try to induce another interpretation. Given this information, it is of course no surprise that a massive inversion effect is already present in FFA before training. Given that we modeled our training and scanning methods as closely as possible on the methods described by Gauthier, Tarr, et al. (1999), the most straightforward explanation for the absence of an inversion effect before training in the earlier report is the relatively low number of subjects.
Our behavioral findings are consistent with some reports in the literature, but also in this case, different results have been observed by Gauthier et al. (1997). Although Gauthier and colleagues have considered the response time comparison (equal speed for family level and exemplar level recognition) an important measure of expertise, other researchers have already raised some concerns about this criterion (McKone & Kanwisher, 2005). First, some participants of Gauthier et al. (1997) never reached the expert level or did not show any signs of asymptoting in the last bins, which shows an overlap between their data and ours. Second, our participants seem to be “experts” in Sessions 1, 2, and 8 (see the arrows in Figure 4). Thus, many subjects reach the criterion after very little training. This is also the case for some participants in the article of Gauthier et al. (1997). Furthermore, Gauthier and Tarr (2002) do not examine the group means of response time, but they look at the mean response time per participant. During training, they observe whether the participant reaches the criterion for expertise. Once the participant reaches the expert level, they abort training. All our participants reached this criterion at one point or another in training. Furthermore, our data show that expertise is not stable: Our participants are “experts” in Session 8, but they do not stay experts in Sessions 9 and 10. Third, it is apparent that if the subset of individual Greebles used in our study were more alike than the ones used in the study of Gauthier, Tarr, et al. (1999), the criterion would have been more difficult to meet. Thus, this criterion is strongly influenced by the parameters of the experiment, and so it is a very questionable measure of a qualitative shift in recognition processes.
Most importantly, whatever the status of this behavioral measure of expertise, it is irrelevant for our fMRI findings given that the basic discrepancy between our study and previous work is found before any training, namely the presence of a Greeble inversion effect in FFA in the first scan session.
Finally, although many studies have proposed that laboratory training paradigms provide a valid method to test the relationship between expertise and face selectivity (Gauthier & Tarr, 2002; Gauthier, Tarr, et al., 1999), we fully acknowledge that 10 hr of laboratory training is not the same as a lifetime of experience (see also Duchaine, Dingle, Butterworth, & Nakayama, 2004). The present study does not address the effects of long-term expertise. Two previous studies have shown effects of real-world expertise in the face-selective cortex, using bird and car experts (Xu, 2005; Gauthier et al., 2000). It is not clear what implications our results have for findings with real-world experts. Nevertheless, our study highlights the relevance of at least two important concerns that were not addressed in these studies on long-term expertise. First, training effects are likely to appear also outside face-selective cortex, as we and many before us observed training effects in LOC (Op de Beeck et al., 2006; Yue et al., 2006). In fact, Xu (2005) and Gauthier, Tarr, et al. (1999) mentioned the appearance of spots of activity outside FFA for the expert objects, and very recent findings suggest that these effects of (long-term) expertise are indeed very widespread in visual cortex (Harel, Gilaie-Dotan, Malach, & Bentin, 2010). Without a thorough investigation of activity outside face-selective cortex, findings are not conclusive for any hypothesis claiming that FFA has a selective, “special” role in expertise. Second, similarity of the objects to faces might underlie the location of training effects in the FFA rather than the expert processing per se.
In that respect, bird experts are not a good control group, and only data from car experts seem directly relevant for differentiating the expertise hypothesis from the face specificity hypothesis, provided that the strength of effects in face-selective cortex is compared with the strength of effects in other subregions of the ventral object vision pathway.
We conclude, in line with findings from other approaches such as neuropsychology (Duchaine et al., 2004), other training paradigms (Op de Beeck et al., 2006), and studies investigating long-term expertise (Harel et al., 2010), that the fMRI activity changes related to Greeble training indicate that FFA has no such special role in visual expertise as proposed by the expertise hypothesis. Instead, FFA appears to be activated according to the degree to which stimuli like Greebles are interpreted as face-related, which does not necessarily require any training.
This work was supported by the Research Council of K.U. Leuven (IMPH/06/GHW, CREA/07/004), the Fund for Scientific Research-Flanders (G.0281.06, 1.5.022.08), a Methusalem grant (METH/08/02) from the Flemish Government, a federal research action (IUAP P6/29), and the Human Frontier Science Program (CDA 0040/2008).
Reprint requests should be sent to Hans P. Op de Beeck, Tiensestraat 102, bus 3714, 3000 Leuven, Belgium, or via e-mail: Hans.OpDeBeeck@psy.kuleuven.be.