Abstract

fMRI studies have reported three regions in human ventral visual cortex that respond selectively to faces: the occipital face area (OFA), the fusiform face area (FFA), and a face-selective region in the superior temporal sulcus (fSTS). Here, we asked whether these areas respond to two first-order aspects of the face argued to be important for face perception: face parts (eyes, nose, and mouth) and the T-shaped spatial configuration of these parts. Specifically, we measured the magnitude of response in these areas to stimuli that (i) either contained real face parts, or did not, and (ii) either had veridical face configurations, or did not. The OFA and the fSTS were sensitive only to the presence of real face parts, not to the correct configuration of those parts, whereas the FFA was sensitive to both face parts and face configuration. Further, only in the FFA was the response to configuration and part information correlated across voxels, suggesting that the FFA contains a unified representation that includes both kinds of information. In combination with prior results from fMRI, TMS, MEG, and patient studies, our data illuminate the functional division of labor in the OFA, FFA, and fSTS.

INTRODUCTION

Prior work with fMRI has identified three face-selective regions in occipito-temporal cortex: the fusiform face area (FFA) (Kanwisher, McDermott, & Chun, 1997; McCarthy, Puce, Gore, & Allison, 1997), found in the mid-fusiform gyrus; the occipital face area (OFA) (Gauthier et al., 2000), found in the lateral inferior occipital gyri; and a face-selective region in the posterior part of the superior temporal sulcus that we will refer to here as the “fSTS” (Allison, Puce, & McCarthy, 2000; Hoffman & Haxby, 2000). Despite considerable research, the precise role of each of these regions in face perception remains unclear. Here, we approached this question by asking what aspects of the face stimulus each of these regions is sensitive to: the configuration of the face (i.e., the T-shaped configuration of eyes above nose above mouth), and/or the presence of individual face parts (i.e., eyes, nose, and mouth).

Several prior studies suggest a functional division of labor among the three face-selective regions, with each apparently involved in a different aspect of face perception. Specifically, several studies have supported the hypothesis (Calder & Young, 2005; Haxby, Hoffman, & Gobbini, 2000) that the OFA and the FFA are more involved in recognition of individual identity, whereas the fSTS is more involved in recognition of social information in faces. For example, the FFA and the OFA, but not the fSTS, are correlated trial-by-trial with successful detection and identification of faces (Grill-Spector, Knouf, & Kanwisher, 2004), whereas attention to eye-gaze direction of faces (vs. identity of the same faces) increases the fMRI response of the fSTS but not the FFA or the OFA (Hoffman & Haxby, 2000; see also Winston, Henson, Fine-Goulden, & Dolan, 2004). Further evidence showing that the FFA and the OFA are necessary for determining face identity comes from the fact that the critical lesion site for apperceptive prosopagnosia is in the region of the FFA (Riddoch, Johnston, Bracewell, Boutsen, & Humphreys, 2008; Barton, Press, Keenan, & O'Connor, 2002; Wada & Yamamoto, 2001) and/or OFA (Steeves et al., 2006; Rossion et al., 2003) as defined by anatomical coordinates.

Almost all prior work on these regions has investigated the effect of task or repetition on second-order aspects of face stimuli, that is, how one individual face differs from another. Here, we ask the more basic first-order question of how the response of each region is affected by the mere presence (vs. absence) of basic properties of face stimuli: face parts (eyes, nose, and mouth) and the basic T-shaped face configuration of those parts. Given that face processing is, in large part, automatic (Vuilleumier, 2000), the stimulus manipulations used here may be stronger than the relatively weak (Hoffman & Haxby, 2000) or nonexistent (Yovel & Kanwisher, 2004) effects of task manipulations on face-selective regions tested in prior work. Any differences between face-selective regions in the aspects of face stimuli they respond to should provide clues about the function of those regions. For example, prior evidence linking the FFA to face identification suggests that this region must be sensitive to both face parts and face configurations, as both are relevant to face identification. Consistent with this hypothesis, Yovel and Kanwisher (2004) found strong activation in the FFA when subjects discriminated between faces either on the basis of part appearance or part spacing. On the other hand, the higher response in the fSTS during an eye-gaze direction discrimination task than an identity discrimination task (Hoffman & Haxby, 2000) and the recent report of a patient with a circumscribed STS lesion with a deficit in gaze perception (Akiyama et al., 2006) suggest that this region may be more sensitive to face parts (at least eyes). 
Although the functional profile of the OFA is harder to predict, several lines of evidence suggest that it may be more selective for face parts (Pitcher, Walsh, Yovel, & Duchaine, 2007; McCarthy, Puce, Belger, & Allison, 1999), and for “earlier” representations closer to the physical properties of faces (Rotshtein, Henson, Treves, Driver, & Dolan, 2005; Tanskanen, Nasanen, Montez, Paallysaho, & Hari, 2005).

Here, we measured the response of the OFA, FFA and fSTS to the first-order properties of face stimuli by decomposing face stimuli into two relatively independent and intuitively natural components: face configuration and face parts. Specifically, we measured fMRI responses to face stimuli in which we orthogonally varied whether the images contained: (a) real face parts versus solid black ovals in the corresponding locations; and (b) veridical face configurations versus rearranged nonface configurations (Figure 1). Because external contours (a roughly oval shape with hair on the top and sides) may interact with the processing of internal face features (Sinha & Poggio, 1996), we varied part and configuration information both in the context of whole faces (including external contours) and in versions of the same stimuli from which external contours were removed, leaving only a bounding rectangle around the central face region. During the scan, subjects were instructed to passively view the images to minimize biases toward specific aspects of the face stimuli that may arise in the context of a given face task. Based on previous reports, we predicted that the response of the FFA would depend strongly on the presence of both a veridical face configuration and veridical face parts, whereas the responses of the OFA and the fSTS would depend more strongly on the presence of real face parts in the stimuli.

Figure 1. 

Stimulus manipulations. The 2 × 2 design involved orthogonal manipulation of the presence versus absence of face parts (eyes, nose, and mouth; horizontal axis) and face configurations (the placement of these parts in a face arrangement vs. scrambled arrangement; vertical axis). External contours were either intact or removed.

METHODS

Subjects

Nine subjects (age 18–45; 7 men) participated in the study, which was conducted at the Athinoula A. Martinos Center for Biomedical Imaging at the Massachusetts General Hospital, Charlestown, MA. All subjects were right-handed and had normal or corrected-to-normal vision. The fMRI protocol was approved by both MIT COUHES and the Partners IRB. Informed consent was obtained from all subjects before participation.

Stimuli

Eight stimulus categories (Figure 1) were constructed from veridical face photographs by orthogonally eliminating versus preserving face parts (i.e., eyes, nose, mouth), face configurations, and external contours (i.e., hairline, chin, ears). To vary whether information about face parts and face configuration was present, face parts were either intact, or were replaced by black ovals in their corresponding locations, and face configurations were either left intact, or were rearranged into novel nonface configurations. The size of the ovals was approximately matched to the actual size of corresponding face parts in each face stimulus, and the arrangements of nonface configurations varied across exemplars. The effect of external contours was eliminated by cropping stimuli to show the central face region only. In addition, two other stimulus categories were included for comparison: photographs of houses, and images containing only external contours. There were 50 exemplars from each stimulus category, and each exemplar occurred twice in the whole session (i.e., 100 trials per category).
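The factorial structure of the stimulus set can be sketched in a few lines of Python. This is only an illustration of the design described above: the level labels (for example `real_parts`, `cropped`) are hypothetical names chosen here, not terms used in the stimulus files.

```python
from itertools import product

# Hypothetical labels for the two levels of each manipulated factor.
PARTS = ("real_parts", "black_ovals")
CONFIGURATIONS = ("face_configuration", "scrambled_configuration")
CONTOURS = ("external_contours", "cropped")

# The two additional comparison categories described in the text.
COMPARISON = ("houses", "external_contours_only")

EXEMPLARS_PER_CATEGORY = 50
REPETITIONS_PER_EXEMPLAR = 2


def stimulus_categories():
    """Return the 8 factorial categories followed by the 2 comparison categories."""
    factorial = [" / ".join(levels)
                 for levels in product(PARTS, CONFIGURATIONS, CONTOURS)]
    return factorial + list(COMPARISON)


categories = stimulus_categories()
trials_per_category = EXEMPLARS_PER_CATEGORY * REPETITIONS_PER_EXEMPLAR

for name in categories:
    print(f"{name}: {trials_per_category} trials")
```

Crossing three two-level factors gives the eight manipulated categories, and the two comparison categories bring the total to ten, each with 100 trials over the session.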

Experimental Procedure

Each subject participated in a single session consisting of (1) two blocked-design functional localizer scans, and (2) 10 event-related experimental scans. The localizer scan lasted 5 min and 15 sec and consisted of sixteen 15-sec epochs with fixation periods interleaved. During each epoch, 20 different photographs of a given stimulus category (frontal-view human faces or familiar objects) were shown. Each photograph was presented for 300 msec followed by a blank interval of 450 msec. The experimental scan contained a total of 176 trials, with 10 experimental trials for each of the 10 conditions (8 stimulus manipulations plus house and external-feature-only conditions) plus 76 fixation trials. A new trial began every 1.5 sec. In each trial, an image from the relevant condition (subtending about 6.2° by 6.2° visual angle) was presented at the center of gaze for 300 msec followed by a fixation-only interval of 1.2 sec. The order of conditions was counterbalanced using the optseq2 program (http://surfer.nmr.mgh.harvard.edu/optseq) so that trials from each condition, including the fixation condition (i.e., temporal jitter), were preceded, on average, equally often by trials from each of the other conditions.
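The timing figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers reported in this section; the inference that the remaining localizer time is fixation assumes the reported scan duration contains nothing other than stimulus epochs and fixation periods.

```python
# All durations in milliseconds so the arithmetic stays exact.

# Localizer scan
LOCALIZER_MS = (5 * 60 + 15) * 1000      # 5 min 15 sec
N_EPOCHS = 16
EPOCH_MS = 15 * 1000
PHOTOS_PER_EPOCH = 20
PHOTO_ON_MS = 300
PHOTO_BLANK_MS = 450

epoch_filled_ms = PHOTOS_PER_EPOCH * (PHOTO_ON_MS + PHOTO_BLANK_MS)
stimulus_ms = N_EPOCHS * EPOCH_MS
# Assumption: whatever is not a stimulus epoch is interleaved fixation.
fixation_ms = LOCALIZER_MS - stimulus_ms

# Experimental scan
N_CONDITIONS = 10
TRIALS_PER_CONDITION_PER_RUN = 10
FIXATION_TRIALS_PER_RUN = 76
TRIAL_MS = 1500
N_RUNS = 10

trials_per_run = N_CONDITIONS * TRIALS_PER_CONDITION_PER_RUN + FIXATION_TRIALS_PER_RUN
run_ms = trials_per_run * TRIAL_MS
trials_per_condition_session = TRIALS_PER_CONDITION_PER_RUN * N_RUNS

print("photos fill one epoch:", epoch_filled_ms == EPOCH_MS)
print("localizer fixation time (s):", fixation_ms / 1000)
print("trials per experimental run:", trials_per_run)
print("experimental run duration (s):", run_ms / 1000)
print("trials per condition per session:", trials_per_condition_session)
```

Under these assumptions, the 20 photographs exactly fill each 15-sec epoch, each experimental run contains 176 trials, and the 10 runs yield 100 trials per condition, matching the stimulus counts given earlier.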

In the localizer scan, subjects pressed a button whenever they saw two identical pictures in a row (1-back task). In the experimental scan, they fixated on a black dot continuously present in the center and passively viewed stimuli displayed in a pseudorandom order.

Scanning Procedures and Data Analysis

Scanning was done on a 3-T Siemens Trio scanner, using a custom eight-channel phased-array surface coil (built by Dr. Lawrence Wald), which provided a relatively high spatial resolution and signal-to-noise ratio in posterior brain regions. Fifteen 2-mm-thick (20% skip) near-axial slices were collected (in-plane resolution = 1.4 × 1.4 mm), oriented parallel to each subject's temporal cortex to cover the inferior portion of the occipital lobes as well as the posterior portion of the temporal lobes, including the OFA, the FFA and part of the STS. T2*-weighted, gradient-echo, echo-planar imaging procedures were used (TR = 3 sec, TE = 32 msec, flip angle = 90°). Motion correction, intensity normalization, and spatial smoothing (FWHM = 3 mm) were performed prior to signal averaging (FreeSurfer functional analysis stream, Cortechs, Charlestown, MA).

Face-selective regions of interest (ROIs) were identified separately for each subject and hemisphere from the localizer scan. Specifically, the FFA was defined as the set of contiguous voxels in the mid-fusiform gyrus that showed significantly higher responses to front-view human faces compared to familiar objects (p < 10⁻⁴, uncorrected). The OFA and the fSTS were defined in the same way but localized in inferior occipital cortex and the STS, respectively (Figure 2). Finally, a non-face-selective region, the lateral occipital (LO) region, was also localized by the contrast of objects versus faces. This region was intended to serve as a control for the possible confounding role of attention because stimuli with black ovals may not attract as much attention as those with face parts. For the ROI analysis, percent signal change data were extracted and averaged by condition across all 10 runs and all voxels within each subject's predefined ROIs. Because the fMRI response typically lags 4 to 6 sec after the neural response, the magnitude of the ROI activity was measured as the average percentage change in MR signal at latencies of 6 and 9 sec (TR = 3 sec) relative to the fixation baseline.
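The response measure described above reduces to averaging two samples of an event-related time course. The following is a minimal sketch, assuming the time course is already expressed as percent signal change relative to fixation and that sample 0 coincides with stimulus onset; the function name `roi_response` and the example values are made up for illustration.

```python
TR_S = 3                 # repetition time in seconds
LATENCIES_S = (6, 9)     # post-stimulus latencies that are averaged


def roi_response(timecourse_psc, tr_s=TR_S, latencies_s=LATENCIES_S):
    """Mean percent signal change at the given latencies.

    timecourse_psc[i] is assumed to be the ROI-averaged percent signal
    change i * tr_s seconds after stimulus onset, relative to fixation.
    """
    samples = []
    for latency in latencies_s:
        if latency % tr_s != 0:
            raise ValueError("latency must fall on a sampled time point")
        samples.append(timecourse_psc[latency // tr_s])
    return sum(samples) / len(samples)


# Made-up time course sampled at 0, 3, 6, 9, 12, and 15 sec.
example_timecourse = [0.0, 0.2, 0.9, 0.7, 0.3, 0.1]
response = roi_response(example_timecourse)
print(f"ROI response: {response:.2f}% signal change")
```

With a TR of 3 sec, the 6- and 9-sec latencies correspond to the third and fourth samples, which bracket the expected peak of the hemodynamic response.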

Figure 2. 

Face-selective regions, the OFA, FFA, and fSTS, from an fMRI localizer scan (p < 10⁻⁴, uncorrected) in the right hemisphere of a typical subject, shown on a flattened surface. Sulci are shown in dark gray and gyri in light gray. Talairach coordinates and statistics for each ROI from a group analysis (right hemisphere): OFA (BA 18), coordinates = 46, −78, −7, voxel number = 118, max t value = 4.6; FFA (BA 37), coordinates = 43, −53, −12, voxel number = 71, max t value = 3.9; fSTS (BA 22), coordinates = 60, −51, 9, voxel number = 259, max t value = 6.3.

In addition to the traditional ROI-based analysis, we also employed multivoxel pattern analysis (O'Toole et al., 2007; Haynes & Rees, 2006; Norman, Polyn, Detre, & Haxby, 2006) to examine whether face configurations and face parts were represented by the same or different sets of neuronal populations in the previously defined ROIs. Rather than pooling the responses across voxels, we measured the sensitivity of each voxel to the configural effect (indexed by the t value for that voxel comparing conditions with vs. without face configurations) and the featural effect (indexed by the t value for that voxel comparing conditions with vs. without parts). For this analysis, the data were first re-preprocessed without spatial smoothing so as to maximize the sensitivity to any information present in the spatial pattern of response. Finally, the correlation across voxels between these two sets of t values was calculated to measure the similarity of the spatial activation patterns of these two effects.
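The logic of this voxelwise analysis can be illustrated with simulated data. The sketch below is not the analysis stream used in the study: it assumes a simple pooled-variance two-sample t statistic per voxel, and the simulated voxels (in which both effects scale with a shared, hypothetical `gain`) exist only to give the functions something to operate on.

```python
import math
import random
from statistics import mean, variance


def t_value(with_feature, without_feature):
    """Two-sample t statistic (pooled-variance form) for a single voxel."""
    n1, n2 = len(with_feature), len(without_feature)
    pooled = ((n1 - 1) * variance(with_feature)
              + (n2 - 1) * variance(without_feature)) / (n1 + n2 - 2)
    return (mean(with_feature) - mean(without_feature)) / math.sqrt(
        pooled * (1 / n1 + 1 / n2))


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_voxel(rng, gain, n_obs=20, noise_sd=0.5):
    """Simulated responses for the 2 x 2 (parts x configuration) cells.

    Assumption for the demo only: one gain drives both effects.
    """
    return {(parts, config): [gain * (parts + config) + rng.gauss(0, noise_sd)
                              for _ in range(n_obs)]
            for parts in (0, 1) for config in (0, 1)}


def voxel_effects(cells):
    """Return (configural t, featural t) for one voxel, pooling over the other factor."""
    configural = t_value(cells[(0, 1)] + cells[(1, 1)], cells[(0, 0)] + cells[(1, 0)])
    featural = t_value(cells[(1, 0)] + cells[(1, 1)], cells[(0, 0)] + cells[(0, 1)])
    return configural, featural


rng = random.Random(0)
n_voxels = 60
effects = [voxel_effects(simulate_voxel(rng, gain=v / n_voxels))
           for v in range(n_voxels)]
configural_t = [c for c, _ in effects]
featural_t = [f for _, f in effects]
pattern_similarity = pearson_r(configural_t, featural_t)
print(f"correlation of configural and featural t maps: r = {pattern_similarity:.2f}")
```

Because the simulated voxels share a single gain, the two t maps correlate positively; voxels that carried only one kind of information would instead pull the correlation toward zero.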

RESULTS

The FFA and the OFA were successfully localized in both hemispheres of all subjects. Analysis of the data from the experimental scan showed no difference in the pattern of response between the right and left hemispheres for each of the ROIs (ps > .1). Therefore, analysis of these two regions was based on a pooled analysis in which right- and left-hemisphere voxels were combined in each ROI. The right fSTS was localized in all nine subjects, whereas the left fSTS was found in only four subjects. Therefore, the analysis of the fSTS was carried out in the right fSTS only. Having identified the ROIs in each subject, we then calculated the magnitude of the BOLD response in each of these ROIs in each of the conditions of the main experiment. These data were used to examine what stimulus information each of the three ROIs is sensitive to.

The magnitude of the response of each ROI to each stimulus condition (Supplementary Figure 1) was analyzed in a four-way ANOVA, where the factors were face-selective cortical region (OFA vs. FFA vs. fSTS), face parts (real vs. black ovals), face configurations (veridical vs. scrambled), and external contours (external contours vs. square cutouts). This ANOVA found significant main effects of cortical region [F(2, 8) = 28.0, p < .001], face parts [F(1, 8) = 180.2, p < .001], and external contours [F(1, 8) = 53.6, p < .001]. The significant interactions of cortical region by face configuration [F(2, 7) = 12.7, p < .005], by face parts [F(2, 7) = 28.0, p < .001], and by external contours [F(2, 7) = 21.8, p < .001] indicated that the amount of information about each of these dimensions differs across ROIs. No other main effects, or two-way, three-way, or four-way interactions reached significance (all ps > .1). In addition, these predefined ROIs also showed a significantly higher response to faces than nonfaces (i.e., houses) in the experimental scans (ps < .01), indicating that the localizer-defined face-selective regions were reliably face selective.

The significant main effect of face parts was also found individually in each region, indicating that all three face-selective regions are sensitive to face parts. Specifically, FFA [F(1, 8) = 426.7, p < .001], OFA [F(1, 8) = 89.0, p < .001], and fSTS [F(1, 8) = 31.1, p < .001] responses were each independently significantly higher when real face parts rather than black ovals were present (Figure 3, left). This featural effect was observed independent of whether the face configuration was present [FFA: F(1, 8) = 95.7, p < .001; OFA: F(1, 8) = 22.0, p < .002; fSTS: F(1, 8) = 23.1, p < .001] or absent [FFA: F(1, 8) = 55.7, p < .001; OFA: F(1, 8) = 54.8, p < .001; fSTS: F(1, 8) = 9.1, p < .02]. Nonetheless, the sensitivity to face parts varies across face-selective regions as the featural effect was significantly higher in both the FFA and the OFA than that in the fSTS [FFA vs. fSTS: F(1, 8) = 45.6, p < .001; OFA vs. fSTS: F(1, 8) = 22.3, p < .002; FFA vs. OFA: F < 1]. Finally, the higher response to parts is unlikely to reflect an overall attentional bias favoring stimuli with parts because the higher response to parts was not found in the non-face-selective region LO (F < 1).

Figure 3. 

Responses of the FFA, the OFA, and the fSTS to face parts and configurations. Featural effect (left): The stimuli were pooled by the presence (black) versus absence (gray) of face parts. The hemodynamic responses of the face-selective regions were averaged across voxels, stimulus categories, and subjects. The y-axis indicates the percent signal change. The error bars show the standard error of the mean of the BOLD responses across subjects. Configural effect (right): The stimuli were pooled by whether stimuli contained veridical face configurations or not.

Much like the responses to face parts, the FFA [F(1, 8) = 37.4, p < .001], OFA [F(1, 8) = 70.8, p < .001], and fSTS [F(1, 8) = 23.5, p < .001] also showed a greater response to the stimuli with external contours compared to stimuli from which the external contours were cropped (Supplementary Figure 2). In addition, the sensitivity to external contours was the highest in the FFA [FFA vs. OFA: F(1, 8) = 6.3, p < .05], and the lowest in the fSTS [OFA vs. fSTS: F(1, 8) = 15.4, p < .002].

However, only the FFA, not the OFA or the fSTS, is sensitive to face configurations. This was revealed by the previously mentioned significant interaction of cortical region by face configuration, along with a significant configural effect for the FFA [F(1, 8) = 11.8, p < .01], but not for the OFA or the fSTS (both Fs < 1) (Figure 3, right). This interaction of cortical region by face configuration was observed independent of whether the face parts were present [FFA vs. OFA: F(1, 8) = 17.3, p < .001] or absent [FFA vs. OFA: F(1, 8) = 12.0, p < .005]. In addition, no significant interaction of face configurations by face parts in the FFA was observed [F(1, 8) < 1], consistent with the fact that the FFA showed a significantly larger response to stimuli with face configurations regardless of whether face parts were present [F(1, 8) = 5.7, p < .05] or absent [F(1, 8) = 11.3, p < .01].

Are face parts and face configurations represented by a single neural population in the FFA, or does the FFA contain distinct neural populations, one responsive only to face parts and another responsive only to face configurations? To test these two alternatives, we used multivoxel pattern analysis to compare the similarity between the spatial activation pattern of face configurations and that of face parts. Specifically, a t value was extracted for the configural effect (stimuli with face configurations minus stimuli without) and the featural effect (stimuli with face parts minus stimuli without) for each voxel within the FFA in each subject, and then a correlation between these two sets of t values was calculated, separately for each subject. The correlation between the right FFA's spatial activation patterns for the two effects in a representative subject is shown in Figure 4A (r = .29, p < .001). To check further the distribution of r values against chance, we used a permutation test that randomly shuffled the pairings between the configural and featural effects for each voxel 1000 times and computed a correlation coefficient each time. This gave us random distributions of r values for each ROI, which enabled us to measure the difference between the observed r value and the mean of the shuffled r values in Z-score units. The Z score of the r value for this subject shown in Figure 4B was 7.8, which was significantly different from the mean value (r = 0) of the correlation coefficients from the permutation test (p < .001). Eight out of nine subjects showed significant Z scores (evaluated at p < .02) in the FFA, which was significantly greater than chance (χ² = 5.4, p < .02), suggesting that the FFA not only is sensitive to both face configurations and face parts, but also is engaged in integrating them into a single holistic representation. Similar correlations between configural and featural effects among voxels were not observed in the OFA and the fSTS (ps > .3).
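The permutation procedure and the group-level count can be sketched as follows. The voxel values are simulated, and the function names are illustrative. The chi-square helper rests on an assumption: it treats the group test as a one-degree-of-freedom goodness-of-fit test against an even split of subjects, which the text does not state explicitly, although this reading reproduces the reported value.

```python
import math
import random
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_z(configural_t, featural_t, n_permutations=1000, seed=0):
    """Observed r and its Z score relative to shuffled voxel pairings."""
    rng = random.Random(seed)
    observed = pearson_r(configural_t, featural_t)
    shuffled = list(featural_t)
    null_r = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)          # break the voxel-to-voxel pairing
        null_r.append(pearson_r(configural_t, shuffled))
    return observed, (observed - mean(null_r)) / stdev(null_r)


def chi_square_even_split(n_significant, n_total):
    """Chi-square (df = 1) and p value against an assumed 50/50 split."""
    expected = n_total / 2
    n_other = n_total - n_significant
    chi2 = ((n_significant - expected) ** 2
            + (n_other - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail for df = 1
    return chi2, p_value


# Simulated t values for 100 voxels with a shared component (illustration only).
rng = random.Random(1)
configural_t = [rng.gauss(0, 1) for _ in range(100)]
featural_t = [c + rng.gauss(0, 1) for c in configural_t]

observed_r, z_score = permutation_z(configural_t, featural_t)
chi2, p_value = chi_square_even_split(8, 9)
print(f"observed r = {observed_r:.2f}, Z = {z_score:.1f}")
print(f"chi-square = {chi2:.1f}, p = {p_value:.3f}")
```

Shuffling one set of t values leaves each effect's distribution unchanged while destroying the voxel-to-voxel pairing, so the shuffled correlations approximate what would be expected if the two effects were spatially unrelated.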

Figure 4. 

The correlation of the configural and featural effects in the FFA from a typical subject. (A) The featural effect was significantly correlated with the configural effect indexed by t values across voxels in both the right (r = .29) and left (r = .24, not shown) FFA. (B) Distribution of correlation coefficients for randomly shuffled pairings of the configural and featural effects across voxels pooled from both the right and left FFA. The vertical dotted line indicates the mean value (r = 0) of the correlation coefficients from the permutation test, and the solid line indicates the observed correlation coefficient.

DISCUSSION

In this study, we measured the extent to which two first-order aspects of face stimuli, the presence of face parts and the configuration of those parts, drive three face-selective regions in occipito-temporal cortex, namely, the OFA, the FFA, and the fSTS. We find that the OFA and the fSTS are sensitive to the presence of face parts in the stimulus but not to the presence of a veridical face configuration, whereas the FFA is sensitive to both kinds of information. Further, only in the FFA is the response to configuration and part information correlated across voxels, implying that the FFA contains a unified representation that includes both kinds of information. These results dovetail interestingly with a number of recent findings that collectively help specify the functional division of labor and connectivity between the different face-selective regions in the human ventral visual pathway.

Our results fit most strikingly with those of Pitcher et al. (2007), who applied TMS to the right OFA and selectively disrupted subjects' ability to discriminate faces on the basis of differences in face parts, but not differences in the spacing among those parts. Importantly, that study found that this disruption occurred at a relatively early latency, only when TMS was applied 60 and 100 msec after stimulus onset, not later. Although this study manipulated the second-order property of face parts (i.e., the shape of face parts), whereas we measured the first-order property (i.e., the presence vs. absence of face parts), it clearly reinforces our finding that the OFA is involved in the analysis of face parts but not face configurations, and further shows that the OFA is not only activated by but necessary for the analysis of face parts (see also Rossion et al., 2003), and that it conducts a relatively early stage of face processing.

The idea that the OFA conducts an earlier stage of face processing than the FFA is consistent with its location posterior to the FFA as well as the fact that its response is more biased to the contralateral hemifield than that of the FFA (Hemond, Kanwisher, & Op de Beeck, 2007). This idea is further consistent with the hypothesis that the OFA is the source of the face-selective M100 response, whereas the FFA is the source of the face-selective M170 response (Liu, Harris, & Kanwisher, 2002; Halgren, Raij, Marinkovic, Jousmaki, & Hari, 2000; but see McCarthy et al., 1999; Bentin, Allison, Puce, Perez, & McCarthy, 1996 for evidence that the FFA may not be the source of the N170 measured with ERPs). That hypothesis, if correct, provides another link to the present data because an MEG experiment parallel to that described here (Liu et al., 2002) found that the face-selective M100 response is sensitive only to face parts, whereas the M170 is sensitive to both face parts and face configurations.

In contrast to the OFA, the FFA responded both to the presence of face parts and to their veridical configuration. This finding is generally consistent with previous studies that have shown a link between the neural activity of the FFA and behavioral signatures of the holistic processing of faces, including the face-inversion effect (Mazard, Schiltz, & Rossion, 2006; Yovel & Kanwisher, 2005; Gauthier, Tarr, Anderson, Skudlarski, & Gore, 1999; Haxby et al., 1999; Kanwisher, Tong, & Nakayama, 1998; but see Aguirre, Singh, & D'Esposito, 1999), the composite effect (Schiltz & Rossion, 2006), and the discrimination of the fine distance among face parts (Rotshtein, Geng, Driver, & Dolan, 2007; Yovel & Kanwisher, 2004; Barton et al., 2002; but see Maurer et al., 2007). However, our findings of a similar response to face configurations and face parts in the right and left FFAs do not fit straightforwardly with the findings of a previous study (Rossion et al., 2000), which found a higher response in the region of the right FFA when subjects attended to a whole face relative to a face part and the opposite pattern in the region of the left FFA. Although the reason for this discrepancy is not clear, the paradigms are quite different, with Rossion et al. (2000) relying on an attention/task manipulation, and our study relying on a stimulus manipulation. Other studies have found both FFAs to be involved in processing configural information of faces, with the right FFA showing either no higher engagement (Yovel & Kanwisher, 2005; Haxby et al., 1999) or only a slight preference for configural processing (Mazard et al., 2006; Schiltz & Rossion, 2006).

Thus, considerable converging evidence suggests that the OFA conducts an early analysis of faces that is necessary for the perception of face parts, and the FFA conducts a later analysis of both the parts and configurations of faces (see also Calder & Young, 2005; Haxby et al., 2000). An obvious extension of the above conclusion is the speculation that the FFA and the OFA comprise a hierarchical network for face perception with the FFA inheriting the part sensitivity of the OFA, and then further integrating or elaborating this information to include sensitivity to the spatial configuration of these parts. This hierarchical representation of faces seems to reflect a widely held view, according to which global properties of an object are represented by the activity of neurons that receive convergent input from populations of neurons that encode relatively simple and local features of the object at lower levels in the hierarchy (Lerner, Hendler, Ben-Bashat, Harel, & Malach, 2001; Tanaka, 1996; Barlow, 1972). Consistent with this hypothesis, high spatial frequencies have been found to be critical for identifying face parts, whereas low spatial frequencies are more important for discriminating spacing differences between those parts (Goffaux, Hault, Michel, Vuong, & Rossion, 2005), suggesting that the OFA and the FFA may process faces at different scales, with local properties of faces extracted in the OFA and more global properties then processed in the FFA (see also Tanskanen et al., 2005).

Although the hierarchical hypothesis is plausible, several considerations suggest that the OFA is not the only input to the FFA. First, given that the OFA is responsive only to face parts and not to face-like configurations of ovals, it cannot be the only input to the FFA, which responded to face-like configurations of ovals in the present study. That is, the representation of face parts in the OFA is not a necessary intermediate step for the representation of face configurations in the FFA. Second, both intact discrimination of face spacing (Pitcher et al., 2007) and activation of the FFA (Steeves et al., 2006; Rossion et al., 2003) can occur when the OFA is disrupted. Thus, although the OFA likely sends input to the FFA, the FFA likely receives input from other areas as well (see also Barbeau et al., 2008; Allison, Puce, Spencer, & McCarthy, 1999; Halgren et al., 1994).

Our further finding that the response to face parts and face configurations is correlated across voxels in the FFA is consistent with several other lines of evidence for an integrated, holistic representation that contains both kinds of information. First, Yovel and Kanwisher (in press) have found high correlations across subjects in performance discriminating face parts and face spacing, but only for upright faces, consistent with the idea that a common mechanism is engaged in the analysis of both kinds of information. Second, although Schiltz and Rossion (2006) found evidence for holistic representations in both the OFA and the FFA using “composite” face stimuli, this effect was largest for the FFA. Third, recent work manipulated binocular disparity of stimuli so that faces were perceived either as wholes or parts and again found that, although both the OFA and the FFA have both holistic and part-based representations, only in the FFA are such representations modulated by familiarity (Harris & Aguirre, 2008). These lines of evidence indicate that information about different aspects of the face (parts and configuration) is ultimately integrated into a single holistic representation of the face, and this holistic representation may reside in the FFA (see also Rotshtein et al., 2007).

Much like the OFA, the fSTS was sensitive only to face parts, not face configurations. This finding is generally consistent with prior findings implicating this region in the discrimination of gaze direction and expression. However, the OFA and the fSTS may extract different aspects of information from face parts. For example, a recent fMRI study reported that the fSTS was activated by the directional information from eye gaze but not by the physical properties of the eyes, whereas the OFA showed the opposite pattern (Materna, Dicke, & Thier, 2008). On the other hand, gaze and expression are presumably perceived more clearly in faces with veridical configurations than in faces with rearranged parts, which seems inconsistent with our finding that the fSTS was sensitive only to face parts. One possibility is that the STS regions previously implicated in the perception of gaze and expression may not be identical to the region tested here (see also Winston et al., 2004).

Finally, our finding that all three face-selective regions are sensitive to the external contours of faces suggests that this aspect of faces is also used in constructing face representations at different stages of face processing. Indeed, when fine-grained details of the internal face features are missing, the coarse information of external features may help to detect faces among objects (see also Cox, Meyers, & Sinha, 2004). Further, a change in the external contours can alter our identification of a face even when the internal features remain constant (Sinha & Poggio, 1996; Haig, 1986). Alternatively, the sensitivity of the face-selective regions to external contours may be a remnant of the critical role that these contours play in face recognition by infants and children, a role that becomes less important with age relative to that of the internal features (Want, Pascalis, Coleman, & Blades, 2003; Campbell & Tuck, 1995; Campbell, Walker, & Baron-Cohen, 1995). Further investigations are necessary to reveal the functional role of the external contours of faces in these three face-selective regions.

In sum, the current data dovetail with prior results from fMRI, TMS, MEG, and patient studies to indicate that the OFA is selectively involved in perceptual processing of face parts, whereas the FFA processes both the parts and configurations of faces. Although a feedforward model from the OFA to the FFA and the fSTS is likely to be part of the story (e.g., Rotshtein et al., 2007), several mysteries remain for future investigation. First, how can activation of the FFA and processing of face spacing be preserved when the OFA is disrupted? Second, what is the role of the anterior temporal lobe in face perception (Behrmann, Avidan, Gao, & Black, 2007; Kriegeskorte, Formisano, Sorger, & Goebel, 2007)? Third, by what route does face information reach the fSTS? These and other mysteries will require better methods for tracking the responses of each of these regions over time to discover exactly how information flows in this face processing network.

Acknowledgments

We thank K. Nakayama, P. Sinha, J. DiCarlo, W. Freiwald, and G. Yovel for helpful discussions and comments. This study is supported by the National Institutes of Health (66696 to N. K.), National Eye Institute (EY13455 to N. K.), National Natural Science Foundation of China (30325025 to J. L.), and National Key Basic Research Development Program of China (2007CB516703 to J. L.).

Reprint requests should be sent to Jia Liu, Room 405, Yingdong Building, 19 Xinjiekouwai St., Haidian District, 100875, Beijing, China, or via e-mail: liujia@bnu.edu.cn.

REFERENCES

Aguirre, G. K., Singh, R., & D'Esposito, M. (1999). Stimulus inversion and the responses of face and object-sensitive cortical areas. NeuroReport, 10, 189–194.

Akiyama, T., Kato, M., Muramatsu, T., Saito, F., Nakachi, R., & Kashima, H. (2006). A deficit in discriminating gaze direction in a case with right superior temporal gyrus lesion. Neuropsychologia, 44, 161–170.

Allison, T., Puce, A., & McCarthy, G. (2000). Social perception from visual cues: Role of the STS region. Trends in Cognitive Sciences, 4, 267–278.

Allison, T., Puce, A., Spencer, D. D., & McCarthy, G. (1999). Electrophysiological studies of human face perception: I. Potentials generated in occipitotemporal cortex by face and non-face stimuli. Cerebral Cortex, 9, 415–430.

Barbeau, E. J., Taylor, M. J., Regis, J., Marquis, P., Chauvel, P., & Liegeois-Chauvel, C. (2008). Spatio temporal dynamics of face recognition. Cerebral Cortex, 18, 997–1009.

Barlow, H. B. (1972). Single units and sensation: A neuron doctrine for perceptual psychology? Perception, 1, 371–394.

Barton, J. J., Press, D. Z., Keenan, J. P., & O'Connor, M. (2002). Lesions of the fusiform face area impair perception of facial configuration in prosopagnosia. Neurology, 58, 71–78.

Behrmann, M., Avidan, G., Gao, F., & Black, S. (2007). Structural imaging reveals anatomical alterations in inferotemporal cortex in congenital prosopagnosia. Cerebral Cortex, 17, 2354–2363.

Bentin, S., Allison, T., Puce, A., Perez, E., & McCarthy, G. (1996). Electrophysiological studies of face perception in humans. Journal of Cognitive Neuroscience, 8, 551–565.

Calder, A. J., & Young, A. W. (2005). Understanding the recognition of facial identity and facial expression. Nature Reviews Neuroscience, 6, 641–651.

Campbell, R., & Tuck, M. (1995). Recognition of parts of famous-face photographs by children: An experimental note. Perception, 24, 451–456.

Campbell, R., Walker, J., & Baron-Cohen, S. (1995). The development of differential use of inner and outer face features in familiar face identification. Journal of Experimental Child Psychology, 59, 196–210.

Cox, D., Meyers, E., & Sinha, P. (2004). Contextually evoked object-specific responses in human visual cortex. Science, 304, 115–117.

Gauthier, I., Tarr, M. J., Anderson, A. W., Skudlarski, P., & Gore, J. C. (1999). Activation of the middle fusiform “face area” increases with expertise in recognizing novel objects. Nature Neuroscience, 2, 568–573.

Gauthier, I., Tarr, M. J., Moylan, J., Skudlarski, P., Gore, J. C., & Anderson, A. W. (2000). The fusiform “face area” is part of a network that processes faces at the individual level. Journal of Cognitive Neuroscience, 12, 495–504.

Goffaux, V., Hault, B., Michel, C., Vuong, Q. C., & Rossion, B. (2005). The respective role of low and high spatial frequencies in supporting configural and featural processing of faces. Perception, 34, 77–86.

Grill-Spector, K., Knouf, N., & Kanwisher, N. (2004). The fusiform face area subserves face perception, not generic within-category identification. Nature Neuroscience, 7, 555–562.

Haig, N. D. (1986). Exploring recognition with interchanged facial features. Perception, 15, 235–247.

Halgren, E., Baudena, P., Heit, G., Clarke, J. M., Marinkovic, K., & Clarke, M. (1994). Spatio-temporal stages in face and word processing: I. Depth-recorded potentials in the human occipital, temporal and parietal lobes [corrected]. Journal of Physiology Paris, 88, 1–50.

Halgren, E., Raij, T., Marinkovic, K., Jousmaki, V., & Hari, R. (2000). Cognitive response profile of the human fusiform face area as determined by MEG. Cerebral Cortex, 10, 69–81.

Harris, A., & Aguirre, G. K. (2008). The representation of parts and wholes in face-selective cortex. Journal of Cognitive Neuroscience, 20, 863–878.

Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural system for face perception. Trends in Cognitive Sciences, 4, 223–233.

Haxby, J. V., Ungerleider, L. G., Clark, V. P., Schouten, J. L., Hoffman, E. A., & Martin, A. (1999). The effect of face inversion on activity in human neural systems for face and object perception. Neuron, 22, 189–199.

Haynes, J. D., & Rees, G. (2006). Decoding mental states from brain activity in humans. Nature Reviews Neuroscience, 7, 523–534.

Hemond, C. C., Kanwisher, N. G., & Op de Beeck, H. P. (2007). A preference for contralateral stimuli in human object- and face-selective cortex. PLoS ONE, 2, e574.

Hoffman, E. A., & Haxby, J. V. (2000). Distinct representations of eye gaze and identity in the distributed human neural system for face perception. Nature Neuroscience, 3, 80–84.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302–4311.

Kanwisher, N., Tong, F., & Nakayama, K. (1998). The effect of face inversion on the human fusiform face area. Cognition, 68, B1–B11.

Kriegeskorte, N., Formisano, E., Sorger, B., & Goebel, R. (2007). Individual faces elicit distinct response patterns in human anterior temporal cortex. Proceedings of the National Academy of Sciences, U.S.A., 104, 20600–20605.

Lerner, Y., Hendler, T., Ben-Bashat, D., Harel, M., & Malach, R. (2001). A hierarchical axis of object processing stages in the human visual cortex. Cerebral Cortex, 11, 287–297.

Liu, J., Harris, A., & Kanwisher, N. (2002). Stages of processing in face perception: An MEG study. Nature Neuroscience, 5, 910–916.

Materna, S., Dicke, P. W., & Thier, P. (2008). Dissociable roles of the superior temporal sulcus and the intraparietal sulcus in joint attention: A functional magnetic resonance imaging study. Journal of Cognitive Neuroscience, 20, 108–119.

Maurer, D., O'Craven, K. M., Le Grand, R., Mondloch, C. J., Springer, M. V., Lewis, T. L., et al. (2007). Neural correlates of processing facial identity based on features versus their spacing. Neuropsychologia, 45, 1438–1451.

Mazard, A., Schiltz, C., & Rossion, B. (2006). Recovery from adaptation to facial identity is larger for upright than inverted faces in the human occipito-temporal cortex. Neuropsychologia, 44, 912–922.

McCarthy, G., Puce, A., Belger, A., & Allison, T. (1999). Electrophysiological studies of human face perception: II. Response properties of face-specific potentials generated in occipitotemporal cortex. Cerebral Cortex, 9, 431–444.

McCarthy, G., Puce, A., Gore, J. C., & Allison, T. (1997). Face-specific processing in the human fusiform gyrus. Journal of Cognitive Neuroscience, 9, 605–610.

Norman, K. A., Polyn, S. M., Detre, G. J., & Haxby, J. V. (2006). Beyond mind-reading: Multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10, 424–430.

O'Toole, A. J., Jiang, F., Abdi, H., Penard, N., Dunlop, J. P., & Parent, M. A. (2007). Theoretical, statistical, and practical perspectives on pattern-based classification approaches to the analysis of functional neuroimaging data. Journal of Cognitive Neuroscience, 19, 1735–1752.

Pitcher, D., Walsh, V., Yovel, G., & Duchaine, B. (2007). TMS evidence for the involvement of the right occipital face area in early face processing. Current Biology, 17, 1568–1573.

Riddoch, M. J., Johnston, R. A., Bracewell, R. M., Boutsen, L., & Humphreys, G. W. (2008). Are faces special? A case of pure prosopagnosia. Cognitive Neuropsychology, 25, 3–26.

Rossion, B., Caldara, R., Seghier, M., Schuller, A. M., Lazeyras, F., & Mayer, E. (2003). A network of occipito-temporal face-sensitive areas besides the right middle fusiform gyrus is necessary for normal face processing. Brain, 126, 2381–2395.

Rossion, B., Dricot, L., Devolder, A., Bodart, J. M., Crommelinck, M., De Gelder, B., et al. (2000). Hemispheric asymmetries for whole-based and part-based face processing in the human fusiform gyrus. Journal of Cognitive Neuroscience, 12, 793–802.

Rotshtein, P., Geng, J. J., Driver, J., & Dolan, R. J. (2007). Role of features and second-order spatial relations in face discrimination, face recognition, and individual face skills: Behavioral and functional magnetic resonance imaging data. Journal of Cognitive Neuroscience, 19, 1435–1452.

Rotshtein, P., Henson, R. N., Treves, A., Driver, J., & Dolan, R. J. (2005). Morphing Marilyn into Maggie dissociates physical and identity face representations in the brain. Nature Neuroscience, 8, 107–113.

Schiltz, C., & Rossion, B. (2006). Faces are represented holistically in the human occipito-temporal cortex. Neuroimage, 32, 1385–1394.

Sinha, P., & Poggio, T. (1996). I think I know that face. Nature, 384, 404.

Steeves, J. K., Culham, J. C., Duchaine, B. C., Pratesi, C. C., Valyear, K. F., Schindler, I., et al. (2006). The fusiform face area is not sufficient for face recognition: Evidence from a patient with dense prosopagnosia and no occipital face area. Neuropsychologia, 44, 594–609.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tanskanen, T., Nasanen, R., Montez, T., Paallysaho, J., & Hari, R. (2005). Face recognition and cortical responses show similar sensitivity to noise spatial frequency. Cerebral Cortex, 15, 526–534.

Vuilleumier, P. (2000). Faces call for attention: Evidence from patients with visual extinction. Neuropsychologia, 38, 693–700.

Wada, Y., & Yamamoto, T. (2001). Selective impairment of facial recognition due to a haematoma restricted to the right fusiform and lateral occipital region. Journal of Neurology, Neurosurgery and Psychiatry, 71, 254–257.

Want, S. C., Pascalis, O., Coleman, M., & Blades, M. (2003). Recognizing people from the inner or outer parts of their faces: Developmental data concerning “unfamiliar” faces. British Journal of Developmental Psychology, 21, 125–135.

Winston, J. S., Henson, R. N., Fine-Goulden, M. R., & Dolan, R. J. (2004). fMRI-adaptation reveals dissociable neural representations of identity and expression in face perception. Journal of Neurophysiology, 92, 1830–1839.

Yovel, G., & Kanwisher, N. (2004). Face perception: Domain specific, not process specific. Neuron, 44, 889–898.

Yovel, G., & Kanwisher, N. (2005). The neural basis of the behavioral face-inversion effect. Current Biology, 15, 2256–2262.

Yovel, G., & Kanwisher, N. (in press). The representations of spacing and part-based information are associated for upright faces but dissociated for objects: Evidence from individual differences. Psychonomic Bulletin & Review.