Abstract

The degree to which face-specific brain regions are specialized for different kinds of perceptual processing is debated. This study parametrically varied demands on featural, first-order configural, or second-order configural processing of faces and houses in a perceptual matching task to determine the extent to which the process of perceptual differentiation was selective for faces regardless of processing type (domain-specific account), specialized for specific types of perceptual processing regardless of category (process-specific account), engaged in category-optimized processing (i.e., configural face processing or featural house processing), or reflected generalized perceptual differentiation (i.e., differentiation that crosses category and processing type boundaries). ROIs were identified in a separate localizer run or with a similarity regressor in the face-matching runs. The predominant principle accounting for fMRI signal modulation in most regions was generalized perceptual differentiation. Nearly all regions showed perceptual differentiation for both faces and houses for more than one processing type, even if the region was identified as face-preferential in the localizer run. Consistent with process specificity, some regions showed perceptual differentiation for first-order processing of faces and houses (right fusiform face area and occipito-temporal cortex and right lateral occipital complex), but not for featural or second-order processing. Somewhat consistent with domain specificity, the right inferior frontal gyrus showed perceptual differentiation only for faces in the featural matching task. The present findings demonstrate that the majority of regions involved in perceptual differentiation of faces are also involved in differentiation of other visually homogenous categories.

INTRODUCTION

The brain basis of face recognition is widely studied with fMRI to understand the neural components that reveal the nature of perceptual and cognitive processing specific to faces versus other object categories. However, one unanswered question is whether face network components are more strongly tuned to category delineations, preferring faces over other objects, or are driven by certain cognitive and perceptual processes that are more strongly invoked by faces. We suggest that an examination of the perceptual processes associated with face recognition is pivotal for characterizing the degree of specialization for faces in the brain.

A core set of brain regions in human occipito-temporal cortex respond preferentially to faces, including the “fusiform face area” (FFA; located in the lateral middle fusiform gyrus; Kanwisher, McDermott, & Chun, 1997), occipital face area (OFA; situated in inferior occipital cortex; Rossion et al., 2003; Gauthier et al., 2000), and the face-selective STS (for a review, see Haxby, Hoffman, & Gobbini, 2000). The greater response to faces than to objects under various conditions has led to the formation of a domain-specific account of neural specialization for faces (Rhodes, Byatt, Michie, & Puce, 2004; Yovel & Kanwisher, 2004). A strong form of this account suggests that faces induce a greater response than nonfaces regardless of the processing type used to categorize a stimulus as a face or identify individual faces (Figure 1) and despite the degree of familiarity or expertise with the nonface comparison categories. For example, Yovel and Kanwisher (2004) showed that fMRI response in the FFA preferred faces over houses but showed no differential activation for processing facial features (the shape and size of eyes, nose, or mouth) or the spacing of features (second-order processing) nor did the FFA prefer either processing type for houses. Rhodes et al. (2004) also showed that the FFA responded more to faces than to another visually homogenous category (Lepidoptera) and that Lepidoptera expertise did not modulate the FFA response. In the domain-specific account, face-specific regions are tuned to faces and potentially all relevant aspects of face processing but the processing in these regions is not co-opted for other categories or for different levels of category expertise.

Figure 1. 

Hypotheses associated with the different accounts of information processing that could occur in brain regions involved in face and object processing. Hypothetical fMRI signal is shown on the y axis, and similarity level is shown on the x axis. Face conditions are shown in red; house conditions are shown in blue. A flat line indicates no significant modulation by similarity for a given condition, whereas a sloped line indicates significant similarity modulation as a reflection of processing the information associated with that condition.

Alternatively, Gauthier, Tarr, Anderson, Skudlarski, and Gore (1999) promoted the perceptual expertise account, which is closely aligned with process specificity. When individuals were trained to discriminate items from a visually homogenous nonface category (i.e., greebles), the FFA was strongly activated once expertise had been established through training (see also Xu, 2005). They argued that the right FFA supports the process of making fine distinctions among items from visually similar categories for which an individual has expertise. Other findings also support process specificity by showing that the FFA responds to nonfaces even in the absence of expertise (Haist, Lee, & Stiles, 2010; Joseph & Gathers, 2002), which suggests that the FFA is linked to processing information that is strongly associated with faces (such as high within-category visual similarity) but that this information is not reserved only for faces.

One challenge in assessing the domain and process specificity accounts is that the finding of more activation to a given category or process relative to another does not necessarily provide sufficient evidence that a neural node is specific or specialized for face processing (Joseph & Gathers, 2002; Joseph, 2001). A greater fMRI signal for faces may be driven by factors that are not directly related to the domain-relevant processing. For example, Yovel and Kanwisher's (2004) finding that the FFA responded more to faces than to houses but did not differentially respond to featural and second-order processing may be explained by task difficulty. They reported that discriminating upright faces was harder than discriminating upright houses but that there were no performance differences between featural and second-order processing for these stimuli. The FFA may have responded more to faces than to houses because of the greater effort required for discrimination.

The FFA's lack of response to different types of perceptual processing (featural or second-order) could either mean that it is not engaged in these types of processing or is engaged in all processing types to the same degree. Liu, Harris, and Kanwisher (2010) reported that the FFA is sensitive to the presence of facial features and to their first-order configuration (i.e., the ordering of the eyes above the nose, which is above the mouth) and that this information processing is correlated, suggesting that the FFA is engaged in both featural and first-order configural processing. However, it is possible that other components of the face network are even more strongly engaged in these processing types, which can be addressed with a voxel-wise whole-brain analysis, as conducted by Maurer et al. (2007). They showed that second-order configural versus featural processing of faces did not isolate the FFA but instead isolated an FFA-adjacent region as well as right frontal regions (whereas featural vs. second-order processing isolated left frontal cortex). Lobmaier, Klaver, Loenneker, Martin, and Mast (2008) also compared featural and configural processing directly and did not show greater right FFA response to configural information. The left FFA, left lingual gyrus, and left parietal cortex, however, showed a greater response to facial featural information. Another way to probe neural substrates for processing type is through face inversion, which may disrupt configural processing or more strongly engage featural processing or both. Some studies show no difference in FFA activation to inverted and upright faces (Joseph et al., 2006; Leube et al., 2003), but others show an enhanced FFA response (Yovel & Kanwisher, 2004). However, inversion is an indirect approach to examine differences in featural and second-order processing. 
In summary, findings are mixed as to whether the FFA is engaged for different types of face processing to the same degree or whether other regions are responsible for such processing. Direct comparisons of processing types must ensure that the conditions are equated for difficulty, and lack of a differential response to processing types cannot necessarily be taken as evidence for domain specificity.

To directly probe the degree of domain and process specificity for faces, this study parametrically manipulated the degree of similarity related to three different types of perceptual face processing (first-order configural, second-order configural, and featural). Parametric manipulation of similarity has the advantage of changing demands on processing in a graded fashion, thereby directly tapping into differential processing of perceptual information in a quantitative manner. For example, two featurally dissimilar faces will be easier to discriminate than two faces sharing several features (Figure 2). The greater difficulty of discriminating two similar faces is related to the featural similarity manipulation in this example. We expect that increasing the similarity of two stimuli will require a greater degree of processing for that specific type of information (featural, first-order, or second-order) and will result in monotonically increasing functions for behavioral performance (increased RT or errors) and fMRI signal. Prior research has shown that parametrically varied perceptual similarity of objects successfully modulates fMRI signal in brain regions like the lateral occipital complex (LOC; Drucker & Aguirre, 2009) and ventral temporal cortex (Liu, Steinmetz, Farley, Smith, & Joseph, 2008; Joseph & Gathers, 2003). Similarity manipulations that use morphing of face identities also modulate ERP response amplitude proportionally (Kahn, Harris, Wolk, & Aguirre, 2010).

Figure 2. 

Sample stimuli used in this study. A sample target stimulus is shown in the “Identical” column. Sim3–Sim0 columns illustrate progressively less similarity with the target as more features, first-order relations, or second-order relations are changed. For example, featural face changes were created by changing the lips of the target face (Sim3); the lips and nose (Sim2); the lips, nose and eyebrows (Sim1); or the lips, nose, eyebrows, and eyes (Sim0). Although single stimuli are illustrated here, stimuli were presented in pairs so that the target and the Sim3 stimulus form a Sim3 pair.

The present parametric design addresses the following concerns with prior studies. First, this approach directly manipulates the processing types of interest rather than relying on an indirect approach, such as inversion, to infer what type of perceptual information is processed in different brain regions. Second, the present design does not rely exclusively on a qualitative comparison of different processing types, which may or may not be equated for difficulty. The main hypothesis is that, if a given region is engaged for a specific type of processing, that region will show modulation by the greater demands on processing (i.e., a monotonically increasing similarity function). In the present framework, the modulation of fMRI signal by increasing processing demands is used as the main evidence for processing different kinds of perceptual information, with less emphasis on differences in fMRI signal magnitude for qualitative comparisons (such as featural vs. second-order processing differences in average magnitude of response in the two conditions). Third, the present approach (similar to Yovel & Kanwisher, 2004) compares faces with another visually homogenous category (houses) that has the same external contour as faces with the same relations or types of features that are systematically changed across similarity levels. Fourth, this study compared three types of perceptual processing that are relevant for faces and objects: featural, first-order configural, and second-order configural. Prior fMRI studies (Liu et al., 2010; Lobmaier et al., 2008; Maurer et al., 2007; Yovel & Kanwisher, 2004) have only compared two of these types of processing in the same study.

Regions were isolated with (1) a localizer task that presented blocks of faces, objects, and visual textures to define face-preferential and object-preferential regions, and (2) a perceptual differentiation task with parametrically varied featural, first-order or second-order similarity in the face condition using a similarity-weighted regressor that represented the four similarity levels in Figure 2. Within each region (isolated by either method), percent signal change from the matching task for each similarity level, category (faces or houses), and processing type (featural, first-order, or second-order) was extracted. Follow-up ANOVAs determined whether each region was tuned to one of the nonpreferred categories, processing types, or both.

If a given region of the face network is domain-specific, then it should be sensitive to face processing and insensitive to processing type. This would predict that the highest-order interaction for that region would be a Category × Similarity interaction of the form shown in Figure 1A. Alternatively, if any given region is process-specific, then it should show sensitivity to one processing type but not show differential sensitivity to category. In that case, a Processing Type × Similarity interaction of the form shown in Figure 1B would emerge. A third account, referred to as “category-optimized,” hypothesizes that face processing engages configural processing to a greater degree than does nonface processing, whereas house processing engages featural processing to a greater degree than does face processing. In this case, the highest-order interaction would be a Category × Processing Type × Similarity interaction of the form shown in Figure 1C. Another possibility (generalized perceptual differentiation) is that a brain region is engaged in perceptual differentiation in a generalized sense. This account would also predict a Category × Processing Type × Similarity interaction, but of the form shown in Figure 1D. In this case, the perceptual differentiation is not easily reduced to clear-cut category and processing type distinctions because the effects of these variables are nonadditive.
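As a rough illustration of how the four accounts differ, the sketch below maps a pattern of similarity modulation across the six Category × Processing Type conditions onto an account label. The function name, the labels, and the exact decision rules are assumptions made for illustration; they paraphrase the verbal descriptions above and are not the statistical procedure used in this study, which relied on ANOVA interactions and simple effects.

```python
from itertools import product

CATEGORIES = ("face", "house")
PROCESSING_TYPES = ("featural", "first-order", "second-order")
CONFIGURAL = {"first-order", "second-order"}


def classify_account(modulated):
    """Map a pattern of similarity modulation onto one of the four accounts.

    `modulated` maps (category, processing_type) -> True when that condition
    shows a monotonically increasing similarity effect in a region.
    The rules below are an illustrative reading of the hypotheses only.
    """
    active = {
        cond
        for cond in product(CATEGORIES, PROCESSING_TYPES)
        if modulated.get(cond, False)
    }
    if not active:
        return "no perceptual differentiation"

    cats = {c for c, _ in active}
    procs = {p for _, p in active}
    face_procs = {p for c, p in active if c == "face"}
    house_procs = {p for c, p in active if c == "house"}

    # Faces modulated for every processing type, houses never modulated.
    if cats == {"face"} and face_procs == set(PROCESSING_TYPES):
        return "domain-specific"
    # One processing type modulated, for both categories.
    if cats == set(CATEGORIES) and len(procs) == 1:
        return "process-specific"
    # Configural modulation for faces and/or featural modulation for houses.
    if face_procs <= CONFIGURAL and house_procs <= {"featural"}:
        return "category-optimized"
    # Modulation crossing both category and processing type boundaries.
    if cats == set(CATEGORIES) and len(procs) > 1:
        return "generalized perceptual differentiation"
    return "unclassified"


all_conditions = {cond: True for cond in product(CATEGORIES, PROCESSING_TYPES)}
print(classify_account(all_conditions))
```

A region modulated only for faces in a single non-configural condition falls through to "unclassified" here, which reflects that such a pattern is only partly consistent with the strong form of domain specificity.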

METHODS

Participants

Fifty-nine healthy right-handed volunteers (mean age = 26.5 years, SD = 6.0 years, range = 18–42 years; 29 men) were compensated or received course credit for participation. Because of excessive head motion (>1.75 mm), data from eight participants were eliminated. No participants reported neurological or psychiatric diagnoses or pregnancy, and all provided informed consent before participating. All procedures were approved by the university's institutional review board.

Design and Stimuli

This was a 2 (Category: faces, houses) × 3 (Processing Type: featural, first-order, second-order) × 4 (Similarity Level: 0, 1, 2, 3, where 0 indicates no features or relations in common and 3 indicates that three features or relations were in common between paired stimuli) mixed blocked design. Participants were assigned to the featural (n = 16, eight men, mean age = 27.1 years, SD = 5.2 years), first-order (n = 17, eight men, mean age = 25.8 years, SD = 5.6 years), or second-order (n = 18, 10 men, mean age = 27.1 years, SD = 7.4 years) processing condition. Category and similarity were manipulated within subjects.

Photo-realistic faces were constructed using FACES 4.0 software (IQ Biometrix, Redwood Shores, CA), and house stimuli were created using Chief Architect 10.06a (Coeur d'Alene, ID). Adobe Photoshop 5.5 (San Jose, CA) was used for first-order configuration manipulations. Twenty-four faces were initially constructed so that none of the features overlapped, and these were used as the basis for making featural, first-order, and second-order changes and constructing stimulus pairs. Although no two pairs were repeated, the same face was repeated up to five times in different similarity conditions across both house or face runs. Forty-eight identical (same) pairs per category and processing type were used (referred to as Sim4; see Figure 2).

Featural Changes

For each original face, distracter faces were constructed so that one, two, three, or four features (eyes, nose, mouth, or eyebrows) were replaced, yielding four similarity (sim) levels (and 96 unique faces for each processing type). Sim0–Sim3 faces respectively shared 0–3 common features with the target face. The feature change for each sim level was counterbalanced across all stimulus pairs so that feature replacement was not confounded with sim level. The same procedures were used for house features (door, steps, and lower-level and upper-level windows).
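The counterbalancing described above can be sketched as follows. This is a minimal illustration under stated assumptions: the rotation scheme (cycling through all feature subsets of the required size) and the function name are hypothetical, since the exact assignment procedure is not specified here. It shows that 24 targets × 4 sim levels yields 96 distracters and that single-feature replacements at Sim3 are spread evenly over the four features.

```python
from collections import Counter
from itertools import combinations, cycle

FEATURES = ("eyes", "nose", "mouth", "eyebrows")


def replacement_schedule(n_targets, features=FEATURES):
    """Choose which features to replace for every target and sim level.

    Sim level s means s features are shared with the target, so
    len(features) - s features are replaced. Subsets are rotated across
    targets (an assumed scheme) so that no feature is tied to a sim level.
    """
    n = len(features)
    rotations = {
        sim: cycle(combinations(features, n - sim)) for sim in range(n)
    }
    schedule = []
    for target in range(n_targets):
        for sim in range(n):
            schedule.append((target, sim, next(rotations[sim])))
    return schedule


schedule = replacement_schedule(24)
sim3_counts = Counter(
    feature for _, sim, replaced in schedule if sim == 3 for feature in replaced
)
print(len(schedule), dict(sim3_counts))
```

The same function applies to the house features (door, steps, lower-level and upper-level windows) by passing a different `features` tuple.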

First-order Changes

The first-order face changes were (a) eyes above nose, (b) eyes above mouth, (c) nose above mouth, or (d) eyebrows above eyes. The first-order house changes were (a) lower windows above/level with the door, (b) upper windows above steps, (c) door above steps, or (d) upper windows above door. The relation changed for each sim level was counterbalanced across all stimulus pairs so that relation replacement was not confounded with sim level.

Second-order Changes

The second-order face changes were (a) horizontal distance between the centroid of both eyes, (b) vertical distance between centroid of nose and top of forehead, (c) vertical distance between centroid of mouth and top of forehead, and (d) vertical distance between center of two eyes and top of forehead. For faces, an initial spacing of 2 SD from Farkas (1994) norms was used but was changed to a 3 SD spacing after 2 SD was identified as being too difficult to detect. The house changes were (a) horizontal distance between the centroid of both lower windows, (b) horizontal distance between the centroid of both upper windows, (c) vertical distance between center of lower windows and bottom of roof, and (d) vertical distance between center of upper windows and bottom of roof. Again, the relation change for each sim level was counterbalanced across all pairs to avoid confounding with sim level.

Procedure

Each participant completed five functional runs in counterbalanced order: two face-matching and two house-matching runs and a face localizer run. Each face and house run consisted of eight task blocks: two per sim level. Each task block (27.5 sec in length) consisted of eight trials: five different trials of a given block's sim level and three Sim4 (same) trials. The ratio of 5 “different”:3 “same” was used because the process of interest, perceptual differentiation, was most relevant on the “different” trials; therefore, we were able to sample more of the relevant behavior while also having a sufficient number of “same” trials so that responding was not completely biased toward responding “different.” Prior studies (e.g., Joseph & Gathers, 2003) showed that performance on “same” trials did not vary as a function of similarity level, but performance on “different” trials did vary across similarity level, as expected.

For each trial, participants saw either two faces or two houses for 2900 msec followed by a fixation interval for 538 msec. Participants indicated whether the two stimuli were the same (index finger) or different (middle finger) using a fiber-optic response pad (MRA, Inc., Washington, PA). Participants could respond at any point during the trial. A 12.5-sec rest period occurred between blocks, and each task block onset was triggered by a scanner pulse. The face localizer run consisted of nine blocks (three each of face, object, or texture) lasting 17.5 sec each, with 10 interleaved fixation blocks (12.5 sec each). During each block, 10 different yearbook faces, common objects, or visual textures appeared for 1000 msec followed by a fixation of 750 msec. Participants pressed a button each time a stimulus appeared to ensure attentive processing.

fMRI Data Acquisition and Analysis

Images were acquired using a Siemens 3T Trio MRI system (Erlangen, Germany): one 109-volume (272.5 sec) face localizer scan and four 133-volume (322.5 sec) task scans (gradient-echo EPI; echo time = 30 msec, repetition time = 2500 msec, flip angle = 80°, field of view = 22.4 × 22.4 cm, interleaved acquisition, 38 axial contiguous 3.5-mm slices for the face localizer scan, and 40 slices for the task scans). Hence, the total number of brain volumes used to sample perceptual differentiation behavior was 352 per subject (88 task block volumes per run × 4 functional runs). A T1-weighted MPRAGE (echo time = 2.56 msec, repetition time = 1690 msec, inversion time = 1100 msec, field of view = 25.6 cm × 22.4 cm, flip angle = 12°, 176 contiguous sagittal 1-mm thick slices) and field map were also collected. E-Prime software (Version 1, www.pstnet.com; Psychology Software Tools, Pittsburgh, PA) running on a Windows computer connected to the MR scanner presented visual stimuli and recorded the time of each MR pulse, visual stimulus onset, and behavioral responses.
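The count of 352 task-block volumes follows from the trial timing and the repetition time. The short calculation below reproduces it; the only assumption is that each block is credited with the nearest whole number of volumes, and the function name is illustrative.

```python
# Timing values taken from the Procedure and the acquisition parameters.
STIMULUS_MS = 2900
FIXATION_MS = 538
TRIALS_PER_BLOCK = 8
BLOCKS_PER_RUN = 8
MATCHING_RUNS = 4
TR_MS = 2500


def task_volume_counts():
    """Return task-block volumes per block, per run, and per participant."""
    block_ms = (STIMULUS_MS + FIXATION_MS) * TRIALS_PER_BLOCK  # 27,504 ms
    # Assumption: a block is credited with the nearest whole number of volumes.
    per_block = round(block_ms / TR_MS)
    per_run = per_block * BLOCKS_PER_RUN
    per_participant = per_run * MATCHING_RUNS
    return per_block, per_run, per_participant


print(task_volume_counts())
```

This gives 11 volumes per block, 88 per run, and 352 per participant across the four matching runs.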

Preprocessing and statistical analysis were conducted using FMRIB software library (v. 4.1.7, FMRIB, Oxford University, Oxford, UK). For each subject, preprocessing included motion correction with MCFLIRT, brain extraction using BET, spatial smoothing with a 7-mm FWHM Gaussian kernel, and temporal high-pass filtering (cutoff = 100 sec). Statistical analyses were performed at the single-subject level (GLM, FEAT v. 5.98). Each localizer time series was modeled with three explanatory variables (EVs; face, object, and texture versus baseline) convolved with a double gamma hemodynamic response function and a temporal derivative. Contrasts of interest were face > fixation, object > fixation, texture > fixation, face > object, face > texture, and object > texture. For each participant, face localizer contrast maps were registered via the subject's high-resolution T1-weighted anatomical image to the MNI-152 template (12-parameter affine transformation; FLIRT) yielding images with spatial resolution of 2 × 2 × 2 mm. Mixed-effects group analyses (using FLAME1 + 2) yielded the group level statistical parametric map of each contrast. Group-level maps were cluster thresholded (Worsley, 2001) using corrected significance of p = .05 and Z > 2.44 for faces or 4.86 for objects (minimum cluster size = 831 or 1 voxels, respectively). A higher threshold was used for objects because the clusters were large and not easily decomposed at a lower threshold. Face-preferential regions were then isolated using logical combination (Joseph, Gathers, & Bhatt, 2011; Joseph & Gathers, 2002; Joseph, Partin, & Jones, 2002) of the group-level contrasts: face > object and face > texture and face > fixation. Object-preferential regions were identified by logical combination of objects > textures and objects > faces and objects > fixation. 
The logical combination was conducted at the group level rather than the individual subject level to avoid the possibility that a given participant would show no above-threshold activation for the combination of contrasts, which would produce 0s in the follow-up analyses. Because logical intersection was applied to the cluster-thresholded maps, the size of the resulting clusters could be smaller than the minimum cluster size. Hypothesis testing was conducted in confirmatory repeated-measures ANOVAs (which account for multiple subjects) with Bonferroni-corrected (alpha = .017) post hoc tests. These ANOVAs tested the main effect of Category (face, object, texture) on percent signal change relative to baseline with planned contrasts using face (or object for object-preferential regions) as the reference category. All of the face-preferential regions in Table 1 showed a significantly greater face than object or texture response (all ps < .017; similarly for object-preferential regions) except the left amygdala (AMG) in which one comparison fell short of significance (p = .02). In addition, all regions showed face (object) > fixation using a one-sample t test comparing to 0 (all ps < .017). Therefore, the preferential regions in Table 1 showed a stronger response for the condition of interest (face or object) compared with the three other conditions.
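The logical combination used to define face-preferential regions can be illustrated with a toy example. This is a simplified sketch: it applies only a voxel-wise Z threshold to made-up values and omits the cluster-extent correction applied to the real group maps. The helper names and the Z values are hypothetical.

```python
def threshold(zmap, z_min):
    """Binarize a list of voxel Z values at z_min."""
    return [z > z_min for z in zmap]


def logical_and(*masks):
    """Keep a voxel only if it survives in every mask."""
    return [all(voxel) for voxel in zip(*masks)]


# Made-up Z values for four voxels in three group-level contrasts.
face_gt_object = [3.0, 2.5, 1.0, 4.0]
face_gt_texture = [2.6, 2.0, 3.5, 5.0]
face_gt_fixation = [3.1, 3.0, 3.0, 2.45]

Z_FACE = 2.44  # voxel threshold reported for the face contrasts
face_preferential = logical_and(
    threshold(face_gt_object, Z_FACE),
    threshold(face_gt_texture, Z_FACE),
    threshold(face_gt_fixation, Z_FACE),
)
print(face_preferential)
```

Only the first and last voxels exceed the threshold in all three contrasts, which also shows why an intersected cluster can end up smaller than the minimum cluster size of any single contrast.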

Table 1. 

ROIs Isolated by the Face Localizer Task

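| Region | Size | x (mm) | y (mm) | z (mm) | Main effect: Category, F(1, 48) | Main effect: Similarity, F(3, 144) | Main effect: Processing Type, F(2, 48) | C × S, F(3, 144) | C × P, F(2, 48) | P × S, F(6, 144) | C × P × S, F(6, 144) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RFFA | 312 | 43 | −52 | −21 | 28.1*** | – | – | 5.48*** | 5.4** | 2.6* | – |
| ROFA | 319 | 37 | −79 | −16 | – | 3.3* | 3.5* | – | 4.9* | – | – |
| RAMG | 963 | 24 | −12 | −12 | 25.4*** | – | – | – | – | – | – |
| RIFG | 1531 | 47 | 26 | [missing] | 14.5*** | 4.9** | – | 3.57* | – | – | 2.4* |
| RoLOCᵃ | 416 | 35 | −82 | [missing] | 72.1*** | 10.6*** | – | – | 3.6* | 2.5* | – |
| RtLOCᵃ | 31 | 44 | −62 | −4 | 5.2* | 5.1*** | – | – | 11.9*** | – | – |
| RfLOCᵃ | 273 | 28 | −43 | −14 | 223.5*** | – | – | 2.96* | 6.7** | – | – |
| RCAS | 592 | 15 | −92 | −1 | 13.9** | – | – | – | 4.8* | – | – |
| LAMG | 410 | −20 | −12 | −12 | 16.5*** | – | – | – | – | – | – |
| LoLOCᵃ | 306 | −35 | −85 | [missing] | 59.0*** | 8.5*** | – | 3.55* | 6.1** | – | – |
| LtLOCᵃ | 329 | −42 | −66 | −2 | 19.7*** | 9.6*** | – | – | 25.0*** | – | 2.8* |
| LfLOCᵃ | 308 | −29 | −46 | −13 | 175*** | – | – | 3.3* | 6.4** | – | – |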

Size is in voxels with spatial resolution of 2 × 2 × 2 mm. Minimum cluster extent for each individual contrast prior to logical combination was 831–920 voxels for face-preferential regions (Z > 2.44) and 1 voxel for object-preferential contrasts (Z > 4.86). However, cluster sizes could be smaller than this after logical intersection. In addition, the LOC was manually divided into three segments based on anatomical boundaries, leading to smaller clusters. AMG = amygdala; CAS = calcarine sulcus; FFA = fusiform face area; IFG = inferior frontal gyrus; L = left; OFA = occipital face area; fLOC = lateral occipital complex, fusiform portion; oLOC = lateral occipital complex, occipital portion; tLOC = lateral occipital complex, temporal portion; R = right.

ᵃThese regions were isolated as object-preferential; all other regions are face-preferential.

*p < .05.

**p < .01.

***p < .001.

For the face and house runs, the different sim level blocks were modeled as one task EV (all with the same event strength) and a second similarity EV (higher sim blocks were assigned a higher event strength) for the voxel-wise analyses. This approach controls for the overall task effect while isolating fMRI signal modulation due to sim. The assignment of the values 1 through 4 to event strengths representing the different sim level blocks is consistent with the experimental manipulation in that each subsequent similarity level introduces one additional feature or spatial relation relative to the prior similarity level. Statistical maps isolated by mixed-effects group analyses (FLAME1 + 2) for the task and similarity EVs were cluster thresholded (Z > 3.1 and corrected significance of p = .05; minimum cluster size = 225 voxels) then combined using logical intersection (“and”).
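The two explanatory variables can be sketched as rows of (onset, duration, weight), the layout used by FSL's three-column custom EV files. The block order, the onset of the first block, and the mapping of Sim0 through Sim3 onto weights 1 through 4 are assumptions for illustration; only the block and rest durations and the use of values 1 through 4 come from the text.

```python
BLOCK_S = 27.5  # task block duration in seconds
REST_S = 12.5   # rest period between blocks in seconds


def build_evs(block_sims, first_onset=12.5):
    """Return (task_ev, similarity_ev) as lists of (onset, duration, weight).

    block_sims: sim level (0-3) of each task block, in presentation order.
    first_onset: assumed onset of the first block (hypothetical value).
    """
    task_ev, sim_ev = [], []
    for i, sim in enumerate(block_sims):
        onset = first_onset + i * (BLOCK_S + REST_S)
        task_ev.append((onset, BLOCK_S, 1.0))  # same strength for all blocks
        sim_ev.append((onset, BLOCK_S, float(sim + 1)))  # assumed Sim0..Sim3 -> 1..4
    return task_ev, sim_ev


def to_three_column(rows):
    """Format rows as tab-separated text, one event per line."""
    return "\n".join(f"{o:.1f}\t{d:.1f}\t{w:.1f}" for o, d, w in rows)


# Hypothetical block order with two blocks per sim level.
task_ev, sim_ev = build_evs([0, 3, 1, 2, 2, 0, 3, 1])
print(to_three_column(sim_ev))
```

Because both EVs share onsets and durations, the similarity EV captures only the graded change across blocks once the constant-strength task EV has absorbed the overall task response.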

For each face- and object-preferential and face-matching ROI, percent signal change relative to fixation (all EV heights of 1) was extracted for each sim level and category in each participant's first-level analysis (using Featquery). For the ROIs from the localizer task, the percent signal change extracted (from the matching runs) was logically independent from the signal used to define the ROIs. Each matching-task ROI was defined from only one of the six Category × Processing Type conditions; therefore, 83% of the data were logically independent from the data used to define the ROI. In fact, 67% of the percent signal change data in matching-task ROIs were completely independent because that 67% came from different participants. We acknowledge that within featural face ROIs, for example, we would expect the effect of similarity to be significant because that is how the ROI was defined. However, the critical aspect of hypothesis testing was whether the similarity effect also emerged for the other five conditions. Therefore, although there is a small degree of dependence in the data, the critical hypotheses are based on data that are logically independent from those used to define the ROIs.
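The 83% and 67% figures can be reproduced by counting conditions. The sketch below assumes that each of the six conditions contributes an equal share of the extracted data, which is only approximately true given the group sizes of 16, 17, and 18; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

CONDITIONS = list(
    product(("face", "house"), ("featural", "first-order", "second-order"))
)
roi_condition = ("face", "featural")  # example: a featural face ROI

# Conditions other than the one that defined the ROI.
not_defining = [c for c in CONDITIONS if c != roi_condition]
# Conditions measured in other participants (Processing Type is between subjects).
other_groups = [c for c in CONDITIONS if c[1] != roi_condition[1]]

share_not_defining = Fraction(len(not_defining), len(CONDITIONS))
share_other_groups = Fraction(len(other_groups), len(CONDITIONS))
print(round(100 * share_not_defining), round(100 * share_other_groups))
```

Five of six conditions (about 83%) were not used to define the ROI, and four of six (about 67%) come from participants in the other two processing-type groups.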

Percent signal change values for the 4 Sim Levels × 2 Categories for each participant were submitted to a 3 (Processing Type: featural, first-order, second-order) × 4 (Sim) × 2 (Category: face, house) repeated-measures ANOVA with Processing Type as the between-subject factor for each ROI. RT on each trial was log-transformed to normalize the distribution of RT. Outliers were defined as RTs more than three standard deviations above or below the mean RT (0.1% of the data). Error rate and correct log-transformed RT (logRT) were initially analyzed with a 3 (Processing Type: featural, first-order, second-order) × 4 (Sim Level) × 2 (Category: face, house) × 2 (Trial Type: same, different) repeated-measures ANOVA with a between-subject factor (Processing Type) to demonstrate that the similarity manipulation was more pronounced for “different” trials. Having established that (see Results), we then collapsed over trial types and conducted a 3 (Processing Type: featural, first-order, second-order) × 4 (Sim Level) × 2 (Category: face, house) repeated-measures ANOVA for errors and logRT. Collapsing over trial type for the analysis of errors and logRT was also consistent with the ROI repeated-measures ANOVAs because using a block design did not allow deconvolution of the fMRI signal for specific trials. For all repeated-measures ANOVAs, results from the univariate tests are reported because there were no sphericity violations, following guidelines by Hertzog and Rovine (1985). When interactions with sim emerged, we used simple effects analysis (Keppel & Zedeck, 1989) and planned polynomial contrasts to determine whether the sim effect was monotonically increasing. “Monotonically increasing” was indicated by (a) a significant linear fit where the slope was positive or (b) a significant quadratic fit in which repeated contrasts of successive sim levels indicated that at least one sim level was greater than the prior level. 
If the best polynomial fit was cubic or if the quadratic fit indicated local decreases for any adjacent sim levels, then the sim effect was not monotonic. Establishing positive monotonicity was critical for concluding that perceptual differentiation was exhibited in a given ROI. Below, when we report simple effects of sim, those effects were monotonically increasing according to the above definition.
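
The monotonicity criterion described above can be sketched as a small decision rule. This is a minimal illustration, not the analysis code used in the study: it assumes the four sim levels are equally spaced (so the standard orthogonal polynomial weights apply), it assumes the significance tests have already been run elsewhere, and the function names and the "increase"/"decrease"/"ns" labels are hypothetical.

```python
# Standard orthogonal polynomial contrast weights for four equally spaced levels.
POLY_WEIGHTS = {
    "linear": (-3, -1, 1, 3),
    "quadratic": (1, -1, -1, 1),
    "cubic": (-1, 3, -3, 1),
}


def contrast_score(level_means, trend):
    """Weighted sum of the four sim-level means for one polynomial trend."""
    return sum(w * m for w, m in zip(POLY_WEIGHTS[trend], level_means))


def is_monotonically_increasing(best_fit, level_means, step_results):
    """Apply the decision rule described in the text.

    best_fit:     "linear", "quadratic", or "cubic" (best significant fit;
                  significance testing is assumed to be done elsewhere)
    level_means:  mean percent signal change at sim levels 1-4
    step_results: outcomes of the repeated contrasts for levels 1->2, 2->3,
                  and 3->4, each "increase", "decrease", or "ns"
                  (hypothetical labels)
    """
    if best_fit == "linear":
        # (a) significant linear fit with a positive slope
        return contrast_score(level_means, "linear") > 0
    if best_fit == "quadratic":
        # (b) at least one reliable step up and no local decrease
        return "increase" in step_results and "decrease" not in step_results
    # A cubic best fit is treated as non-monotonic.
    return False


# Made-up values, for illustration only.
print(is_monotonically_increasing("linear", [0.10, 0.15, 0.22, 0.30], []))
print(is_monotonically_increasing(
    "quadratic", [0.10, 0.25, 0.27, 0.26], ["increase", "ns", "ns"]))
```

Both calls print `True`; a quadratic fit with any "decrease" step, or a cubic best fit, would return `False`.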

RESULTS

Behavioral Results

Behavior was initially analyzed with a Similarity × Same/Different repeated-measures ANOVA to establish that the effect of similarity was more pronounced on “different” than “same” trials. This interaction was significant for both errors, F(3, 165) = 41.0, p < .0001, and logRT, F(3, 165) = 43.3, p < .0001. As shown in Figure 3A, modulation of behavior due to similarity was driven more strongly by “different” than “same” trials. The contribution of “same” trials was constant across sim levels for errors and nearly constant across sim levels for logRT. The blocked design did not allow us to separately examine “same” and “different” responses; therefore, in all subsequent analyses, we collapsed across “same” and “different” trials given that the contribution of “same” trials to the similarity effect was nearly constant.

Figure 3. 

(A) Log-transformed RT (logRT) and error rates as a function of similarity level and same/different responding. (B) RT and errors in each of the six experimental conditions. Error bars are standard errors.

As expected, both error rate and logRT increased as a function of perceptual similarity (Figure 3), which demonstrated that this manipulation was effective at modulating perceptual discrimination performance. For logRT, the main effect of Similarity, F(3, 144) = 97.7, p = .0001, and the Category × Sim × Processing Type interaction were significant, F(6, 144) = 2.3, p = .039. Simple main effects of Sim (3 Processing Types × 2 Categories) were all significant (ps < .001). The main effect of Processing Type, F(2, 48) = 19.6, p = .0001, showed that first-order responding was faster than second-order or featural. The Category main effect, F(1, 48) = 13.6, p = .001, indicated that houses had longer RTs. The Category × Processing Type interaction, F(2, 48) = 8.1, p = .001, further qualified this effect: Houses took more time to respond than faces for featural (p = .002) and first-order processing (p = .004) but not for second-order processing.

For errors, the main effect of Sim, F(3, 144) = 60.7, p = .0001, and the Category × Sim × Processing Type interaction, F(6, 144) = 4.7, p = .0001, were significant. Simple main effects of Sim were significant for all processing type and category conditions (ps < .002) except for first-order face processing (p = .44). The Processing Type main effect, F(2, 48) = 20.6, p = .0001, showed that first-order processing was easier than second-order or featural. The Category × Processing Type interaction, F(2, 48) = 4.0, p = .024, and simple Category effect for each processing type indicated that houses were more difficult than faces for second-order processing (p = .015); otherwise, houses and faces were equated for featural and first-order performance.

ROIs from the Localizer Task

Face- or object-preferential regions are outlined in Table 1 with activation maps shown in Figure 4A. As expected, face-preferential regions included the right FFA, bilateral OFA, right inferior frontal gyrus (IFG), right STS, bilateral AMG and the bilateral calcarine sulcus (CAS). Object-preferential regions included a large expanse of bilateral occipito-temporal cortex consistent with the LOC (Malach et al., 1995). We manually divided the LOC into three portions: occipital, fusiform, and temporal. ROIs with negative percent signal change extracted from the matching task (brain stem, medial pFC, right temporal pole, right STS, right temporal–parietal junction) and ROIs with no main effects or interactions (left OFA, left CAS) were not included in Table 1. The repeated-measures ANOVA conducted in each face- (or object-) preferential region was expected to reveal a main effect of Category with fMRI signal greater for faces than houses or vice versa. The Category main effect was significant in all ROIs, except the right OFA and the right CAS (Table 1). However, the critical test was whether a region responded to perceptual differentiation for the nonpreferred category and for different processing types. Although the right FFA showed perceptual differentiation only for faces (significant Category × Sim interaction and significant simple effects of sim for faces, ps < .003), the Sim trend was not monotonically increasing. However, the right FFA showed evidence for process specificity: The Sim effect was significant and monotonically increasing only for first-order (p < .001). The right IFG showed a significant three-way interaction with simple main effects of Sim for featural and first-order faces (ps < .01) but the trend was monotonically increasing only for featural faces. 
The right occipital LOC showed perceptual differentiation only for first-order processing (Processing × Sim interaction; simple effect of sim, p < .009) regardless of category, thereby supporting process specificity. The left temporal LOC showed perceptual differentiation only for first-order houses (significant three-way interaction; simple effect of sim, p < .004). The right OFA showed evidence for generalized perceptual differentiation: The main effect of Sim was not further qualified by Category or Processing Type.

Figure 4. 

Group-level activation maps for (A) face-preferential (Z > 2.44, p = .05, corrected) and object-preferential (Z > 4.86, p = .05, corrected) activation in the localizer run and (B) featural (red), first-order (blue), and second-order (green) matching in the task runs (Z > 3.1, p = .05, corrected).

ROIs from the Face-matching Run

Regions that emerged from the logically combined task and sim-weighted EVs for faces for each processing type are listed in Table 2 and illustrated in Figure 4B. Featural face processing was associated with the most extensive activation that included symmetric activation in the insula, superior parietal lobule (SPL), and IFG and activation in the cerebellum and anterior cingulate. Second-order and featural processing overlapped in the cingulate, right insula, and right SPL, but second-order uniquely recruited the right superior frontal gyrus. First-order processing emerged in the bilateral SPL, which partially overlapped with featural and second-order activation, and the right fusiform gyrus in a region somewhat consistent with the right FFA. The repeated-measures ANOVA conducted in each region was expected to reveal monotonically increasing similarity functions for the processing type and category that isolated the region in the voxel-wise analysis. However, the critical test was how the region responded to the other category and processing types not isolated in the voxel-wise analysis. As shown in Table 2, the majority of regions showed a significant Category × Sim × Processing interaction. Analysis of simple effects of Sim for each condition revealed that these regions showed perceptual differentiation that crossed category and processing type boundaries (consistent with generalized perceptual differentiation). None of these regions showed perceptual differentiation only for second-order faces or featural houses (category-optimized account). Three regions (cerebellum from the featural task, ACC, and right superior frontal from the second-order task) showed a Category × Sim interaction, which is potentially consistent with domain specificity. However, analysis of simple effects of sim in all three regions revealed perceptual differentiation for both faces and houses (with different slopes for the two categories). 
No regions showed only a Processing × Sim interaction, which would be consistent with process specificity.

Table 2. 

ROIs Isolated in the Face-matching Runs

| Region | Size | x (mm) | y (mm) | z (mm) | Category, F(1, 48) | Similarity, F(3, 144) | Processing Type, F(2, 48) | C × S, F(3, 144) | C × P, F(6, 144) | P × S, F(6, 144) | C × P × S, F(6, 144) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cingulate (F) | 520 | 2.3 | 19.0 | 41.9 | – | 12.5*** | 6.8** | 3.8* | – | – | 2.3* |
| RINS (F) | 378 | 36.0 | 20.8 | −3.5 | – | 18.6*** | 8.0** | 5.0** | – | – | 2.9* |
| LINS (F) | 322 | −32.8 | 20.5 | −0.7 | – | 8.7*** | 6.1** | 5.6** | – | – | 3.5** |
| RIFG (F) | 856 | 46.7 | 14.6 | 22.9 | 8.3** | 11.1*** | 4.6* | 3.6* | – | – | 4.4*** |
| LIFG (F) | 125 | −42.7 | 2.9 | 27.7 | – | 7.3*** | – | – | 6.4** | – | 3.0** |
| RSPL (F) | 283 | 24.8 | −65.6 | 40.5 | 63.4*** | 18.0*** | – | 3.3* | 5.9** | – | 3.0** |
| LSPL (F) | 387 | −27.0 | −56.1 | 44.0 | 24.0*** | 15.6*** | – | – | 5.8** | – | 2.9* |
| Cerebellum (F) | 229 | −7.4 | −75.6 | −30.1 | – | 8.3*** | – | 4.1** | – | – | – |
| LSPL (1) | 766 | −26.8 | −72.8 | 30.9 | 95.9*** | 13.3*** | – | 3.0* | 10.9*** | – | 2.3* |
| RSPL (1) | 2142 | 31.3 | −74.7 | 22.3 | 56.3*** | 13.2*** | – | 3.3* | 8.6** | 2.7* | – |
| Above RFFA (1) | 206 | 46.0 | −54.9 | −13.0 | – | 4.9** | – |  | 5.8** | 2.8* | – |
| RSPL (2) | 684 | 26.7 | −56.3 | 46.3 | 39.0*** | 18.8*** | 5.1* | 3.0* | – | 2.6* | 3.3** |
| RINS (2) | 484 | 35.1 | 20.2 | −1.4 | – | 18.9*** | 7.2** | 4.7** | – | – | 3.1** |
| Cingulate (2) | 842 | 3.0 | 18.0 | 43.1 | – | 11.4*** | 7.6** | 3.6* | – | – | – |
| RSFG (2) | 116 | 24.0 | −0.1 | 45.8 | 23.0*** | 11.6*** | 7.0** | 3.8* | – | – | – |

– indicates nonsignificant results. Regions were isolated from featural (F), first-order (1), or second-order (2) face processing. INS = insula; IFG = inferior frontal gyrus; L = left; R = right; SPL = superior parietal; FFA = fusiform face area; SFG = superior frontal gyrus. Size is in voxels with spatial resolution of 2 × 2 × 2 mm. Minimum cluster extent for each individual contrast prior to logical combination was 225–305 voxels (Z > 3.1). However, cluster sizes could be smaller than this after logical intersection.

*p < .05.

**p < .01.

***p < .001.

Aggregate Regions

Given the large number of ROIs in Tables 1 and 2 and the fact that several of these ROIs overlapped, we aggregated the ROIs into six broad anatomical groupings: (a) AMG, (b) occipito-temporal cortex, (c) occipital cortex, (d) insula-anterior cingulate, (e) lateral frontal cortex, and (f) parietal cortex (Table 3). Percent signal change was averaged over individual ROIs in each aggregate group (by Category, Sim, and Processing Type conditions) and critical hypotheses outlined in Figure 1 were tested using repeated-measures ANOVAs (Table 3) with Category and Similarity as repeated factors and Processing Type as the between-subject factor.
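
As a rough illustration of the averaging step described above, the sketch below collapses per-ROI percent signal change into one value per aggregate group and condition for a single participant. The data layout, the function name `aggregate_psc`, and the numeric values are assumptions made for the example; they are not taken from the original analysis pipeline. Because Processing Type was a between-subject factor, each participant contributes Category × Sim cells only.

```python
from collections import defaultdict
from statistics import mean


def aggregate_psc(roi_psc, groups):
    """Average percent signal change over the ROIs in each aggregate group.

    roi_psc: {roi_name: {(category, sim_level): value}} for one participant
    groups:  {aggregate_name: [roi_name, ...]}
    Returns  {aggregate_name: {(category, sim_level): mean value}}.
    """
    aggregated = {}
    for group_name, rois in groups.items():
        by_condition = defaultdict(list)
        for roi in rois:
            for condition, value in roi_psc[roi].items():
                by_condition[condition].append(value)
        aggregated[group_name] = {c: mean(v) for c, v in by_condition.items()}
    return aggregated


# Made-up values for one participant at sim level 1 (illustration only).
roi_psc = {
    "RAMG": {("face", 1): 0.25, ("house", 1): 0.125},
    "LAMG": {("face", 1): 0.75, ("house", 1): 0.375},
}
groups = {"amygdala": ["RAMG", "LAMG"]}
result = aggregate_psc(roi_psc, groups)
print(result["amygdala"])
```

The aggregated values would then enter the same Category × Similarity × Processing Type ANOVA as the individual ROIs.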

Table 3. 

Aggregate ROI Analysis

| Region | Components | Category, F(1, 48) | Similarity, F(3, 144) | Processing Type, F(2, 48) | C × S, F(3, 144) | C × P, F(6, 144) | P × S, F(6, 144) | C × P × S, F(6, 144) |
|---|---|---|---|---|---|---|---|---|
| Amygdala | RAMG (FL), LAMG (FL) | 21.8*** | – | – | – | – | – | – |
| Occipital | RoLOC (FL), RtLOC (FL), RfLOC (FL), RCAS (FL), LoLOC (FL), LtLOC (FL), LfLOC (FL) | 111.1*** | 5.1** | – | 3.3* | 13.0*** | – | – |
| Occipito-temporal | RFFA (FL), ROFA (FL), Above FFA (1) | 9.7* | 2.8* | – | 3.7* | 4.9* | 2.2* | – |
| Parietal | RSPL (F), LSPL (F), RSPL (1), LSPL (1), RSPL (2) | 60.9*** | 17.8*** | – | 3.0* | 7.2** | 2.4* | 2.8* |
| Insula-cingulate | RINS (F), LINS (F), Cingulate (F), RINS (2), Cingulate (2) | – | 14.4*** | 8.5** | 5.2** | – | – | 3.1** |
| Lateral prefrontal | RIFG (FL), RIFG (F), LIFG (F), RSFG (2) | – | 9.4*** | 3.8* | 3.9* | – | – | 3.1** |

– indicates nonsignificant results. Regions were isolated from face localizer (FL), featural (F), first-order (1), or second-order (2) matching. AMG = amygdala; CAS = calcarine sulcus; FFA = fusiform face area; IFG = inferior frontal gyrus; INS = insula; fLOC = lateral occipital complex, fusiform portion; oLOC = lateral occipital complex, occipital portion; tLOC = lateral occipital complex, temporal portion; OFA = occipital face area; SFG = superior frontal gyrus; SPL = superior parietal.

*p < .05.

**p < .01.

***p < .001.

In Figure 5, the AMG showed a face preference but no effect of Similarity, Processing Type, or interactions. Occipital cortex showed a significant Category × Similarity interaction, but Sim effects were significant for both faces (p = .002) and houses (p = .044). The interaction was driven by slightly different shapes of the similarity function for each category. The occipito-temporal region showed a significant Category × Similarity interaction in which the Sim effect was significant for faces (p = .006) but not for houses (p = .112), consistent with domain specificity. However, the significant Processing Type × Sim interaction (consistent with process specificity) indicated that the Sim effect was significant only for first-order (p = .001) but not for featural (p = .12) or second-order (p = .45) processing. The remaining three regions (insula-cingulate, lateral prefrontal, and parietal cortex) showed evidence for generalized perceptual differentiation given the significant Category × Similarity × Processing Type interactions. In these regions, simple effects of Sim were significant for featural faces and first-order houses. Lateral prefrontal and parietal cortex also showed a significant simple effect of Sim for first-order faces, whereas insula-cingulate cortex also showed a significant simple effect of Sim for second-order faces (Figure 6).

Figure 5. 

fMRI signal plotted as a function of similarity, category, and processing type in three aggregate ROIs: amygdala, occipital region, and occipito-temporal region.

Figure 6. 

fMRI signal plotted as a function of similarity, category, and processing type in three aggregate ROIs: insula-cingulate region, lateral prefrontal region, and parietal region.

A supplemental ANOVA also explored hemispheric differences in aggregate regions (ANOVA included a repeated factor “hemisphere” for all but the occipito-temporal region because that was composed only of right-hemisphere regions). The main effect of Hemisphere was significant for the occipital (right > left, F(1, 48) = 69.5, p = .0001), insula-cingulate (right > left, F(1, 48) = 68.2, p = .0001), and lateral frontal (left > right, F(1, 48) = 10.8, p = .002) regions. A Hemisphere × Category × Processing Type interaction (F(2, 48) = 8.1, p = .001) emerged in the lateral frontal region indicating greater left-hemisphere activation for featural face and house processing but no hemisphere effect for the other conditions and no interactions with similarity, suggesting this preference is not related to perceptual differentiation.

Concerns about Task Difficulty

To address whether fMRI signal in the matching-task ROIs was driven by task difficulty, apart from the similarity manipulation, we examined the association between percent signal change relative to baseline in the face conditions (averaged across sim level) and behavioral performance (averaged across sim level) in these regions (see Joseph & Gathers, 2003). We chose to examine signal magnitude for this analysis rather than cluster extent given that cluster extent depends heavily on arbitrary thresholding. If fMRI signal is driven by task difficulty that is not related to similarity modulation, then participants who performed more poorly (as indexed by longer RT and more errors) should also produce greater signal. In other words, correlations between fMRI signal and performance should be positive if task difficulty drives responses in a given region. Spearman correlations were conducted for each ROI in Table 2. In featural ROIs, this correlation was conducted using only the subjects in the featural condition and likewise for the first- and second-order ROIs. Only the right superior frontal gyrus showed a negative correlation between face percent signal change and face logRT in the second-order condition. Higher fMRI signal was associated with faster responding, which indicates that this region was associated with better performance, not task difficulty. Therefore, greater fMRI signal in similarity-modulated regions reflects perceptual differentiation rather than greater effort or resources devoted to processing, apart from the similarity manipulation.
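
For readers unfamiliar with the statistic, a Spearman correlation is a Pearson correlation computed on ranks. The sketch below is a minimal pure-Python version with average ranks for ties; it is an illustration rather than the software used in the study, it omits the significance test, and the example values are made up. A positive coefficient between signal and logRT is the pattern that a task-difficulty account would predict.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks.

    Raises ZeroDivisionError if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up per-participant values for one ROI (illustration only).
face_signal = [0.2, 0.5, 0.3, 0.6]   # percent signal change
face_logrt = [3.2, 3.0, 3.1, 2.9]    # mean log-transformed RT
print(spearman_rho(face_signal, face_logrt))  # -1.0: more signal, faster responses
```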

DISCUSSION

This study examined the degree to which face and object processing regions exhibit domain specificity (i.e., perceptual differentiation of faces or houses but little sensitivity to processing type), process specificity (i.e., perceptual differentiation for processing type but little sensitivity for a given category), category-optimized processing (i.e., perceptual differentiation for configural face processing or featural house processing), or generalized perceptual differentiation (i.e., perceptual differentiation crossing category and processing type distinctions). Similarity was parametrically varied based on three different processing types, which directly manipulated the component process of perceptual differentiation of featural, first-order or second-order information in faces and houses. Evidence for a strong domain-specific account of perceptual differentiation was minimal whereas evidence for generalized perceptual differentiation was more abundant. Each of the different accounts is discussed below.
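
One way to make the four accounts concrete is to ask which of the six Category × Processing Type conditions show a monotonically increasing similarity effect in a region. The sketch below is a deliberately simplified mapping written for illustration; the function name, labels, and rules are assumptions, and the inferences reported here also relied on the pattern of ANOVA interactions, which this sketch ignores.

```python
# Conditions treated as "category optimized" in the text:
# second-order (configural) faces and featural houses.
CATEGORY_OPTIMIZED = {("face", "second-order"), ("house", "featural")}


def classify_account(differentiated):
    """Map the set of (category, processing_type) conditions that showed a
    monotonically increasing similarity effect onto one of the four accounts.
    The mapping is a simplification made for illustration."""
    conditions = set(differentiated)
    if not conditions:
        return "no perceptual differentiation"
    categories = {category for category, _ in conditions}
    processing_types = {ptype for _, ptype in conditions}
    if conditions <= CATEGORY_OPTIMIZED:
        return "category-optimized"
    if len(categories) == 1:
        return "domain-specific"
    if len(processing_types) == 1:
        return "process-specific"
    return "generalized"


# Differentiation for first-order faces and first-order houses only.
print(classify_account({("face", "first-order"), ("house", "first-order")}))
# Differentiation for featural faces and first-order houses.
print(classify_account({("face", "featural"), ("house", "first-order")}))
```

Under these rules the first call prints `process-specific` and the second prints `generalized`.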

Evidence for Domain Specificity

The right FFA, right IFG, and aggregate occipito-temporal region showed evidence for perceptual differentiation of faces but not houses, consistent with domain specificity. However, the right FFA/occipito-temporal region also showed first-order perceptual differentiation, but not specifically for faces, consistent with process specificity. In addition, the right IFG from the localizer task showed featural face processing, but this did not persist in the aggregate analysis. These findings suggest that the right FFA/occipito-temporal cortex is involved in both face and first-order processing and the right IFG is involved in both featural and face processing. Maurer, Le Grand, and Mondloch (2002) suggested that the right FFA is sensitive to first-order face information and may be involved in making the basic distinction between faces and objects. Following this, we suggest that the right FFA and surrounding occipito-temporal regions may process first-order information in a stimulus en route to making this face/nonface determination. In other words, this aggregate region processes first-order information in both faces and houses but also accumulates information that a stimulus is a face based on this first-order information, which leads to a bias toward processing faces over houses.

However, other findings have suggested that the right FFA is involved in featural and second-order in addition to first-order processing (Liu et al., 2010; Maurer et al., 2007; Yovel & Kanwisher, 2004). One reason that the present study did not find evidence for both featural and second-order processing in the FFA is that the present simultaneous matching task may have emphasized analytical processing of the elements of a face more than a sequential matching or 1-back task, as used in these other studies. With simultaneous matching, the discrepant features can be directly compared and perceptually analyzed within the same time interval. With sequential matching, the first stimulus must be briefly remembered to compare with the second stimulus, which may have engaged holistic processing because the individual features may have been easier to encode and remember as an integrated percept. Consequently, with sequential matching, both the configural and featural conditions may have engaged holistic strategies, so that the lack of a differential response in the right FFA may have been driven by holistic processing rather than featural or second-order processing. The right FFA shows a stronger response to holistic than to parts-based processing of faces (e.g., Axelrod & Yovel, 2010; Harris & Aguirre, 2010; Schiltz & Rossion, 2006; Rossion et al., 2000). Although the present tasks did not require holistic processing, the current findings are consistent with the idea that the right FFA shows a weaker response to featural or analytical processing.

The right and left AMG showed a preference for faces over houses in the matching task but no sensitivity to processing type and no similarity modulation. Because the faces used in this study varied little in terms of facial expression, the AMG activation was not likely related to processing emotion or expression. The AMG has been described as a salience detector (Santos, Mier, Kirsh, & Meyer-Lindenberg, 2011) rather than a region that responds only to threatening stimuli. All faces, regardless of emotional content, are salient to humans and may be given attentional priority for processing (Palermo & Rhodes, 2007). We suggest that the AMG may be involved in detecting the presence of faces. Differential processing of featural and configural information in the AMG did not emerge, but Sato, Kochiyama, and Yoshikawa (2011) reported that the AMG showed a reduced response to inversion, thereby implicating a role in configural processing. However, inversion is an indirect test of configural processing and may make a stimulus less face-like and, consequently, less salient, thereby reducing the AMG response.

Evidence for Process Specificity

Consistent with process specificity, some regions showed evidence for processing only first-order information, regardless of category (right FFA, right oLOC, and left tLOC). However, as discussed, sensitivity that was exclusive to first-order processing did not persist in the aggregate analysis and the right FFA and occipito-temporal cortex also showed evidence for domain specificity. Although the evidence was somewhat weak for first-order specificity, we suggest that the general function of regions sensitive to first-order processing is to initially determine whether a stimulus is a face or nonface. The present task did not require this determination, but fMRI signal modulation by first-order information suggests sensitivity to disruptions in first-order processing, which implies that these regions normally process first-order information. The left lateral frontal cortex showed a preference for featural processing, but not specific for faces. This appears to be the only region that showed evidence for process specificity, but the hemispheric modulation did not interact with similarity. Nevertheless, preference for featural processing in the left lateral frontal cortex is consistent with Maurer et al. (2007) for face stimuli, but the present study showed that this preference is not face-specific.

Evidence for Category-optimized Processing

Evidence for category-optimized processing was minimal—no regions showed perceptual differentiation of configural face or featural house information. This is surprising given the importance of second-order processing for faces (e.g., Diamond & Carey, 1986). Potentially, the lack of evidence for category-optimized processing was due to using a perceptual matching task rather than face identification or emotion recognition, which may preferentially emphasize different types of perceptual information in faces. Kadosh, Henson, Kadosh, Johnson, and Dick (2010) examined changes in identity, expression, and gaze and found that fusiform and inferior occipital activation was highly overlapping for identity and expression processing. They suggested that this overlap was due to demands on featural and configural processing. Similarly, psychophysical studies have shown that featural and configural information processing are not as separable as once thought (Sekuler, Gaspar, Gold, & Bennett, 2004). The present results similarly showed that featural and second-order face processing are not very separable in terms of neural substrates.

Evidence for Generalized Perceptual Differentiation

Nearly all regions showed processing of more than one category and processing type despite the fact that the voxel-wise analyses isolated regions that either preferred faces (in the localizer run) or showed perceptual processing of faces in the task runs. Regions involved in differentiation of faces almost always differentiated houses. This is not surprising given that perceptual differentiation is a component process of discriminating items within visually homogenous categories. Many of the regions typically attributed to face processing may instead reflect a process of making fine distinctions among stimuli that are highly similar in shape. The use of house stimuli that were well matched with the face stimuli in terms of number of features and the spatial relations of those features revealed very few category differences. Instead, the degree of perceptual similarity was a stronger influence on performance and fMRI signal in most regions. In addition, the influence of perceptual similarity was not driven by task difficulty because many regions that showed differentiation in the more difficult conditions (e.g., second-order or featural processing) also showed differentiation in the easier condition (first-order processing).

The right OFA showed generalized perceptual differentiation that was not qualified by higher-order interactions. Others have suggested that the right OFA is as essential as the right FFA in face processing (Rossion et al., 2003), shows more face specialization than the right FFA in adults (Joseph et al., 2011), acts early in the face processing stream (Harris & Aguirre, 2008) by passing along information to the FFA (Fairhall & Ishai, 2007; Haxby et al., 2000), and builds up a face representation in a hierarchical manner by analytically processing features (Pitcher, Walsh, Yovel, & Duchaine, 2007). This study did not show that the right OFA was preferentially sensitive to featural face information, in contrast to Pitcher et al.'s study in which rTMS disrupted 1-back matching of faces that differed in featural but not in second-order information. They also showed that the disruption of featural processing occurred only in an earlier (60–110 msec following stimulus onset) but not in later time windows. Because fMRI cannot resolve processing at the same temporal resolution as double-pulse TMS, this effect could not be detected in this study. However, the preference for featural face processing was not dominant enough to drive the responding in the right OFA in this study.

The present finding of generalized perceptual differentiation in the right OFA is consistent with another study (Haist et al., 2010) showing that the right OFA was involved in differentiating stimuli from visually homogenous categories (faces or watches). The right OFA was slightly more sensitive to perceptual differentiation than the right FFA. On the basis of that finding and the present results, we suggest that the right OFA is involved in perceptual differentiation of items within the same category, as opposed to making a face versus nonface distinction which relies on first-order information (as in the right FFA). Generalized perceptual differentiation in the service of making fine within-category distinctions is consistent with the idea that the right OFA acts early in processing (Fairhall & Ishai, 2007; Haxby et al., 2000; cf., Kadosh et al., 2010) and is as essential as the right FFA in face discrimination (Rossion et al., 2003).

Cortical Distribution of Information Processing

The matching task was associated with only minimal activation in occipito-temporal regions (except the right fusiform region from first-order face matching), which may be surprising in light of many studies that have isolated functional regions like the FFA, OFA, and LOC. However, face localizer tasks (which strongly implicate occipito-temporal regions) do not necessarily isolate processing that is relevant for higher-level face processing (Berman et al., 2010; Ng, Ciaramitaro, Anstis, Boynton, & Fine, 2006). In addition, others have noted that face processing relies on an extended network (Haxby et al., 2000), including frontal regions (Chan & Downing, 2011; Fairhall & Ishai, 2007; Maurer et al., 2007). Prior studies have also demonstrated that superior parietal cortex is involved in perceptual discrimination of nonface items that are highly similar in shape (e.g., Joseph & Gathers, 2003), consistent with the present findings.

The heavy involvement of frontal and parietal regions in perceptual differentiation of two visually homogenous categories, coupled with the finding that category and processing type effects were not purely additive in most brain regions, suggests that processing different kinds of perceptual information likely occurs in a distributed brain system. Interestingly, the aggregate analysis showed that information processing in the AMG was described only by a main effect of category, but in the LOC/occipital cortex, the category effect was further qualified by two 2-way interactions whereas in the occipito-temporal region, all three 2-way interactions were significant. In parietal and frontal regions the higher-order three-way interactions were significant. This suggests that information processing in regions associated with “early” processing stages (AMG, occipital, or occipito-temporal cortex) is driven by category or processing type but not by the integration of that information. Regions associated with higher-order processing, however, show more complex integration of information (as indexed by the three-way interactions). Potentially, a process of evidence accumulation occurs simultaneously in multiple brain regions during the matching task before making a final perceptual decision, as described by Ploran et al. (2007). In other words, many regions are involved in perceptual differentiation, but these regions interact and further qualify the perceptual differentiation based on category or processing type in other regions. Face processing, then, may rely on some of the same cognitive operations (and neural substrates) that are engaged for object processing, but face processing is distinguished from object processing by the interaction of multiply activated regions that accumulate perceptual evidence in favor of faces. 
The distributed nature of the information processing may reflect the fact that perceptual differentiation is a component process of many higher-order face tasks such as identification or emotion recognition. Had these other tasks been employed, there might have been greater evidence for domain-specific or category-optimized processing.

Acknowledgments

This publication was supported by NIH grant R01 HD 052724-04 and by a pilot grant from Autism Speaks. The contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH or Autism Speaks.

Reprint requests should be sent to Jane E. Joseph, Department of Neurosciences, Medical University of South Carolina, 19 Hagood Avenue, Harborview Office Tower Suite 806, P.O. Box 250212, MSC 212, Charleston, SC 29425, or via e-mail: josep@musc.edu.

REFERENCES

Axelrod, V., & Yovel, G. (2010). External facial features modify the representation of internal facial features in the fusiform face area. Neuroimage, 52, 720–725.
Berman, M. G., Park, J., Gonzalez, R., Polk, T. A., Gehrke, A., Knaffla, S., et al. (2010). Evaluating functional localizers: The case of the FFA. Neuroimage, 50, 56–71.
Chan, A. W. Y., & Downing, P. E. (2011). Faces and eyes in human lateral prefrontal cortex. Frontiers in Human Neuroscience, 5, 51.
Diamond, R., & Carey, S. (1986). Why faces are and are not special: An effect of expertise. Journal of Experimental Psychology: General, 115, 107–117.
Drucker, D. M., & Aguirre, G. K. (2009). Different spatial scales of shape similarity representation in lateral and ventral LOC. Cerebral Cortex, 19, 2269–2280.
Fairhall, S. L., & Ishai, A. (2007). Effective connectivity within the distributed cortical network for face perception. Cerebral Cortex, 17, 2400–2406.
Farkas, L. G. (1994). Anthropometry of the head and face in medicine (2nd ed.). New York: Elsevier.
Gauthier, I., Tarr, M. J., Anderson, A. W., Skudlarski, P., & Gore, J. C. (1999). Activation of the middle fusiform “face area” increases with expertise in recognizing novel objects. Nature Neuroscience, 2, 568–573.
Gauthier, I., Tarr, M. J., Moylan, J., Skudlarski, P., Gore, J. C., & Anderson, A. W. (2000). The fusiform “face area” is part of a network that processes faces at the individual level. Journal of Cognitive Neuroscience, 12, 495–504.
Haist, F., Lee, K., & Stiles, J. (2010). Individuating faces and common objects produces equal responses in putative face-processing areas in the ventral occipitotemporal cortex. Frontiers in Human Neuroscience, 4, 181.
Harris, A. M., & Aguirre, G. K. (2008). The effects of parts, wholes, and familiarity on face-selective responses in MEG. Journal of Vision, 8, 1–12.
Harris, A. M., & Aguirre, G. K. (2010). Neural tuning for face wholes and parts in human fusiform gyrus revealed by fMRI adaptation. Journal of Neurophysiology, 104, 336–345.
Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural system for face perception. Trends in Cognitive Sciences, 4, 223–233.
Hertzog, C., & Rovine, M. (1985). Repeated measures analysis of variance in developmental research: Selected issues. Child Development, 56, 787–809.
Joseph, J. E. (2001). Functional neuroimaging studies of category specificity in object recognition: A critical review and meta-analysis. Cognitive, Affective, & Behavioral Neuroscience, 1, 119–136.
Joseph, J. E., & Gathers, A. D. (2002). Natural and manufactured objects activate the “fusiform face area”. NeuroReport, 13, 935–938.
Joseph, J. E., & Gathers, A. D. (2003). Effects of structural similarity on neural substrates for object recognition. Cognitive, Affective, & Behavioral Neuroscience, 3, 1–16.
Joseph, J. E., Gathers, A. D., & Bhatt, R. (2011). Progressive and regressive developmental changes in neural substrates for face processing: Testing specific predictions of the Interactive Specialization account. Developmental Science, 14, 227–241.
Joseph, J. E., Gathers, A. D., Liu, X., Corbly, C. R., Whitaker, S. K., & Bhatt, R. S. (2006). Neural developmental changes in processing inverted faces. Cognitive, Affective, & Behavioral Neuroscience, 6, 223–235.
Joseph, J. E., Partin, D. J., & Jones, K. M. (2002). Hypothesis testing for selective, differential, and conjoined brain activation. Journal of Neuroscience Methods, 118, 129–140.
Kadosh, K. C., Henson, R. N., Kadosh, R. C., Johnson, M. H., & Dick, F. (2010). Task-dependent activation of face-sensitive cortex: An fMRI adaptation study. Journal of Cognitive Neuroscience, 22, 903–917.
Kahn, D. A., Harris, A. M., Wolk, D. A., & Aguirre, G. K. (2010). Temporally distinct neural tuning of perceptual similarity and prototype bias. Journal of Vision, 10, 1–12.
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4301–4311.
Keppel, G., & Zedeck, S. (1989). Data analysis for research designs. New York: Freeman.
Leube, D. T., Yoon, H. W., Rapp, A., Erb, M., Grodd, W., Bartels, M., et al. (2003). Brain regions sensitive to the face inversion effect: A functional magnetic resonance imaging study in humans. Neuroscience Letters, 342, 143–146.
Liu, J., Harris, A., & Kanwisher, N. (2010). Perception of face parts and face configurations: An fMRI study. Journal of Cognitive Neuroscience, 22, 203–211.
Liu, X., Steinmetz, N. A., Farley, A. B., Smith, C. D., & Joseph, J. E. (2008). Mid-fusiform activation during object discrimination reflects the process of differentiating structural descriptions. Journal of Cognitive Neuroscience, 20, 1711–1726.
Lobmaier, J. S., Klaver, P., Loenneker, T., Martin, E., & Mast, F. W. (2008). Featural and configural face processing strategies: Evidence from a functional magnetic resonance imaging study. NeuroReport, 19, 287–291.
Malach, R., Reppas, J., Benson, R., Kwong, K., Jiang, H., Kennedy, W., et al. (1995). Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. Proceedings of the National Academy of Sciences, U.S.A., 92, 8135–8139.
Maurer, D., Craven, K. M., Le Grand, R., Mondloch, C. J., Springer, M. V., Lewis, C. L., et al. (2007). Neural correlates of processing facial identity based on features versus their spacing. Neuropsychologia, 45, 1438–1451.
Maurer, D., Le Grand, R., & Mondloch, C. J. (2002). The many faces of configural processing. Trends in Cognitive Sciences, 6, 255–260.
Ng, M., Ciaramitaro, V. M., Anstis, S., Boynton, G. M., & Fine, I. (2006). Selectivity for the configural cues that identify the gender, ethnicity, and identity of faces in human cortex. Proceedings of the National Academy of Sciences, U.S.A., 103, 19552–19557.
Palermo, R., & Rhodes, G. (2007). Are you always on my mind? A review of how face perception and attention interact. Neuropsychologia, 45, 75–92.
Pitcher, D., Walsh, V., Yovel, G., & Duchaine, B. (2007). TMS evidence for the involvement of the right occipital face area in early face processing. Current Biology, 17, 1568–1573.
Ploran, E. J., Nelson, S. M., Velanova, K., Donaldson, D. I., Petersen, S. E., & Wheeler, M. E. (2007). Evidence accumulation and the moment of recognition: Dissociating perceptual decision processes using fMRI. Journal of Neuroscience, 27, 11912–11924.
Rhodes, G., Byatt, G., Michie, P. T., & Puce, A. (2004). Is the fusiform face area specialized for faces, individuation, or expert individuation? Journal of Cognitive Neuroscience, 16, 189–203.
Rossion, B., Caldara, R., Seghier, M., Schuller, A., Lazeyras, F., & Mayer, E. (2003). A network of occipito-temporal face-sensitive areas besides the right middle fusiform gyrus is necessary for normal face processing. Brain, 126, 2381–2395.
Rossion, B., Dricot, L., Devolder, A., Bodart, J. M., Crommelinck, M., De Gelder, B., et al. (2000). Hemispheric asymmetries for whole-based and part-based face processing in the human fusiform gyrus. Journal of Cognitive Neuroscience, 12, 793–802.
Santos, A., Mier, D., Kirsh, A., & Meyer-Lindenberg, A. (2011). Evidence for a general face salience signal in human amygdala. Neuroimage, 54, 3111–3116.
Sato, W., Kochiyama, T., & Yoshikawa, S. (2011). The inversion effect for neutral and emotional facial expressions on amygdala activity. Brain Research, 1378, 84–90.
Schiltz, C., & Rossion, B. (2006). Faces are represented holistically in the human occipito-temporal cortex. Neuroimage, 32, 1385–1394.
Sekuler, A. B., Gaspar, C. M., Gold, J. M., & Bennett, P. J. (2004). Inversion leads to quantitative, not qualitative, changes in face processing. Current Biology, 14, 391–396.
Worsley, K. J. (2001). Statistical analysis of activation images. In P. Jezzard, P. M. Matthews, & S. M. Smith (Eds.), Functional MRI: An introduction to methods (pp. 251–270). Oxford: Oxford University Press.
Xu, Y. (2005). Revisiting the role of the fusiform face area in visual expertise. Cerebral Cortex, 15, 1234–1242.
Yovel, G., & Kanwisher, N. (2004). Face perception: Domain specific, not process specific. Neuron, 44, 889–898.