A fundamental question for social cognitive neuroscience is how and where in the brain the identities and actions of others are represented. Here we present a replication and extension of a study by Kable and Chatterjee [Kable, J. W., & Chatterjee, A. Specificity of action representations in the lateral occipito-temporal cortex. Journal of Cognitive Neuroscience, 18, 1498–1517, 2006] examining the role of occipito-temporal cortex in these processes. We presented full-cue movies of actors performing whole-body actions and used fMRI to test for action- and identity-specific adaptation effects. We examined a series of functionally defined regions, including the extrastriate and fusiform body areas, the fusiform face area, the parahippocampal place area, the lateral occipital complex, the right posterior superior temporal sulcus, and motion-selective area hMT+. These regions were analyzed with both standard univariate measures and multivoxel pattern analyses. Additionally, we performed whole-brain tests for significant adaptation effects. We found significant action-specific adaptation in many areas, but no evidence for identity-specific adaptation. We argue that this finding could be explained by differences in the familiarity of the stimuli presented: The actions shown were familiar but the actors performing the actions were unfamiliar. However, in contrast to previous findings, we found that the action adaptation effect could not be conclusively tied to specific functionally defined regions. Instead, our results suggest that the adaptation to previously seen actions across identities is a widespread effect, evident across lateral and ventral occipito-temporal cortex.
Humans are extremely social creatures and so the identities and actions of others are highly significant to us. One of the fundamental socio-cognitive functions of the brain is to work out who others are, what they are doing, and what their intentions are. This information is extracted from the face, the body, and their characteristic movements. How the brain areas supporting the perception of other individuals are organized is an important question for social cognitive neuroscience. In this context, the focus of the current article is on the role of ventrolateral occipito-temporal brain areas in action perception.
fMRI studies in humans have revealed a number of ventral brain regions involved in our perception of other people. Much of the work has focused on face perception, and there is growing evidence for face-specific brain mechanisms in the fusiform gyrus (fusiform face area [FFA]; Kanwisher & Yovel, 2006; Grill-Spector, Knouf, & Kanwisher, 2004; Kanwisher, McDermott, & Chun, 1997) and in the inferior occipital gyrus (occipital face area [OFA]; Large, Cavina-Pratesi, Vilis, & Culham, 2008; Rotshtein, Henson, Treves, Driver, & Dolan, 2005; Haxby, Hoffman, & Gobbini, 2000). Additional evidence for face selectivity in these areas comes from studies using transcranial magnetic stimulation (TMS) (e.g., Pitcher, Walsh, Yovel, & Duchaine, 2007), neuropsychological lesion studies (e.g., Rossion et al., 2003), and intracranial EEG recordings (Krolak-Salmon, Henaff, Vighetto, Bertrand, & Mauguiere, 2004; Klopp, Marinkovic, Chauvel, Nenov, & Halgren, 2000; McCarthy, Puce, Belger, & Allison, 1999).
Furthermore, comparisons of the brain responses to images of human bodies and body parts to control images have revealed focal regions in occipito-temporal cortex that respond strongly and selectively to static images of human bodies and body parts, but weakly to other object categories (including faces). These regions have been labeled the extrastriate body area (EBA; Downing, Jiang, Shuman, & Kanwisher, 2001) and fusiform body area (FBA; Peelen & Downing, 2005; Schwarzlose, Baker, & Kanwisher, 2005), respectively (for reviews, see also Minnebusch & Daum, 2009; Peelen & Downing, 2007a). Both body-selective regions have been shown to be modulated by socially relevant cues such as the emotion expressed by body posture (Grezes, Pichon, & de Gelder, 2007; Peelen, Atkinson, Andersson, & Vuilleumier, 2007; de Gelder & Hadjikhani, 2006; Hadjikhani & de Gelder, 2003). Also, FBA appears to be modulated by familiarity: A recent study by Hodzic, Kaas, Muckli, Stirn, and Singer (2009) found that FBA responded more to familiar (self and familiar other) than to unfamiliar bodies, whereas EBA showed no differential activation. Most of the work on these regions has been done using static images (e.g., photographs) of faces and bodies. However, these regions are also reliably activated by dynamic displays such as point-light animations. For example, Grossman, Blake, and Kim (2004) reported activations to point-light animations in the fusiform gyrus (see also Santi, Servos, Vatikiotis-Bateson, Kuratate, & Munhall, 2003; Grossman & Blake, 2002). Point-light animations have also been reported to activate the posterior inferior temporal sulcus/middle temporal gyrus (Michels, Lappe, & Vaina, 2005; Peuskens, Vanrie, Verfaillie, & Orban, 2005; Saygin, Wilson, Hagler, Bates, & Sereno, 2004) where EBA is typically located.
A recent study (Peelen, Wiggett, & Downing, 2006) examined these responses to point-light animations in the inferior temporal sulcus more closely. The main question was whether the response to point-light stimuli is driven by the presence of body-selective voxels from EBA or motion-selective voxels from nearby motion-selective areas (human MT complex or hMT+; Kourtzi & Kanwisher, 2000; Tootell et al., 1995). The results showed that only body selectivity was correlated, on a voxel-by-voxel basis, with biological motion selectivity, whereas motion selectivity was uncorrelated with biological motion selectivity. This suggests that the lateral-temporal activations to point-light animations are mainly due to activation of EBA. As EBA also responds highly selectively to images that contain very limited visual cues, for example, stick figures (Peelen & Downing, 2005; Downing et al., 2001), it seems likely that the presence of the form of the body (defined by structure-from-motion) in the point-light displays is crucial to EBA. Hence, we have suggested that the region may be largely “naïve” about the patterns of changing posture that comprise biological actions (Peelen & Downing, 2007a; Peelen et al., 2006; Giese & Poggio, 2003; but see Jastorff & Orban, 2009; Lange & Lappe, 2006).
The dynamic aspects of action perception are commonly considered to take place elsewhere in the “social” brain: Firstly, regions of the right hemisphere STS respond selectively to human biological motions (presented as point-light walkers, movies, or animations) as well as face/eye movements (Calvo-Merino, Glaser, Grezes, Passingham, & Haggard, 2005; Hooker et al., 2003; Pelphrey et al., 2003; Grezes et al., 2001; Grossman et al., 2000; Hoffman & Haxby, 2000). Secondly, frontal and parietal areas are activated by both the perception and the production of simple actions. These areas are commonly referred to as the “mirror neuron system” (Buccino et al., 2004; Rizzolatti & Craighero, 2004; Saygin et al., 2004; Rizzolatti, Fogassi, & Gallese, 2001; but see Lingnau, Gesierich, & Caramazza, 2009; Dinstein, Gardner, Jazayeri, & Heeger, 2008).
There is evidence to suggest that body-selective areas in extrastriate cortex are functionally dissociated from the posterior STS (Saxe, Xiao, Kovacs, Perrett, & Kanwisher, 2004) and from frontal and parietal “mirror” areas. For example, in one study (Kontaris, Wiggett, & Downing, 2009), we presented participants with a “live” view of their own manual movements. This situation appears to suppress the BOLD response in the pSTS, compared to a control condition in which the same movements are viewed while the participant performs different (mismatching) actions. In contrast, EBA and FBA respond equally under these conditions, and at the same level as a condition in which no movements are executed. These findings suggest a bias for the movements of other individuals in the pSTS, and a visual representation of body parts in EBA and FBA that is decoupled from the motor system.
Similarly, we have also used fMRI to examine the brain response in a number of cortical regions to short animations of whole-body actions (Downing, Peelen, Wiggett, & Tew, 2006). The actions were presented as static frames in either the correct or incorrect sequence. Frontal and parietal “mirror” areas, and the pSTS, showed a higher response to coherent versus incoherent sequences. However, EBA and FBA showed a greater response to the incoherent compared to the coherent sequences, perhaps due to the increased frame-to-frame differences in body posture in the incoherent sequences, which could lead to a higher response in areas that process individual snapshots of the body posture, rather than the whole action.
Converging evidence supports this view. Urgesi, Candidi, Ionta, and Aglioti (2007) tested whether the processing of bodies was impaired by the disruption of activity in EBA or ventral premotor cortex (vPMC). Subjects had to discriminate the identity or the implied action of a body part. The authors found a dissociation: “Identity” (form) discrimination was impaired when EBA was disrupted by TMS, whereas “action” discrimination was impaired after disruption of vPMC. Furthermore, by combining psychophysical studies with lesion-mapping techniques, Moro et al. (2008) have recently shown body form perception deficits to be associated with lesions to EBA, whereas action recognition deficits were associated with ventral premotor lesions, further supporting the idea of a functional dissociation between EBA and the “mirror neuron system.” Finally, ERP evidence from 8-month-old infants shows distinct neural signatures underpinning the detection of violations of body structure and of dynamic body movements (Reid, Hoehl, Landt, & Striano, 2008).
In contrast to the above picture, a recent paper (Kable & Chatterjee, 2006) proposes direct involvement of EBA in dynamic aspects of action perception. The authors used an adaptation approach (cf. Epstein, Higgins, & Thompson-Schill, 2005; Vuilleumier, Henson, Driver, & Dolan, 2002; Kourtzi & Kanwisher, 2001; Henson, Shallice, & Dolan, 2000; Buckner et al., 1998) to measure the brain responses to short, full-cue movies of actors performing simple actions. By manipulating both the identity of the actor and the action being performed, Kable and Chatterjee tested which ventral areas are involved in processing these two dimensions. The stimulus set comprised movies of many different people performing many kinds of simple transitive and intransitive actions (e.g., picking up a book, kicking). First, the participants were exposed to a subset of movies. Movies of this subset were shown several times during the first three scans. In later scans, four different types of movies were presented: the old movies; an entirely new set comprising actions and actors that had not been seen before; a “new action” condition where previously encountered actors performed new actions; and a “new actor” condition where previously encountered actions were performed by new actors. Participants performed a speeded judgment task indicating how often they typically see each action being performed in daily life.
Kable and Chatterjee (2006) found widespread repetition effects: The response to the previously seen movies was significantly lower across many brain areas compared to the response to the new movies. Furthermore, the results revealed significant action-specific adaptation in EBA, MT/MST,¹ and pSTS: Lower responses to previously seen actions were also found when those actions were performed by new actors. There was no evidence for identity-specific adaptation in any of the regions of interest tested (EBA, FFA, STS, MT/MST and lateral occipital complex [LO]; Kourtzi & Kanwisher, 2001; Grill-Spector et al., 1999; Malach et al., 1995). Kable and Chatterjee interpreted these findings as “evidence that representations in pSTS, MT/MST, and EBA abstract actions from the agents involved and distinguish between different particular actions” (p. 1498).
Here, we present a replication and extension of Kable and Chatterjee's study in an attempt to better understand the (apparently conflicting) relationship between their findings and other previous work reviewed above. There are two main points we wish to highlight: (a) the participants' task, and (b) the limitations of univariate region-of-interest (ROI) analyses. In Kable and Chatterjee's study, participants were instructed to decide, on each trial, whether the action presented was an action they performed often or rarely. The advantage of this task is that it requires a response on every trial and so helps to maintain the participants' attention. However, a shortcoming of the task may be that it requires participants to pay attention only to the action, not to the identity of the person performing the action. Thus, it is possible that the action-specific priming found in Kable and Chatterjee (2006) was a result of the task directing attention to only one dimension of the stimulus. Attention has been shown to modulate activation of early visual areas (Gandhi, Heeger, & Boynton, 1999) as well as extrastriate visual areas (Chawla, Rees, & Friston, 1999; O'Craven, Downing, & Kanwisher, 1999; Wojciulik, Kanwisher, & Driver, 1998), and it therefore seems important to employ a task that requires attention to be allocated to both dimensions of a stimulus. Accordingly, in the present study, participants were shown a specific action/actor combination (the “target”) before going in the scanner. The task was to press a response button whenever the target movie was presented. In order to ensure attention to this combination specifically, foils were also shown, that is, movies showing the target action but performed by a different person, and the target actor performing a different action.
The second point concerns the type of analyses performed. Kable and Chatterjee (2006) conclude that both EBA and MT/MST are functionally involved in action perception. However, caution is needed when interpreting activations as a reflection of the activity in a specific region when this interpretation is simply based on the average peak activation coordinates. This is especially true when functional regions occupy neighboring or even overlapping cortex, as is the case for EBA and hMT+ (LO is also found in this area of cortex; Grill-Spector et al., 1999). Due to this close cortical proximity, each of these ROIs, depending on how they are defined, may contain a mixture of voxels responding to visual motion, body parts, and object form. In this sense, although the ROIs are defined functionally and within individual participants, they may not purely capture a single selective population. In a previous study (Downing, Wiggett, & Peelen, 2007), we addressed this point by using a simple analysis of patterns of selectivity in this region. hMT+, LO, and EBA were functionally defined in individual subjects, and were located in highly similar cortical locations within subjects. We found, in each ROI, significant selectivity for the other stimulus types, that is, EBA voxels were, on average, also motion-selective, and hMT+ voxels were also body-selective and so on. Multivoxel correlation analyses, however, revealed that the patterns of body, motion, and object selectivity were uncorrelated across voxels, suggesting that there are three relatively independent patterns of neural activity in lateral occipital–temporal cortex.
Accordingly, in the present study, we adapted similar MVPA methods in order to test whether priming effects seen in “crowded” regions such as lateral occipito-temporal cortex can be attributed to specific underlying neural populations. If, for example, body-selective representations are selectively primed by observed actions, then there should be a direct relationship between the strength of body selectivity (but not other kinds of selectivity) and the strength of the priming effect, across the voxels within a given ROI. Because of the power required to detect subtle relationships between priming effects and category selectivity at a voxelwise level, we also doubled the number of subjects (relative to Kable & Chatterjee's 2006 study).
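The logic of this voxelwise test can be illustrated with a minimal sketch. It assumes that each voxel in an ROI has already been summarized by a selectivity value and a priming value; the variable names and all numbers are hypothetical, and a plain Pearson correlation stands in for whatever correlation measure is actually applied.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-voxel values within one ROI (illustrative numbers only).
body_selectivity = [0.2, 0.9, 1.4, 0.1, 1.1]    # e.g., bodies minus chairs
motion_selectivity = [1.0, 0.3, 0.2, 0.9, 0.4]  # e.g., moving minus static rings
action_priming = [0.1, 0.5, 0.8, 0.0, 0.6]      # e.g., new action minus old action

# If body-selective populations carry the priming effect, the first
# correlation should be reliably positive and the second should not.
r_body = pearson_r(body_selectivity, action_priming)
r_motion = pearson_r(motion_selectivity, action_priming)
```

In practice such correlations would be computed per participant and then tested against zero across participants.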
Eighteen healthy adult volunteers (mean age = 24 years; range = 20–43 years; 11 women, 7 men) were recruited from the Bangor University community. Participants satisfied all requirements in volunteer screening and gave informed consent. All experimental procedures were approved by the Ethics Committee of the School of Psychology at Bangor University. Participation was compensated at £20 per session. A further five volunteers were tested, but were excluded from the study due to excessive head movement and/or equipment failure.
Design and Procedure
Participants were scanned on a series of blocked-design localizer scans to identify functional ROIs with respect to individual brain anatomy. We used four localizer experiments; the basic design was identical for all four. Each experiment consisted of twenty-one 16-sec blocks in which Blocks 1, 6, 11, 16, and 21 were fixation-only baseline epochs.
The localizer used to select body-, face-, and scene-selective areas consisted of blocks of images of human bodies (without heads), unfamiliar faces, outdoor scenes, and chairs. Each condition was presented in four 16-sec blocks within one scan. In each block, 20 different images from one category were presented (300 msec on/500 msec off). Twice during each stimulus block, the same image was presented two times in succession. Participants were instructed to detect these immediate repetitions and report them with a button press (1-back task). Each participant was scanned on two versions of this localizer, counterbalancing for the order of blocks. Within each scan, the assignment of blocks to conditions was also counterbalanced so that the mean serial position in the scan of each condition was matched.
In the LO localizer scan (Malach et al., 1995), images of everyday objects and scrambled versions of the same images were presented. Image scrambling was performed by dividing each image into 20 × 20 blocks, and scrambling the arrangement of blocks randomly, separately within the center, middle ring, and outer ring of the image. There were eight blocks of each condition. Participants again performed a “1-back” repetition detection task.
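The ring-wise scrambling procedure can be sketched as follows. This is only an illustration under stated assumptions: "20 × 20 blocks" is read as a 20-by-20 grid of blocks, ring membership is assigned by a block's normalized distance from the grid center, and the ring cut-offs are hypothetical values not given in the text.

```python
import random

def ring_index(row, col, n_blocks, bounds=(0.33, 0.66)):
    """Assign a block to ring 0 (center), 1 (middle), or 2 (outer).

    Assumption: rings are defined by the normalized Chebyshev distance of
    the block from the grid center; the cut-offs in `bounds` are hypothetical.
    """
    center = (n_blocks - 1) / 2
    dist = max(abs(row - center), abs(col - center)) / center
    if dist <= bounds[0]:
        return 0
    if dist <= bounds[1]:
        return 1
    return 2

def scramble_blocks(grid, rng):
    """Return a copy of `grid` with blocks permuted within each ring only."""
    n = len(grid)
    out = [row[:] for row in grid]
    for ring in range(3):
        positions = [(r, c) for r in range(n) for c in range(n)
                     if ring_index(r, c, n) == ring]
        blocks = [grid[r][c] for r, c in positions]
        rng.shuffle(blocks)
        for (r, c), block in zip(positions, blocks):
            out[r][c] = block
    return out

# Each cell stands in for one image block (here just its original position).
n_blocks = 20
grid = [[(r, c) for c in range(n_blocks)] for r in range(n_blocks)]
scrambled = scramble_blocks(grid, random.Random(0))
```

Shuffling within rings keeps each block at roughly its original eccentricity while destroying object form.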
The localizer scan for area hMT+ consisted of low-contrast concentric rings that either slowly oscillated inwards and outwards, or, in separate blocks, remained static (Tootell et al., 1995). As for the LO localizer, each condition was presented for eight 16-sec blocks. The stimuli in this experiment were passively viewed.
In order to localize the biological motion-sensitive region of the right posterior STS, point-light animations of whole-body actions and phase-scrambled versions of the same animations were presented (Shipley & Brumberg, n.d.). Each point-light animation lasted 1 sec (30 frames/sec), followed by a 1-sec fixation. The 12 dots defining each point-light stimulus were white against a black background. Scrambled motion sequences were produced from the exact same motion vectors found in the biological animations, but the starting position of each dot was randomized and controlled so as to keep the density of the stimulus constant and comparable to the original biological motion stimuli in terms of local motion. Participants performed a “1-back” repetition detection task.
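The core of the scrambling manipulation, keeping each dot's motion vectors while randomizing its starting position, can be sketched as below. The function names, the toy trajectories, and the `extent` of the display are hypothetical, and the sketch does not implement the additional control of stimulus density described above.

```python
import random

def scramble_start_positions(frames, rng, extent=1.0):
    """Relocate each dot's trajectory to a random starting point.

    `frames` is a list of frames; each frame is a list of (x, y) dot positions.
    The frame-to-frame motion vectors of every dot are left unchanged.
    """
    first = frames[0]
    starts = [(rng.uniform(-extent, extent), rng.uniform(-extent, extent))
              for _ in first]
    scrambled = []
    for frame in frames:
        scrambled.append([
            (sx + (x - x0), sy + (y - y0))
            for (x, y), (x0, y0), (sx, sy) in zip(frame, first, starts)
        ])
    return scrambled

def motion_vectors(frames):
    """Per-dot displacement between consecutive frames."""
    return [
        [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(a, b)]
        for a, b in zip(frames, frames[1:])
    ]

# Toy animation: 12 dots, 30 frames (1 sec at 30 frames/sec), made-up paths.
frames = [[(0.1 * d + 0.01 * t, 0.05 * d - 0.02 * t) for d in range(12)]
          for t in range(30)]
scrambled = scramble_start_positions(frames, random.Random(1))
```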
In the main experiment, participants watched short movies (2 sec, 30 frames/sec) of actors performing recognizable whole-body actions (the faces/heads of the actors were visible). The stimuli were created by Kable and Chatterjee (2006) and used with the authors' permission. The set of movies comprised 48 actions and 48 actors, producing a total of 288 movies. These were divided into nine sets of 32 movies. These subsets of movies consisted of 16 actions, each performed by two actors. The actors wore the same clothes across the two actions (see Appendix for an example action and a list of all actions presented). (For more information on the stimulus set, example frames from each action, and counterbalancing procedures, see Kable & Chatterjee, 2006.)
Each participant was scanned on four scans of the main experiment. In the first and second scans, participants were pre-exposed to a set of 28 movies, randomly picked from a possible set of 32. In addition, three other critical movies were presented: a target (a specific combination of actor with action) and two foils. One of the foils showed the same action as the target movie, but performed by a different person; the other was a movie of the same person performing a different action. These were also randomly picked for each participant. Movies were presented every 4 sec. Across the first and second scans, each of the 28 movies was presented six times. In addition, there were 24 target and 24 foil trials (12 of each foil).
In the critical third and fourth scans, each participant was presented with four sets comprising 28 movies each: (1) the same set of movies presented in the first two scans (“Old Person, Old Action”); (2) a set of movies showing the same people as in the first two scans, but performing different actions (“Old Person, New Action”); (3) a set of movies showing new people performing the same actions seen in scans one and two (“New Person, Old Action”); and (4) a set of movies in which both the people and the actions performed were new (“New Person, New Action”). Participants were shown a total of 112 different movies; each of these movies was shown only once across the two scans. In addition, there were 28 target, 28 foil (14 of each), and 28 fixation trials. In total there were seven trial types and the order of presentation was n − 1 counterbalanced. The nine sets of 32 movies were counterbalanced so that each set was used twice in each of the four conditions (18 participants). As in Scans 1 and 2, movies were presented every 4 sec.
Participants were shown their target movie before going in the scanner. They were instructed that their task was to press a button whenever a movie showing that specific actor/action combination was shown. This task ensured that participants paid attention to the identity of the person performing the action as well as the action being performed. The movies were shown in gray scale as opposed to full color in order to reduce the presence of perceptual features confounded with actor identity that might, upon repetition, elicit adaptation in visual areas.
The data were acquired using a 3-T Philips MRI scanner with a SENSE phased-array head coil. For functional imaging, a single-shot echo-planar imaging sequence was used (T2*-weighted, gradient-echo sequence; echo time, 35 msec; flip angle, 90°). The scanning parameters were as follows: repetition time 2000 msec; 25 off-axial slices; voxel dimensions 2 × 2 mm; 3 mm slice thickness, FOV 192 (LR) × 224 (AP), matrix 96 × 112, phase encoding direction A–P. The slices were positioned to include, bilaterally, the temporal and occipital lobes including the fusiform gyrus. The inferior frontal gyrus, the supramarginal gyrus, the angular gyrus, as well as most of the superior parietal lobule, were also covered. Seven dummy scans were acquired before each functional scan to reduce possible effects of T1 saturation. Parameters for T1-weighted anatomical scans were: 288 × 232 matrix; 1 mm isotropic voxels; TR = 8.4 msec, TE = 3.8 msec; flip angle = 8°.
Preprocessing and statistical analyses of the MRI data were performed using BrainVoyager QX 1.9 (Brain Innovation, Maastricht, The Netherlands). Functional data were motion corrected, and low-frequency drifts were removed with a temporal high-pass filter (0.006 Hz). No spatial or temporal smoothing was performed. Functional data were manually coregistered with the anatomical T1 scans. The three-dimensional anatomical scans were transformed into Talairach and Tournoux (1988) space, and the parameters from this transformation were subsequently applied to the coregistered functional data.
For each participant, general linear models (GLMs) were created. One boxcar predictor, convolved with a two-gamma hemodynamic response function (HRF), modeled each condition of interest. Regressors of no interest were also included to account for differences in the mean MR signal across scans. Regressors were fitted to the MR time series in each voxel and the resulting beta parameter estimates were used to estimate the magnitude of response to each experimental condition.
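A minimal sketch of how such a predictor could be built and fitted is given below. The two-gamma shape parameters, the undershoot ratio, the onsets, and the synthetic signal are assumptions chosen for illustration (they are commonly used defaults, not values reported here); only the 2-sec repetition time and 2-sec movie duration come from the text.

```python
import math

def two_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=6.0):
    """Difference of two gamma densities (unit scale) at time t in seconds.

    Shape parameters and the response/undershoot ratio are assumed defaults,
    not values reported in the text.
    """
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - under / ratio

def make_predictor(onsets, duration, n_vols, tr=2.0, dt=0.1, hrf_len=32.0):
    """Convolve a boxcar with the HRF on a fine grid, then sample every TR."""
    n_fine = int(round(n_vols * tr / dt))
    boxcar = [0.0] * n_fine
    for onset in onsets:
        start = int(round(onset / dt))
        stop = int(round((onset + duration) / dt))
        for i in range(start, min(stop, n_fine)):
            boxcar[i] = 1.0
    kernel = [two_gamma_hrf(k * dt) * dt for k in range(int(round(hrf_len / dt)))]
    step = int(round(tr / dt))
    predictor = []
    for v in range(n_vols):
        i = v * step
        acc = 0.0
        for k, h in enumerate(kernel):
            if i - k < 0:
                break
            acc += boxcar[i - k] * h
        predictor.append(acc)
    return predictor

def fit_beta(predictor, signal):
    """Least-squares beta for one predictor plus a constant term."""
    n = len(predictor)
    mp = sum(predictor) / n
    ms = sum(signal) / n
    num = sum((p - mp) * (s - ms) for p, s in zip(predictor, signal))
    den = sum((p - mp) ** 2 for p in predictor)
    return num / den

# One condition with 2-sec movies at hypothetical onsets (in seconds).
pred = make_predictor(onsets=[4.0, 20.0, 36.0], duration=2.0, n_vols=30)
signal = [100.0 + 1.5 * p for p in pred]   # noiseless synthetic voxel
beta = fit_beta(pred, signal)              # recovers the simulated amplitude
```

A full model would contain one such predictor per condition plus the regressors of no interest, fitted jointly rather than one at a time.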
In each participant, EBA and FBA were defined by contrasting the response to human bodies with that to images of chairs. The FFA was defined by contrasting faces to chairs. The PPA was defined by contrasting scenes to faces. We localized LO by contrasting the response to intact objects with that to scrambled objects. Area hMT+ was identified by contrasting the response to moving concentric rings with that to static rings. Finally, a posterior STS ROI was defined by comparing the response to intact point-light animations with that to phase-scrambled versions of the same stimuli.
For each ROI in each participant, the most significantly activated voxel was identified within a restricted part of cortex on the basis of previously reported anatomical locations (EBA/FBA: Peelen & Downing, 2005; FFA: Kanwisher et al., 1997; PPA: Epstein & Kanwisher, 1998; hMT+: Dumoulin et al., 2000; pSTS: Grossman et al., 2000). For LO, we focused on the subregion in the posterior part of the inferior temporal sulcus (not the more anterior focus in the fusiform gyrus; see Kourtzi, Erb, Grodd, & Buelthoff, 2003). ROIs were defined as the set of contiguous voxels that were significantly activated (all p < .0001, uncorrected, except pSTS: p < .005) within a 12-mm cube surrounding the peak voxel. Within each ROI in each subject, a further GLM was then applied, modeling the response of the ROI voxels to the four conditions of the final two runs of the main experiment (Old Person, Old Action; Old Person, New Action; New Person, Old Action; New Person, New Action). The beta weights from this GLM provide the basis for the ROI analysis described below.
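The ROI definition step, growing a contiguous set of significant voxels from a peak within a bounded neighborhood, can be sketched as follows. The data structure, the use of face (6-neighbor) connectivity, and the toy statistic values are assumptions of this illustration rather than details of the analysis software.

```python
from collections import deque

def define_roi(stat_map, peak, threshold, half_width):
    """Grow an ROI from `peak` over face-connected supra-threshold voxels.

    `stat_map` maps (x, y, z) voxel indices to a statistic (e.g., a t value).
    Growth is limited to a cube of +/- `half_width` voxels around the peak.
    Face (6-neighbor) connectivity is an assumption of this sketch.
    """
    if stat_map.get(peak, float("-inf")) < threshold:
        return set()
    roi = {peak}
    queue = deque([peak])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in steps:
            nb = (x + dx, y + dy, z + dz)
            inside = all(abs(a - b) <= half_width for a, b in zip(nb, peak))
            above = stat_map.get(nb, float("-inf")) >= threshold
            if inside and above and nb not in roi:
                roi.add(nb)
                queue.append(nb)
    return roi

# Toy statistic map: a supra-threshold line along x plus one isolated voxel.
stat_map = {(x, 0, 0): 5.0 for x in range(6)}
stat_map[(0, 3, 0)] = 6.0   # significant, inside the cube, but not contiguous
roi = define_roi(stat_map, peak=(0, 0, 0), threshold=4.0, half_width=3)
```

Here the isolated voxel is excluded because it is not connected to the peak, and voxels beyond the cube are excluded even though they are contiguous.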
Voxelwise Correlation Analyses
For each subject, we created an additional GLM that contained both runs of the four-condition (bodies, faces, chairs, and scenes) localizer, the LO localizer, and the hMT+ localizer. We used this GLM to contrast the response to the average of (human bodies, intact objects, and moving rings) versus the average of (chairs, scrambled objects, and static rings). This allowed us to define, in each subject, general, unbiased left and right hemisphere lateral occipito-temporal activations that are responsive to bodies, intact objects, and motion, but that have not been parcelled into discrete functional areas. We identified the most significantly activated voxel of this activation in each subject and hemisphere and defined a “general ROI” as all voxels around this peak that were significantly activated (p < .005, uncorrected) in the contrast. We restricted the region to a 30 × 30 × 30 mm cube around the peak voxel.
We performed a whole-brain, random effects analysis for each of the three adaptation effects (repetition adaptation, action adaptation, identity adaptation). The uncorrected voxelwise threshold was set at p < .01. Using the cluster-size threshold plug-in for BrainVoyager QX, Monte Carlo simulations showed that for cluster size > 7 acquired voxels, the effective corrected threshold is p < .05.
During scanning, the participants performed a target detection task. The response data suggest that 17 out of the 18 subjects were reliably performing the task. Out of a possible 28 targets, these participants, on average, detected 27. The average false alarm rate was 0. The response data from the remaining participant suggest that she or he had not understood the task properly and was attending only to the identity of the actor rather than the actor/action conjunction (this participant detected 28/28 targets, but also responded to all 14 foils that contained the same actor as the target). However, this participant's fMRI data closely match the average pattern in all ROIs, and excluding these data does not significantly change the overall results. We therefore present the data from all participants in the following analyses.
In all participants, we were able to define, in both hemispheres, EBA, FFA, LO, and PPA. In 17 out of 18 subjects, we were also able to define hMT+ bilaterally. Furthermore, we defined FBA (17/18) and pSTS (14/18) in the right hemisphere only. The average Talairach coordinates (with SEM) of the ROI peak voxels were: right EBA: 46.6 (1.1), −67 (1.3), −1 (1.4); left EBA: −47.8 (1), −68.2 (1.6), 2.6 (1.5); right FFA: 37.6 (0.7), −47.8 (1.9), −16 (0.9); left FFA: −38.2 (0.7), −46.2 (1.8), −18.2 (1.2); right LO: 43.6 (1.5), −69.8 (1.2), −8.8 (1.5); left LO: −45 (1), −69.8 (1.6), −6.5 (1.6); right PPA: 25 (0.7), −44.9 (1.4), −9.9 (0.7); left PPA: −24.2 (0.8), −46 (1.2), −10.2 (0.9); right hMT+: 43.6 (1.2), −64.7 (1.2), 0.1 (1.3); left hMT+: −43.9 (0.9), −67.5 (1.2), −0.4 (1.2); right FBA: 39.8 (0.9), −47.2 (1.4), −17.2 (1.1); right pSTS: 48.6 (1.2), −43.8 (2.6), 13.8 (2). Subsequent analyses on these ROIs and their responses to the four primary conditions of the main experiment were performed to address a series of questions, as enumerated below.
1. Is there an effect of hemisphere within each ROI?
For each bilateral ROI (EBA, LO, hMT+, FFA, PPA), we tested for an effect of hemisphere on adaptation condition. The Hemisphere × Condition interaction was not significant in any ROI [LO: F(3, 51) = 2.5, p > .05; hMT+: F(3, 48) = 1.08, p > .05; FFA: F(3, 51) = 2.57, p > .05; PPA: F(3, 51) = 0.17, p > .05] except EBA [F(3, 51) = 3.08, p = .04]. Further inspection of EBA revealed that left and right EBA mainly differed in the magnitude of some effects² rather than the qualitative pattern. As such, in the following analyses, we collapse across hemisphere in all ROIs, except in pSTS and FBA, as these were localized in the right hemisphere only. Average parameter estimates for each ROI are shown in Figure 1; corresponding time courses of percent signal change relative to fixation are shown in Figure 2.
2. Which areas show an effect of adaptation condition?
We tested for a general effect of adaptation condition in each ROI, with a one-way ANOVA contrasting the responses to the four key conditions. The main effect of condition was highly significant in each ROI [EBA: F(3, 51) = 11.12, p < .001; LO: F(3, 51) = 7.73, p < .001; hMT+: F(3, 48) = 7.41, p < .001; FFA: F(3, 51) = 8.76, p < .001; FBA: F(3, 48) = 14.39, p < .001; pSTS: F(3, 39) = 3.61, p < .05; PPA: F(3, 51) = 2.86, p < .05]. Analyzing across all ROIs, we found a main effect of condition [F(3, 36) = 11.22, p < .001], but no ROI × Condition interaction [F(18, 216) = 1.08, p > .05].
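The degrees of freedom reported for the per-ROI tests follow from the number of participants in whom each ROI could be defined. As a brief worked check, assuming a standard repeated measures ANOVA with k conditions, n participants, and no correction applied to the degrees of freedom (none is mentioned in the text):

```latex
\[
df_{\text{condition}} = k - 1, \qquad df_{\text{error}} = (k - 1)(n - 1)
\]
\[
k = 4:\quad n = 18 \Rightarrow F(3, 51), \qquad n = 17 \Rightarrow F(3, 48), \qquad n = 14 \Rightarrow F(3, 39)
\]
```

These correspond to ROIs defined in all 18 participants, in 17 participants (hMT+, FBA), and in 14 participants (pSTS), respectively.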
3. Do all areas show the basic priming effect?
To test whether all areas show a basic repetition priming effect (i.e., [New Person, New Action] > [Old Person, Old Action]), t tests were carried out for each ROI individually. The adaptation effect was significant in EBA [t(17) = 7.31, p < .001], LO [t(17) = 5.8, p < .001], hMT+ [t(16) = 5.04, p < .001], FFA [t(17) = 4.77, p < .001], FBA [t(16) = 4.51, p < .001], and pSTS [t(13) = 2.66, p < .05]. The priming effect approached significance in PPA [t(17) = 2.01, p = .06].
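Each of these comparisons is a paired test on per-participant beta estimates from two conditions within one ROI. A minimal sketch of the computation is shown below; the beta values are invented for five hypothetical participants and serve only to illustrate the arithmetic.

```python
import math

def paired_t(a, b):
    """Paired-samples t statistic and degrees of freedom for the contrast a - b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        raise ValueError("t is undefined when all differences are identical")
    return mean / math.sqrt(var / n), n - 1

# Hypothetical mean ROI betas, one value per participant and condition.
new_person_new_action = [1.2, 0.9, 1.5, 1.1, 1.3]
old_person_old_action = [0.8, 0.7, 1.0, 0.9, 0.8]

# Repetition priming contrast: [New Person, New Action] > [Old Person, Old Action].
t_value, df = paired_t(new_person_new_action, old_person_old_action)
```

The same function applies to the action and identity contrasts by substituting the relevant pair of conditions.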
4. Are there areas that show action priming?
We tested whether any of the ROIs showed action priming across a change in identity ([New Person, New Action] > [New Person, Old Action]). There was significant action priming in EBA [t(17) = 2.58, p < .05], FFA [t(17) = 2.83, p < .05], FBA [t(16) = 2.76, p < .05], pSTS [t(13) = 2.54, p < .05], and hMT+ [t(16) = 2.11, p = .05]. The effect was not significant in the remaining ROIs: LO [t(17) = 1.58, p = .13] and PPA [t(17) = 1.69, p = .11].
5. Is action priming focal?
We then tested, for two broad regions of extrastriate cortex—lateral and ventral occipito-temporal cortex—whether the action priming effects revealed in the previous analyses were significantly different among ROIs. In lateral occipito-temporal cortex, a 3 (ROI: EBA, LO, hMT+) × 2 (Condition: [New Person, New Action]; [New Person, Old Action]) repeated measures ANOVA showed significant main effects of ROI [F(2, 32) = 10.66, p < .001] and condition [F(1, 16) = 5.11, p < .05], but these variables did not interact significantly [F(2, 32) = 0.96, p > .05]. Hence, there is no evidence from this analysis for an action-priming effect specific to one of the lateral occipito-temporal ROIs. In ventral occipito-temporal cortex, a 3 (ROI: FFA, FBA, PPA) × 2 (Condition) ANOVA showed the same pattern: There were significant main effects of ROI [F(2, 32) = 74.03, p < .001] and Condition [F(1, 16) = 11.34, p < .05], but the interaction was not significant [F(2, 32) = 1.77, p > .05].
6. Are there any areas that show identity priming?
To test whether any of the ROIs showed identity priming, across changes in the actions performed, we tested for regions that showed ([New Person, New Action] > [Old Person, New Action]). There was no evidence for identity priming in any of our ROIs: EBA [t(17) = 1.76, p > .05], FFA [t(17) = 1.22, p > .05], LO [t(17) = 0.41, p > .05], hMT+ [t(16) = 1.14, p > .05], PPA [t(17) = −0.51, p > .05], FBA [t(16) = −0.37, p > .05], pSTS [t(13) = 0.27, p > .05].
7. Are these results due to overlapping ROIs?
EBA, hMT+, and LO occupy neighboring and often overlapping regions in occipito-temporal cortex (Downing et al., 2007), whereas in the fusiform gyrus, FBA and FFA are often found to spatially overlap (Peelen & Downing, 2005; Schwarzlose et al., 2005). It is therefore possible that the similar response patterns across ROIs in the above analyses are at least partly due to some of the same voxels contributing to multiple ROIs. To rule out this possibility, we redefined our ROIs to only include nonshared voxels (cf. Schwarzlose et al., 2005). These are referred to below as EBA*, FBA*, FFA*, LO*, and hMT+*. We then tested for the response of the nonoverlapping ROIs to the four conditions in the main experiment.
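The exclusion of shared voxels can be illustrated with a short sketch. It assumes that each ROI is available as a set of voxel coordinates for one participant and hemisphere, and it removes a voxel from every ROI in a group if the voxel belongs to more than one of them. The function name, the grouping of ROIs, and the coordinates are hypothetical.

```python
def remove_shared_voxels(rois):
    """Return ROIs restricted to voxels that belong to no other ROI.

    rois: dict mapping ROI name -> set of (x, y, z) voxel coordinates.
    The returned names carry a trailing asterisk, as in EBA*.
    """
    exclusive = {}
    for name, voxels in rois.items():
        # Union of the voxels of every other ROI in the group.
        others = set().union(
            *(v for other, v in rois.items() if other != name)
        )
        exclusive[name + "*"] = voxels - others
    return exclusive

# Hypothetical lateral occipito-temporal ROIs for one participant.
lateral = {
    "EBA": {(1, 1, 1), (1, 2, 1), (2, 2, 1)},
    "LO": {(2, 2, 1), (3, 2, 1)},
    "hMT+": {(1, 2, 1), (0, 0, 0)},
}
nonshared = remove_shared_voxels(lateral)
print(nonshared)
```

An ROI that shares no voxels with the others in its group is returned unchanged, apart from the asterisk in its name.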
There was overlap between ROIs in approximately 50% of cases: In the right hemisphere, there was overlap between EBA and LO or hMT+ in 11 subjects, LO overlapped EBA or hMT+ in 6 subjects, and hMT+ overlapped EBA or LO in 11 subjects. In the left hemisphere, EBA overlapped LO or hMT+ in 9 subjects, LO overlapped EBA or hMT+ in 11 subjects, and hMT+ overlapped with EBA or LO in 11 subjects. In nine subjects, right FBA and right FFA overlapped.
We repeated the analysis steps listed above for EBA*, LO*, hMT+*, FBA*, and FFA*. Removing any overlap between ROIs did not appear to meaningfully change the pattern of results (see Figure 3). The only differing result was revealed in Analysis 5 (“Is action priming focal?”). Although in the lateral ROIs (EBA, LO, hMT+) the Condition × ROI interaction remained nonsignificant [F(2, 32) = 0.87, p > .05], the interaction reached significance [F(2, 32) = 5.21, p < .05] in the more ventral ROIs (FBA, FFA, PPA). This was due to the action adaptation effect being slightly more pronounced in FBA* compared to FBA (New Person, New Action minus New Person, Old Action: FBA*, 1.33; FBA, 1.04), while the effect size remained the same in FFA*. PPA remained the same as no voxels were removed in the overlap analysis. Furthermore, the identity priming effect in Analysis 6 (“Are there any areas that show identity priming?”) approached significance in EBA* [t(13) = 1.99, p = .06], although it remained nonsignificant in all other ROIs. Average parameter estimates (collapsed across hemisphere) for the ROI* analysis are presented in Figure 3.
8. Are priming effects related to body selectivity on a voxel-by-voxel basis?
The previous analyses revealed an action priming effect in lateral occipito-temporal cortex that did not differ significantly among the ROIs we tested (hMT+, EBA, and LO), whether these overlapped or not. One possibility is that because these ROIs occupy neighboring cortex, they contain a mixture of populations responding to visual motion, object form, and body parts. In this sense, although the ROIs are defined functionally and within individual participants, they are unlikely to purely capture a single selective population. This is likely to be the case even in nonoverlapping ROIs (Downing et al., 2007).
Here we attempted to address this point by using an analysis of patterns of selectivity in this region. We reasoned that if, for example, body-selective representations are selectively primed by observed actions, then there should be a direct relationship between the strength of body selectivity and the strength of the priming effect, across the voxels within a given ROI. That is, across voxels, body selectivity, but not motion or object-form selectivity, should be positively related to the strength of the priming effect (see also Downing et al., 2007; Peelen & Downing, 2007b; Peelen et al., 2006).
To test this, we calculated, for each participant individually, the selectivity of the response to several different contrasts in an unbiased lateral occipito-temporal ROI. Each voxel in each region was characterized with a t value describing its selectivity for: bodies (bodies vs. chairs); faces (faces vs. chairs); motion (moving vs. static rings); object form (intact vs. scrambled objects); and biological motion (intact vs. scrambled point-light animations). For the same voxels, we similarly calculated the size of the action priming effect ([New Person, New Action] − [New Person, Old Action]) from the main experiment. Using these data, we performed a regression analysis for each participant in each ROI separately. Across voxels, the regression model attempted to explain the variance in the action priming effect with a linear combination of the voxelwise selectivities for bodies, faces, motion, object form, and biological motion. This approach has the advantage that it tests whether any one kind of selectivity significantly predicts the priming effect, above and beyond the contributions of the other kinds of selectivity. To test this statistically, we collected beta values from each of the predictors for each Subject × ROI combination. For each ROI, we then tested first whether there were significant differences among the betas, and second whether any beta was significantly different from zero. Finally, this analysis was repeated for the identity priming effect.
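The first stage of this analysis (one regression per participant and ROI, across voxels) can be sketched as an ordinary least-squares fit. The sketch assumes that an intercept is included and that the predictors are entered unstandardized; neither detail is stated in the text. The function name and the toy data are hypothetical, and only two predictors are used for brevity, although the same function accepts all five.

```python
def ols_betas(predictors, response):
    """Least-squares betas (intercept first) for response ~ predictors.

    predictors: one row per voxel, one selectivity value per predictor.
    response: one priming-effect value per voxel.
    """
    rows = [[1.0] + list(r) for r in predictors]  # prepend intercept column
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, response))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Toy voxels: (body selectivity, motion selectivity), illustration only.
selectivity = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
# Toy priming effect built as 0.5 + 2.0 * body - 1.0 * motion.
priming = [0.5, 2.5, -0.5, 1.5, 3.5]
betas = ols_betas(selectivity, priming)
print(betas)
```

In the second stage, the betas collected for each Subject × ROI combination would be compared across predictors and tested against zero across participants.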
The mean betas for each of the predictors in each of the ROIs for both types of adaptation effects are shown in Figure 4. In the right hemisphere lateral occipito-temporal ROI, there was a significant difference among the five predictors in their relationship to the action priming effect [F(4, 68) = 2.84, p < .05], but not in its left homologue [F(4, 68) = 1.58, p > .05], nor in the right fusiform [F(4, 68) = 0.90, p > .05]. t tests followed up the significant results in the right occipito-temporal ROI to determine whether any of the betas individually was significantly greater than zero. This held true only for the beta weight on body selectivity [t(17) = 2.91, p < .01].
In the analysis of identity priming ([New Person, New Action] − [Old Person, New Action]), there was no significant difference among the five predictors in their relationship to the identity priming effect in any ROI [all F(4, 68) < 1.5, all p > .05]. Body selectivity was the only predictor significantly different from zero in the right occipito-temporal ROI [t(17) = 2.28, p < .05]. None of the predictors were significantly above zero in the other two regions.
Across the three ROIs, we determined the mean correlation across participants for each unique pair of regression predictors. These tended to be low: left occipito-temporal, mean = −0.02, min = −0.32, max = 0.41; right occipito-temporal, mean = 0.04, min = −0.18, max = 0.27; right fusiform, mean = 0.13, min = −0.02, max = 0.36. Thus, the analyses in this section were unlikely to be compromised by strong correlations among the regressors.
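The collinearity check described above can be sketched as follows, assuming Pearson correlations computed across voxels within each participant, averaged over participants for each unique pair of predictors, and then summarized by the mean, minimum, and maximum over pairs. The type of correlation is an assumption, and the function names and toy values are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def summarize_pairs(participants):
    """Mean correlation per predictor pair, plus (mean, min, max) over pairs.

    participants: list of dicts mapping predictor name -> voxelwise values.
    """
    names = sorted(participants[0])
    mean_r = {}
    for a, b in combinations(names, 2):
        rs = [pearson(p[a], p[b]) for p in participants]
        mean_r[(a, b)] = sum(rs) / len(rs)
    values = list(mean_r.values())
    return mean_r, (sum(values) / len(values), min(values), max(values))

# One toy participant with three predictors (illustration only).
toy = [{
    "bodies": [1.0, 2.0, 3.0, 4.0],
    "motion": [2.0, 4.0, 6.0, 8.0],
    "faces": [4.0, 3.0, 2.0, 1.0],
}]
mean_r, (pair_mean, pair_min, pair_max) = summarize_pairs(toy)
print(mean_r, pair_mean, pair_min, pair_max)
```

With five predictors there are ten unique pairs per ROI, so each reported mean, minimum, and maximum would summarize ten averaged correlations.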
In addition to the ROI analysis, we performed group-average whole-brain, random effects analyses to test for adaptation. Figure 5 shows the results of three separate contrasts: repetition adaptation (New Person, New Action > Old Person, Old Action), action adaptation (New Person, New Action > New Person, Old Action), and identity adaptation (New Person, New Action > Old Person, New Action). The activation maps are thresholded at p < .01 (uncorrected). Activations significant at a cluster size threshold of p < .05 (corresponding to 7 voxels) are further listed in Table 1. We observed significant clusters of repetition and action adaptation in a number of regions. However, none of the activations from the identity adaptation contrast survived cluster thresholding.
|Region|Cluster size (voxels)|x|y|z|Peak t|
|Repetition Adaptation (New Person, New Action > Old Person, Old Action)|
|R Inferior frontal|8|48|23|13|5.17|
|L Lateral occipital|45|−45|−70|1|6.12|
|Action Adaptation (New Person, New Action > New Person, Old Action)|
|R Middle temporal|23|45|−52|10|4.28|
|R Lateral occipital|7|51|−64|4|4.12|
Each row gives the location of the cluster, the cluster size (number of acquired voxels [2 × 2 × 3 mm³]), the location of the peak voxel of that activation in Talairach coordinates (x, y, z), and the maximum t value for the region.
The aims of the current study were twofold: firstly, to replicate Kable and Chatterjee's (2006) study using a task that required participants to attend to the identity of the actor as well as the action, and secondly, to test a novel application of multivariate pattern analysis methods by asking whether MVPA can help disentangle the source of neural adaptation effects measured by BOLD. We found that using a different task did not grossly alter the pattern of results; in line with Kable and Chatterjee's result, we found adaptation to actions (across identities) but no evidence for adaptation to identities (across actions). However, these analyses do not pin action adaptation to specific functional regions, as we found similar effects across many functional ROIs. MVPA revealed preliminary evidence that adaptation effects in lateral occipital cortex are driven by body-selective rather than motion-, face-, or object-selective neurons. Below we outline the main results of the current study and discuss them in relation to Kable and Chatterjee's findings and the broader relevant literature.
Action-specific Priming: Functional ROIs
In our measures of action-specific adaptation, we found significant (in EBA, FBA, FFA, pSTS, and hMT+) or marginally significant (in LO and PPA) action adaptation across all ROIs. Kable and Chatterjee (2006) report significant action adaptation only in EBA, MT/MST, and pSTS; however, there was also a trend toward action adaptation in LOC and PPA. There was no evidence for significant identity-specific adaptation in any ROI in either study. The differences between studies in terms of which ROIs showed significant basic or action-specific adaptation effects when considered individually are most likely due to a difference in power: We scanned 18 participants, whereas there were nine participants in Kable and Chatterjee's study.
However, on the basis of comparisons between ROIs, our results suggest that the action-specific adaptation is less focal than suggested by Kable and Chatterjee (2006), who concluded that pSTS, MT/MST, and EBA generalize actions across identities and distinguish between different actions. In the current study, action-specific adaptation was not only found in body- and motion-selective cortical areas but also in other regions that would not be expected to be involved in processing of human actions (such as FFA, PPA, and LO). Across all the ROIs we tested, there was no significant interaction between ROI and adaptation condition. Furthermore, when testing separately the ROIs in lateral occipital cortex (EBA, hMT+, LO) and the more ventromedial ROIs (FFA, FBA, PPA), the ROI × Condition interactions were nonsignificant in both cases. Thus, the reduced response to repeated actions is not only present within body- and motion-selective ROIs, but appears to be a very generalized effect that is evident throughout ventral cortical regions. We would therefore argue, contrary to Kable and Chatterjee, that neither our results, nor theirs, provide strong evidence for a region-specific involvement of EBA (or hMT+ and pSTS) in action perception.
We made two additional attempts to further refine our measures of whether action-specific priming could be attributed to specific underlying neuronal populations. First, we analyzed functional regions in which voxels that were shared between localizers (at the predetermined threshold) were excluded from the analyses. At high resolution, this approach has been used to segregate body- from face-selective voxels in the posterior fusiform gyrus (Schwarzlose et al., 2005). In the present study, however, although we gained some increased precision in the fusiform gyrus (increased specificity of the action-priming effect to FBA as opposed to FFA), these analyses did not change the overall picture.
Second, we attempted to capitalize on voxel-by-voxel variation in selectivity (e.g., for bodies, objects, and motion) in order to attribute the action priming effect to a specific population (cf. Downing et al., 2007; Peelen & Downing, 2007b). Here, the results were noisy, but we did find that in the right lateral occipito-temporal region only body selectivity (and not other kinds of selectivity) had a significant predictive relationship with the action priming effect, across voxels and participants. However, there was a similar weighting on body selectivity in the analysis of identity priming, which was not significant in any ROI in the univariate analyses. Thus, although this approach has been used successfully to distinguish areas on the basis of gross selectivity (Downing et al., 2007), it may lack sufficient sensitivity to disentangle the source of subtle adaptation effects.
Identity-specific Adaptation: Functional ROIs
Although we found evidence for widespread action adaptation, we did not find any evidence for adaptation to the identity of the actor in our ROI analyses. One possible explanation is that the actor and action changes varied in the amount of perceptual change at a low level. Over the duration of the 2-sec movie, there is likely to be more low-level feature variation when the action changes (e.g., the same person walking and doing a star jump) than when the identity changes (two different people walking). However, we think it is unlikely that the adaptation effects found are purely perceptual. The current study (and Kable & Chatterjee) used an event-related design with a pre-adaptation block (for a recent review of the different experimental designs used in fMRI adaptation, see Weigelt, Muckli, & Kohler, 2008). With this design the repetition lag between the presentation of the initial stimulus and the repeated presentation is typically many seconds or minutes (Bunzeck, Schutze, & Duzel, 2006; Simons, Koutstaal, Prince, Wagner, & Schacter, 2003; Buckner et al., 1998), or even days (van Turennout, Bielamowicz, & Martin, 2003; van Turennout, Ellmore, & Martin, 2000). In the present study, this lag was on the order of 3 to 15 min. Reviewing the evidence from studies using such long-lag adaptation designs, Weigelt et al. (2008) conclude that these repetition effects are more likely to be cognitive rather than perceptual in that they reflect aspects of semantic processing rather than purely stimulus-driven adaptation. Furthermore, our whole-brain analysis did not appear to reveal adaptation effects in early visual areas, although this is not conclusive as we did not localize these areas separately.
Action- and Identity-specific Adaptation: Whole-brain Analyses
In the whole-brain analyses, we detected repetition adaptation in a number of occipital and temporal brain regions. Although we also detected action-specific adaptation in three regions corresponding approximately to EBA/hMT+, pSTS, and FBA/FFA, there were no regions showing significant identity adaptation at the threshold used. Kable and Chatterjee (2006) reported more widespread repetition adaptation as well as action adaptation, including activations in medial and ventral frontal areas for both contrasts. Furthermore, the authors reported significant identity adaptation in ventromedial prefrontal cortex. In the current study, we did not find any evidence for significant adaptation effects in frontal regions.
It is interesting to note that Kable and Chatterjee (2006) discuss the possibility that the action adaptation effects in frontal regions were due to motor adaptation to the task response rather than adaptation to the observed action. Thus, the differing results in the whole-brain analyses across studies could be due to the different tasks used (discussed below). Furthermore, Kable and Chatterjee suggest that the ventromedial prefrontal identity activation may reflect conceptual representations of other individuals. Although all actors were unfamiliar to all participants in the current study, some of the actors were familiar to some participants in Kable and Chatterjee's study (J. Kable, personal communication, 15 February 2010). The issue of familiarity is further discussed below.
The task used by Kable and Chatterjee required participants to rate the action on each trial in terms of whether the action was one they performed often or not. This meant that the identity of the actor performing the action was task irrelevant and it is possible that the action-specific adaptation found was, in part, due to this attentional manipulation. There is ample evidence in the literature of top–down modulation of cortical responses by attention (Corbetta, Miezin, Dobmeyer, Shulman, & Petersen, 2000; Kanwisher & Wojciulik, 2000; Kastner & Ungerleider, 2000; Wojciulik & Kanwisher, 1999). Adaptation responses are also likely to be affected by attention (Yi & Chun, 2005; Murray & Wojciulik, 2004). In an fMRI study, Henson and Mouchlianitis (2007) recently looked at the effect of attention on face-selective responses in the FFA and place-selective responses in the PPA. Two face or house stimuli were presented, one in each hemifield, and participants were instructed to attend to the left or right stimulus. The authors found evidence for repetition suppression in these category-selective regions only when both the initial and the repeated stimuli were in the attended hemifield.
In light of this, we used a task that required attention to both the action and the identity of the person performing the action. However, our results suggest that this manipulation did not significantly change the pattern of results and, specifically, did not introduce identity priming effects. Thus, we can conclude that Kable and Chatterjee's results cannot be explained in terms of the participants attending to the actions only. One possibility is that via object-based attention mechanisms (e.g., O'Craven et al., 1999), attention to one feature of the depicted individuals (their actions) led automatically to attention to other features (including identity). In a 2-sec presentation, there would be sufficient time to extract the identity of the actor while also processing the action.
Given the above considerations, why do we (and Kable & Chatterjee, 2006) fail to find adaptation to actor identity in these regions? A number of recent studies of adaptation effects (generally on faces) have highlighted the importance of preexisting familiarity with the stimuli as a modulator of adaptation. For example, Ewbank and Andrews (2008) found adaptation to familiar faces across different viewpoints in the FFA; however, a release from adaptation across viewing angle was found for unfamiliar faces. Henson et al. (2000) presented familiar and unfamiliar faces (and symbols) and found that a region in the right fusiform gyrus showed a reduced response to the repetition of familiar stimuli but an enhanced response to the repetition of unfamiliar stimuli. These effects were further modulated by the lag between presentations as well as the number of repetitions: The response to the second presentation of a familiar stimulus increased with lag, whereas the response to an unfamiliar stimulus decreased with lag. Furthermore, the stimuli were presented five times throughout the scanning session; the response to familiar stimuli decreased across the five presentations, whereas the response to unfamiliar stimuli increased.
In the present study, the actions shown were highly familiar (e.g., walking, skipping, picking up an object, knocking on a door), whereas the actors performing the actions were unfamiliar to the participants. Thus, it is possible that the different patterns of action- and identity-specific adaptation in the ROIs tested here are due, at least in part, to familiar and unfamiliar stimuli evoking different patterns of repetition priming. (Note that some actors were familiar to some participants in Kable and Chatterjee's study and the authors report some evidence for identity priming, e.g., in the whole-brain analysis.) Interestingly, familiarity appears to selectively affect responses to static body stimuli: Hodzic et al. (2009) found that EBA showed no differential activation for familiar and unfamiliar bodies, but FBA responded more to familiar (self and familiar other) than to unfamiliar bodies. However, that study did not measure adaptation effects. Future adaptation studies looking at action and identity should attempt to match (or manipulate) the familiarity of the stimuli along both dimensions.
An additional possibility (raised by an anonymous reviewer) is that there are basic differences between action and identity dimensions of the stimuli used here that may account for the pattern of adaptation effects. The actions tested were inherently multidimensional, potentially tapping multiple perceptual representations, that is, body-selective areas may have adapted due to the repetition of body postures (cf. Downing et al., 2006), motion-selective areas due to the repetition of motion patterns, and object-selective areas due to the repetition of objects that were manipulated by the actors in some of the actions. In contrast, identity may be tied to a narrower range of cues having to do with the texture of skin and clothing and the gross outline shape of the actors, hence, tapping a potentially smaller and/or anatomically less focal representation which led to weaker neural activity and smaller adaptation effects. In this sense, it may be difficult to directly compare the effects of repeating actions and actors in the kinds of stimuli used here.
Finally, we note that replication studies are relatively rare in the neuroimaging literature (e.g., Kozel, Padgett, & George, 2004; Zarahn, Aguirre, & D'Esposito, 2000; Chawla, Buechel, et al., 1999). Careful replication and extension of existing published results can valuably clarify our interpretations of those findings and our understanding of basic questions in cognitive neuroscience. More specifically, where replications with higher statistical power are performed (as in the present case), it may be expected that effects previously understood to be focal are found to be more widespread, as small but reliable differences are detected in additional areas.
An example of one of the actions presented (jumping jack). The example shows a subset (18 out of 60) of frames from the full movie (Figure A1).
List of all 48 actions used in the experiment:
This work was supported by the BBSRC, the ESRC, and the Wales Institute of Cognitive Neuroscience. We thank J. Kable and A. Chatterjee for providing the stimuli and for helpful comments on an earlier draft of this manuscript, N. Oosterhof and J. Taylor for helpful discussions, and P. Mullins and S. Johnston for technical support.
Reprint requests should be sent to Dr. Alison J. Wiggett, School of Psychology, Bangor University, Brigantia Building, Bangor, Gwynedd LL57 2AS, UK, or via e-mail: email@example.com.
Note 1. We refer to this motion-selective region as hMT+ in the current study.
Note 2. Mean parameter estimates for (1) Old Actor, Old Action; (2) Old Actor, New Action; (3) New Actor, Old Action; and (4) New Actor, New Action: right EBA—(1) 8.22, (2) 9.79, (3) 9.45, (4) 10.3; left EBA—(1) 5.39, (2) 6.24, (3) 5.95, (4) 6.69.