Abstract
When we observe an action, we know almost immediately what goal is pursued by the actor. Strikingly, this applies also to pretend action (pantomime), which provides relevant information about the manipulation itself but not about the manipulated objects. The present fMRI study addressed the issue of goal inference from pretend action as compared with real action. We found differences as well as commonalities for the brain correlates of inferring goals from both types of action. They differed with regard to the weights of the underlying action observation network, indicating the exploitation of object information in the case of real actions and manipulation information in the case of pretense. However, goal inferences from manipulation information resulted in a common network for both real and pretend action. Interestingly, this latter network also comprised areas that are not identified by action observation and that might be due to the processing of scene gist and to the evaluation of fit of putative action goals. These findings suggest that observation of pretense emphasizes the requirement to internally simulate the observed act but rule out fundamental differences of how observers cope with real and pretend action.
INTRODUCTION
When we witness a pretend action, that is, pantomime, we have an immediate idea about what goal is pursued by the actor. How is this achieved? Although several imaging studies have investigated the performance of pantomime (Hermsdörfer, Terlinden, Mühlau, Goldenberg, & Wohlschläger, 2007; Imazu, Sugio, Tanaka, & Inui, 2007; Króliczak, Cavina-Pratesi, Goodman, & Culham, 2007; Buxbaum, Kyle, & Menon, 2005; Ohgami, Matsuo, Uchida, & Nakai, 2004; Moll et al., 2000) and the observation of real action (e.g., Newman-Norlund, van Schie, van Zuijlen, & Bekkering, 2007; Calvo-Merino, Grèzes, Glaser, Passingham, & Haggard, 2006; Costantini et al., 2005; Schubotz & von Cramon, 2004; Johnson-Frey et al., 2003; Manthey, Schubotz, & von Cramon, 2003; Buccino et al., 2001; for a topical review, cf. Vogt & Thomaschke, 2007), we are still ignorant about the neural correlates of understanding goals in observed pantomime. The only study implementing observation of pretend action (German, Niehaus, Roarty, Giesbrecht, & Miller, 2004) used a covert instruction, that is, participants were not directed to attend to the actors' intention or goal but rather to the occurrence of a screen interrupting the action.
In the present study, we tested the hypothesis that the brain correlates of inferring goals from pretend and real action are partly comparable, as specified in more detail in the following. Apparently there are good reasons to assume differences as well as commonalities for the brain correlates of inferring goals from pretend and real action, respectively.
On the one hand, although we are able to tell apart real from pretend action, it is not plausible to assume that neural processes subserving the interpretation of pretend actions should be fundamentally different from those subserving the interpretation of real actions. That is not only because both entail a multitude of perceptual, mnemonic, and cognitive processes that are triggered by a complex and socially relevant stimulus, but more specifically because both crucially entail, under natural conditions, the intense analysis of the hands' posture and movements, that is, manipulation information. Models from motor control theory have been recently used to describe how we analyze observed actions to infer our conspecifics' goals (Grush, 2004; Miall, 2003; Wolpert & Flanagan, 2001). These models, designed to describe how we continuously adapt our movements to changing environmental conditions and on-line error correction, state that multiple forward models are set up to predict upcoming events from an unfolding action, no matter whether performed by ourselves or merely observed. Accordingly, the processes underlying goal inference computationally amount to the running of a simulation of several action scripts in parallel until the best fitting script wins. For instance, observing an actor grasping a cup, we predict him to either bring it to his mouth, or clean it, or move it onto a shelf, or pass it to someone else, and so on. Testing hypotheses about currently valid goal options, no matter whether based on real or pretend actions, thus calls for sensorimotor transformation (for internal simulation of action scripts), working memory (for the selection of currently valid scripts), and internal reward evaluation (for the motivational driving of the ongoing estimation of script fit and incremental reduction of currently tested goal options).
On the other hand, although the analysis of observed action no matter whether real or pretend entails testing hypotheses about currently valid goals, these hypotheses are derived from at least partially different sources for real and pretend action. As object information cannot be exploited to infer the goal from pretend action, one would expect that components of the action observation network that are engaged in the analysis of hand postures and motions, that is, manipulation, should be particularly enhanced in pretend action. Conversely, those engaged in the processing of object information should be more active for real actions. Even when isolated, both sources of information, manipulations and objects, are known to provide excellent hints for action goals, as demonstrated by the early emergence of pretend or symbolic play in child development on the one hand (Fein, 1981) and experimental investigations of object affordance on the other hand (Helbig, Graf, & Kiefer, 2006).
The present fMRI study addressed the issue of goal inference from pretend action as compared with real action. To this end, we presented short video clips that showed either pretend or real actions (two-level factor, Type). We expected both pretend and real action to engage the action observation network reported in the literature, comprising, among others, the ventral premotor cortex (PMv), the anterior intraparietal sulcus (aIPS), and the posterior superior temporal sulcus (pSTS) (Rizzolatti & Craighero, 2004). However, due to the different significance of manipulation and object information in the analysis of pretend and real action, respectively, we expected higher signals in the extrastriate body area (EBA; Taylor, Wiggett, & Downing, 2007), the human motion-selective area (hMT; Greenlee, 2000; cf. also Peuskens, Vanrie, Verfaillie, & Orban, 2005), and the pSTS (Puce & Perrett, 2003) for pretend as compared with real action. Conversely, the lateral occipital complex (LOC; Grill-Spector, Kourtzi, & Kanwisher, 2001) was expected to be elevated for real as compared with pretend action.
Although this action observation network was expected to be weighted differently for pretend and real actions, the network reflects diverse perceptual, mnemonic, and cognitive processes not all of which necessarily contribute to goal inference. For instance, the attentive analysis of the observed action is expected to continue even after the goal has been successfully recognized. Therefore, in an attempt to exclusively tap goal inference processes, we implemented a switching protocol. The rationale of this protocol was related to the so-called repetition attenuation or suppression effect. Repetition suppression refers to the fact that the repetition of a stimulus leads to a decreased BOLD signal in areas that encode that stimulus (Hamilton & Grafton, 2006; Grill-Spector & Malach, 2001; Naccache & Dehaene, 2001; Thompson-Schill, D'Esposito, & Kan, 1999). Here we contrasted trials with new information (switch trials, hereafter) with those containing no new information (repetition trials, hereafter), relative to the preceding trial. Areas that are engaged in processing a particular type of information should be more engaged in switch trials than in repetition trials.
We used three types of trials (the three-level factor, Switch): (a) “goal switch” trials (G), in which both the manipulations and the objects of the presented action in the current trial n differed from those in the preceding trial n − 1; (b) “object switch” trials (O), in which only the object of the presented action in trial n differed from the object used in the preceding trial n − 1 (while the manipulations were repeated); and (c) “manipulation switch” trials (M), in which only the manipulations of the presented action in trial n differed from the manipulation in the preceding trial n − 1 (while the objects in use were repeated). By contrasting goal switch trials with object switch trials (G > O), we aimed to identify brain areas that contribute to goal inference on the basis of manipulation information. Note that because G trials provided both new manipulation and new object information, the contrasts G > O (and G > M, see below) did not identify a relative difference between two different types of information but rather the relative difference between new and old (=repeated) information of the same type.
Note that we consider this contrast to highlight goal inference processes according to the experimental operationalization of goal inference that we chose in the present study. In particular, we took the approach that goal inference amounts to a set of different cognitive subprocesses, not to a moment of unitary aha experience. Thus, for inferring goals on the basis of manipulation information, no matter whether from real or from pretend actions, we expected PMv, aIPS, and adjacent supramarginal gyrus (SMG) as areas relevant for sensorimotor transformation (Rizzolatti & Luppino, 2001), lateral prefrontal cortex (lPFC) subserving working memory in adaptive goal-directed behavior (Watanabe, 2007; Petrides, 2005), and OFC as an area known to be engaged in reward evaluation (Wallis, 2007).
Manipulation switch (M) trials were employed to balance the probability of novel information being provided either by objects or by manipulations. Object information, in contrast to manipulation information, was expected to be exploited only in real actions and largely ignored in pretend actions. Therefore, we expected the contrast G > M that reflected goal inference on the basis of object information to yield no common activations for real and pretend actions.
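The three trial types defined above can be expressed as a simple rule comparing trial n with trial n − 1. The following is a minimal sketch for illustration only; the `Trial` type, the `switch_type` function, and the example labels are hypothetical names and not part of the study's materials.

```python
from typing import NamedTuple


class Trial(NamedTuple):
    """Hypothetical description of one movie trial."""
    manipulation: str  # e.g. "pouring"
    objects: str       # e.g. "bottle+glass"


def switch_type(prev: Trial, curr: Trial) -> str:
    """Label trial n relative to trial n-1 as G, O, or M."""
    new_manipulation = curr.manipulation != prev.manipulation
    new_objects = curr.objects != prev.objects
    if new_manipulation and new_objects:
        return "G"  # goal switch: manipulations and objects both new
    if new_objects:
        return "O"  # object switch: manipulations repeated
    if new_manipulation:
        return "M"  # manipulation switch: objects repeated
    # Identical repetitions of trial n-1 were excluded from the design.
    raise ValueError("identical repetition is not a valid trial type")


previous = Trial("pouring", "bottle+glass")
labels = [
    switch_type(previous, Trial("writing", "pen+paper")),
    switch_type(previous, Trial("pouring", "bin+key")),
    switch_type(previous, Trial("stirring", "bottle+glass")),
]
```

Under this rule, G > O compares trials that both contain new objects and differ only in whether the manipulation is new, whereas G > M compares trials that both contain new manipulations and differ only in whether the objects are new.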
METHODS
Participants
Eighteen right-handed, healthy volunteers (eight women; age range = 21–32 years; mean age = 26.4 years) participated in the study. After being informed about potential risks and screened by a physician of the institution, subjects gave informed consent before participating. The experimental standards were approved by the local ethics committee of the University of Leipzig. Data were handled anonymously.
Stimuli and Tasks
Subjects were presented with movies showing actions and with short verbal action descriptions referring to these actions. Each trial (6 sec) started with a movie (2 sec) followed by a fixation phase. The length of the fixation phase (2.5–4 sec) depended on the variable jitter times (0, 500, 1000, or 1500 msec) that were inserted before the movie to enhance the temporal resolution of the BOLD response. Actions were either performed on appropriate objects (e.g., pouring water from a bottle into a glass) or on inappropriate objects (e.g., making the same movements with a bin and a key). These two classes of actions will hereafter be referred to as “real actions” and “pretend actions,” respectively. Note that to generate rich informational content from both manipulations and objects, each movie clip we presented showed a chain of specific manipulations (e.g., grasping, turning, and opening) and combinations of two objects (e.g., a cup and a spoon).
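The relation between jitter and fixation length can be checked with a few lines of arithmetic. This sketch assumes that the 6-sec trial consists only of the jitter, the 2-sec movie, and the fixation phase; the constant and function names are illustrative.

```python
TRIAL_MS = 6000                      # total trial length
MOVIE_MS = 2000                      # movie duration
JITTERS_MS = (0, 500, 1000, 1500)    # jitter inserted before the movie


def fixation_ms(jitter_ms: int) -> int:
    """Fixation length left over once jitter and movie are subtracted."""
    return TRIAL_MS - MOVIE_MS - jitter_ms


fixations = [fixation_ms(j) for j in JITTERS_MS]
# fixations spans 4000 ms down to 2500 ms, matching the stated 2.5-4 sec range
```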
Subjects were instructed to attend to the presented movies. They were informed that some of the movies were followed by a trial that started with an action description that either matched or did not match the content of the preceding movie. It was emphasized that it did not matter whether the actions to which the action description referred were real or pretend actions. Whenever a trial containing an action description was presented, subjects immediately delivered their responses on a two-button response box using their index finger for affirmative responses and their middle finger for rejections. Fifty percent of the action descriptions were to be affirmed and 50% to be rejected.
In addition to the two-level stimulus factor Type [real action (R) and pretend action (P)], a three-level factor Switch [goal (G), objects (O), and manipulations (M)] was implemented. The trial succession was implemented such that trials were either switch trials or repetition trials with respect to the manipulations, to the physical objects in use, or both (see Figure 1). All combinations of these two factors were possible except an identical repetition of trial n − 1. Moreover, the transition frequencies of real and pretend action were counterbalanced. Twenty-five percent of the movies (i.e., 21 of 84 real actions and 21 of 84 pretend actions) were followed by an action description that had the length of a regular trial (2 sec description, including response phase, plus 4 sec fixation phase), resulting in 42 additional trials. Each action description was followed by a dummy trial that was a regular movie of either a real or a pretend action but neither a regular switch nor a repetition trial. Accordingly, these dummy trials (n = 42) entered the analysis contrasting real and pretend actions (adding up to 84 + 21 = 105 trials for real actions and 84 + 21 = 105 trials for pretend actions) but not the analyses on switch or repetition effects of manipulations and objects. Finally, 20 empty trials (resting state) were presented intermixed with the experimental trials.
Experimental design. Examples for goals (e.g., “writing with pen”) are given in colored boxes, in which photos indicate the physical object actually presented in the corresponding movie clip. Levels of the experimental factor Switch (object, manipulation, and goal) correspond to columns, with columns 1–3 (solid frames) and 4–6 (dashed frames) corresponding to the two levels of the experimental factor Type (real and pretend). The first row of boxes represents the goal in a trial n − 1, whereas the residual rows represent examples of actions, in trial n, either repeating (second row) or switching (third row) the manipulated objects (first and fourth column), manipulations (second and fifth column), or goals (third and sixth column). The fields for goal repetition are empty because this trial type was not part of the experimental design.
Altogether, 272 trials were presented: 84 real actions plus 21 real action dummies, 84 pretend actions plus 21 pretend action dummies, 21 action descriptions following real actions, 21 action descriptions following pretend actions, and 20 empty trials.
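The trial total can be verified by summing the categories listed above. The dictionary keys below are illustrative labels; the counts are those given in the text.

```python
trial_counts = {
    "real_action": 84,
    "real_action_dummy": 21,
    "pretend_action": 84,
    "pretend_action_dummy": 21,
    "description_after_real": 21,
    "description_after_pretend": 21,
    "empty": 20,
}

total_trials = sum(trial_counts.values())

# Trials entering the real-versus-pretend contrast (regular plus dummy movies)
real_for_type_contrast = trial_counts["real_action"] + trial_counts["real_action_dummy"]
pretend_for_type_contrast = trial_counts["pretend_action"] + trial_counts["pretend_action_dummy"]
```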
MRI Data Acquisition
Imaging was carried out on a 3-T Bruker (Ettlingen, Germany) Medspec 30/100 system equipped with the standard birdcage head coil. Participants were placed on the scanner bed in a supine position with their right index and middle fingers positioned on the appropriate response buttons of a response box. Form-fitting cushions were utilized to prevent head, arm, and hand movements. Participants were provided earplugs so that scanner noise would be attenuated. Twenty-two axial slices (192 mm field of view; 64 × 64 pixel matrix; 4 mm thickness; 1 mm spacing; in-plane resolution of 3 × 3 mm) parallel to bicommissural line (AC–PC) covering the whole brain were acquired using a single-shot gradient EPI sequence (2000 msec repetition time; 30 msec echo time; 90° flip angle; 100 kHz acquisition bandwidth) sensitive to BOLD contrast. Prior to the functional imaging, 22 anatomical T1-weighted MDEFT images (Norris, 2000; Ugurbil et al., 1993) and 22 T1-weighted EPI images with the same spatial orientation as the functional data were acquired. In a separate session, high-resolution whole-brain images were acquired from each subject to improve the localization of activation foci using a T1-weighted 3-D-segmented MDEFT sequence covering the whole brain.
MRI Data Analysis
Data were processed using the software package LIPSIA (Lohmann et al., 2001). Functional data were first motion-corrected using a matching metric based on linear correlation. To correct for the temporal offset between the slices acquired in one image, a cubic-spline interpolation was employed. Low-frequency signal changes and baseline drifts were removed using a temporal high-pass filter with a cutoff frequency of 1/85 Hz. Spatial smoothing was performed with a Gaussian filter of 5.65 mm FWHM. To align the functional data slices with a 3-D stereotactic coordinate reference system, a rigid linear registration with six degrees of freedom (three rotational, three translational) was performed. The rotational and the translational parameters were acquired on the basis of the MDEFT and the EPI-T1 slices to achieve an optimal match between these slices and the individual 3-D reference dataset. The MDEFT volume dataset with 160 slices and 1-mm slice thickness was standardized to the Talairach stereotactic space (Talairach & Tournoux, 1988). The rotational and the translational parameters were subsequently transformed by linear scaling to a standard size. The resulting parameters were then used to transform the functional slices using trilinear interpolation, so that the resulting functional slices were aligned with the stereotactic coordinate system, thus generating output data with a spatial resolution of 3 × 3 × 3 mm (27 mm3).
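For orientation, the smoothing kernel width can be converted from FWHM to the standard deviation of the Gaussian using the general identity FWHM = 2·sqrt(2·ln 2)·σ. The sketch below applies this identity to the values reported above; expressing σ in units of the 3-mm in-plane voxel size is our illustration and is not a statement about how the software was parameterized.

```python
import math

FWHM_MM = 5.65    # reported smoothing kernel
VOXEL_MM = 3.0    # reported in-plane voxel size


def fwhm_to_sigma(fwhm: float) -> float:
    """Convert full width at half maximum to Gaussian standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_mm = fwhm_to_sigma(FWHM_MM)      # roughly 2.4 mm
sigma_voxels = sigma_mm / VOXEL_MM     # roughly 0.8 voxels
```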
The statistical evaluation was based on a least-squares estimation using the general linear model for serially autocorrelated observations (Friston et al., 1995; Worsley & Friston, 1995). The design matrix was generated with a box-car function, convolved with the hemodynamic response function and its first derivative. Brain activations were analyzed time-locked to onset of the movies, and the analyzed epoch comprised the full duration (2 sec) of the presented movies. The model equation, including the observation data, the design matrix, and the error term, was convolved with a Gaussian kernel of dispersion of 4 sec FWHM to account for the temporal autocorrelation (Worsley & Friston, 1995). In the following, contrast images, that is, beta value estimates of the raw-score differences between specified conditions, were generated for each participant. As all individual functional datasets were aligned to the same stereotactic reference space, the single-subject contrast images were entered into a second-level random effects analysis for each of the contrasts. One-sample t tests were employed for the group analyses across the contrast images of all subjects that indicated whether observed differences between conditions were significantly distinct from zero. The t values were subsequently transformed into Z scores. To correct for false-positive results, in a first step, an initial voxelwise z-threshold was set to Z = 2.33 (p = .01, uncorrected). In a second step, the results were corrected for multiple comparisons using cluster-size and cluster-value thresholds obtained by Monte Carlo simulations at a significance level of p = .005, that is, the reported activations are significantly activated at p < .005, corrected for multiple comparisons at the cluster level.
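The correspondence between the initial voxelwise threshold and its uncorrected p value can be reproduced from the standard normal distribution. This sketch assumes a one-tailed test, which is what the pairing of Z = 2.33 with p = .01 implies; the function name is illustrative.

```python
import math


def one_tailed_p(z: float) -> float:
    """Upper-tail probability of a standard normal variate exceeding z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p_uncorrected = one_tailed_p(2.33)   # close to .01
```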
To investigate more thoroughly the comparability of brain responses in the areas identified by the contrasts, percentage signal change analyses of the BOLD response were carried out. For the experimental and the resting baseline conditions, the mean signal change over a 6-sec epoch, starting 4 sec after movie onset, was extracted from selected voxels within significantly activated brain areas. The mean signal change of a voxel for each condition was calculated in relation to the mean signal intensity of that voxel across all time steps.
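The percentage signal change described above can be sketched as follows for a single voxel and a single condition. This is an illustrative simplification, not the analysis code used in the study: the function name is hypothetical, onsets are rounded to the nearest volume (ignoring sub-TR jitter), and the 2-sec repetition time is taken from the acquisition parameters.

```python
from statistics import mean


def percent_signal_change(series, onsets_s, tr_s=2.0, start_s=4.0, dur_s=6.0):
    """Mean signal in a post-onset epoch relative to the voxel's overall mean.

    series   : signal intensity of one voxel, one value per volume
    onsets_s : movie onsets (in seconds) for one condition
    """
    overall_mean = mean(series)            # mean across all time steps
    first_volume = round(start_s / tr_s)   # epoch starts 4 sec after onset
    n_volumes = round(dur_s / tr_s)        # epoch lasts 6 sec
    epoch_means = []
    for onset in onsets_s:
        start = round(onset / tr_s) + first_volume
        window = series[start:start + n_volumes]
        if len(window) == n_volumes:       # skip epochs cut off by the run end
            epoch_means.append(mean(window))
    return 100.0 * (mean(epoch_means) - overall_mean) / overall_mean


# Toy time series (made-up values): a response 4-10 sec after an onset at 0 sec
toy_series = [100.0, 100.0, 110.0, 110.0, 110.0, 100.0, 100.0, 100.0, 100.0, 100.0]
toy_psc = percent_signal_change(toy_series, onsets_s=[0.0])
```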
RESULTS
Behavioral Results
Performance was assessed by error rates and reaction times. Repeated measures ANOVAs were performed for each of these measures with the two-level factor Type (pretend and real) and the three-level factor Switch (goal, objects, and manipulations). Regarding reaction times, a main effect for the factor Type [F(1,17) = 13.791, p < .005] and an interaction Type × Switch [F(2,34) = 4.344, p < .05] were found. These effects reflected that responses to the action descriptions were slower for pretend (mean ± standard error, 421 ± 32 msec) as compared with real actions (393 ± 26 msec). The t tests showed that in the case of pretend actions, responses to trials in which only objects switched were faster (386 ± 26 msec) than when only manipulations were switched (446 ± 32 msec; t18 = 5.059, p < .001) and were marginally faster than when both manipulations and objects were switched (431 ± 39 msec; t18 = 1.858, p = .081). For error rates, a significant main effect was found for the factor Switch [F(2,34) = 5.023, p < .01] and for the factor Type [F(1,17) = 9.305, p < .05]. In particular, subjects made more errors when matching action descriptions with pretend actions (7.4 ± 3.3%) than with real actions (1.1 ± 0.8%). The t tests showed that action descriptions for trials in which only objects switched were easier (1.4 ± 1.1% errors) than for trials in which only manipulations (5.3 ± 2.2%, t18 = 3.487, p < .005) or both manipulations and objects (6.0 ± 2.8%, t18 = 2.955, p < .01) switched. Overall, behavioral performance indicated that inferring goals was slightly more demanding from pretend than from real actions, and that the inspection of manipulation information was more demanding than the inspection of object information. However, as all effects remained below differences of 60 msec and about 6% errors, we did not expect them to account for activation differences in our BOLD contrasts.
fMRI Results
Observing Real and Pretend Action
The network commonly activated by both the observation of real action and that of pretend action [conjunction (real > rest) ∩ (pretend > rest)] comprised the bilateral PMv [Brodmann's area (BA) 6/44], the left inferior frontal sulcus (IFS; BA 9/46), the aIPS, the left SMG, the left dorsal premotor cortex (PMd), the left presupplementary motor area (pre-SMA), and the left superior intraparietal sulcus. Extensive activation was also found in the fusiform gyrus (FG) and in the occipital gyri including probably the LOC as well as an area we will hereafter refer to as EBA/hMT, as hMT overlaps closely with EBA (Downing, Wiggett, & Peelen, 2007; Figure 2, Table 1).
The network conjointly activated by the observation of real and pretend actions (as compared with rest) comprised the areas that are typically seen for action observation, including PMv extending from BA 6 into BA 44, anterior parietal regions (aIPS and SMG) as well as the pSTS. For further abbreviations, see Results section.
Table 1. Action Observation Network Common to Pretend and Real Action: Conjunction of Observation of Real Action as Compared with Rest and Observation of Pretend Action as Compared with Rest
| Area | x | y | z | Z |
|---|---|---|---|---|
| Conjunction of Real Action versus Rest and Pretend Action versus Rest | | | | |
| PMv | 37 | 6 | 30 | 5.25 |
| | −47 | 6 | 33 | 5.9 |
| PMd | −26 | −8 | 51 | 5.16 |
| pre-SMA | −5 | 3 | 51 | 4.14 |
| IFS (BA 9/46) | −38 | 21 | 24 | 5.05 |
| Anterior IPS | −35 | −35 | 42 | 5.862 |
| | 31 | −35 | 45 | 5.178 |
| Superior IPS | −29 | −71 | 27 | 5.984 |
| SMG (BA 40) | −59 | −23 | 34 | 6.127 |
| EBA/hMT/pSTS | 46 | −54 | 3 | 6.789 |
| | −41 | −63 | −3 | 6.93 |
| FG | 43 | −45 | −6 | 6.536 |
| LOC | −35 | −84 | 3 | 5.98 |
| | 31 | −81 | 3 | 6.622 |
Anatomical specification, Talairach coordinates, maximum Z value (volume is not given as all activations were local maxima of a common activation).
Abbreviations: PMv = ventral premotor cortex; PMd = dorsal premotor cortex; pre-SMA = presupplementary motor area; IFS = inferior frontal sulcus; IPS = intraparietal sulcus; SMG = supramarginal gyrus; EBA/hMT = extrastriate body area/human motion-selective area; pSTS = posterior superior temporal sulcus; FG = fusiform gyrus; LOC = lateral occipital complex.
Observing Pretend versus Real Action, and Vice Versa
Among these areas, the observation of pretend action yielded significantly more activation than real action ((pretend > real) ∩ ((real > rest) ∩ (pretend > rest))) in the left PMv, the left aIPS extending into SMG, the left IFS (BA 9/46), the left pSTS, and the right EBA/hMT (Figure 3A, Table 2). In contrast, the observation of real action yielded significantly more activation than pretend action ((real > pretend) ∩ ((real > rest) ∩ (pretend > rest))) in the FG and/or LOC bilaterally (hereafter LOC), comprising anterior and posterior compartments, as well as in the right superior parietal lobule (BA 7) and in the right postcentral gyrus (Figure 3B, Table 2).
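Read literally, the masked contrasts above select voxels that exceed threshold in each of the component maps. The following sketch illustrates that set logic only; it is not the procedure implemented in the analysis software, and the voxel labels, Z values, and function names are made up for illustration.

```python
def suprathreshold(zmap, z_crit=2.33):
    """Voxels whose Z value exceeds the threshold in one contrast map."""
    return {voxel for voxel, z in zmap.items() if z > z_crit}


def conjunction(*zmaps, z_crit=2.33):
    """Voxels that exceed the threshold in every supplied contrast map."""
    return set.intersection(*(suprathreshold(m, z_crit) for m in zmaps))


# Toy maps keyed by hypothetical voxel labels
pretend_gt_real = {"v1": 3.1, "v2": 2.9, "v3": 0.4}
real_gt_rest = {"v1": 5.0, "v2": 1.2, "v3": 4.8}
pretend_gt_rest = {"v1": 5.5, "v2": 4.0, "v3": 4.9}

pretend_specific = conjunction(pretend_gt_real, real_gt_rest, pretend_gt_rest)
```

In this toy example only `v1` survives: `v2` is not active for real action against rest, and `v3` does not differ between pretend and real action.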
Direct contrasts between observation of real and pretend actions revealed different weights of the action observation network. (A) Areas elevated by the observation of pretend as compared with real action included left IFS (BA 9/46), left PMv, left aIPS, and EBA/hMT extending into pSTS in the left hemisphere. (B) Conversely, areas more engaged in processing real as compared with pretend actions were primarily found in LOC bilaterally.
Table 2. Different Weights of the Action Observation Network: Direct Contrasts between Observation of Real and Pretend Action ((Pretend > Real) ∩ ((Real > Rest) ∩ (Pretend > Rest))) and ((Real > Pretend) ∩ ((Real > Rest) ∩ (Pretend > Rest)))
| Area | x | y | z | Z | Volume (mm3) |
|---|---|---|---|---|---|
| Observation of Pretend versus Real Action | | | | | |
| PMv | −45 | 3 | 33 | 4.35 | 3672 |
| IFS (BA 9/46) | −42 | 30 | 15 | 3.50 | 1296 |
| aIPS | −42 | −39 | 54 | 2.99 | 2025 |
| SMG | −53 | −23 | 36 | 2.95 | l.m. |
| EBA/hMT/pSTS | −54 | −54 | 12 | 3.32 | 3078 |
| EBA/hMT | 48 | −57 | 3 | 3.80 | 1728 |
| Observation of Real versus Pretend Action | | | | | |
| LOC | −27 | −66 | −9 | 4.12 | 9504 |
| | −29 | −92 | 3 | 4.05 | l.m. |
| | 21 | −69 | −6 | 4.82 | 18738 |
| | 22 | −83 | −3 | 4.63 | l.m. |
| Superior parietal lobule (SPL) | 21 | −48 | 66 | 3.77 | 2322 |
| Postcentral gyrus (SII) | 63 | −12 | 27 | 4.32 | 3294 |
l.m. = local maximum.
Inferring Goals
Goals can be inferred from observed action on the basis of manipulation and object information. To identify brain areas involved in goal inference on the basis of manipulations, we analyzed the effect of providing subjects with new manipulation information (manipulation switch effect), and subsequently to identify brain areas involved in goal inference on the basis of object information, we analyzed the effect of providing subjects with new object information (object switch effect). Note that in a direct comparison between the M and the O trials, the effects of manipulation switches and object repetition would have been inextricably confounded. Accordingly, manipulation switch effects were tested while controlling for object switches by contrasting G with O trials and object switch effects by contrasting G with M trials.
Inferring Goals by Analyzing New Manipulations
For the observation of actions in which both manipulations and objects switched as compared with those in which only objects switched (G > O, manipulation switch effect), activations were located in the left central OFC (BA 11/10), left anterior IFS (BA 9/46), left PMv (BA 6), slightly extending into the opercular part of the inferior frontal gyrus (BA 44), and in a right inferior temporal region around the collateral sulcus, probably reflecting the parahippocampal place area (PPA; Epstein & Kanwisher, 1998) (Figure 4, Table 3). A signal change analysis in the identified areas corroborated that manipulation switch effects were not statistically different for real and pretend actions. There was a main effect for the factor Type (real action and pretend action) due to higher signals in pretend as compared with real actions in PMv [F(2,34) = 29.332, p < .001], BA 44 [F(2,34) = 6.113, p < .05], and IFS [F(2,34) = 10.016, p < .01]. However, there were no significant interactions of Type × Switch, underlining that the considered brain areas showed a comparable manipulation switch effect in both real and pretend actions. We found a main effect for the factor Switch (object, manipulation, or both) in OFC [F(2,34) = 10.325, p < .005] due to significant differences between G > O (t18 = 6.304, p < .001) and G > M (t18 = 2.485, p < .05); the same was true for BA 44 [Switch, F(2,34) = 10.708, p < .001; G > O, t18 = 3.977, p < .001; G > M, t18 = 3.169, p < .01], for IFS [Switch, F(2,34) = 11.414, p < .001; G > O, t18 = 5.203, p < .001; G > M, t18 = 3.708, p < .005], and for PMv [Switch, F(2,34) = 6.598, p < .01; G > O, t18 = 4.22, p < .001; G > M, t18 = 2.876, p < .01]; only for PPA did G and M not differ, although they showed the same trend [Switch, F(2,34) = 5.15, p < .05; G > O, t18 = 4.018, p < .001].
Manipulation switch. Contrast between trials showing new versus repeated manipulations (G > O) and corresponding signal changes in (1) OFC, (2) PPA, (3) anterior, (4) opercular inferior frontal sulcus (BA 44), and (5) PMv. These areas were elevated when goals were to be inferred on the basis of changed manipulation information. Bar charts indicate percentage signal changes due to new versus old objects (blue), manipulations (yellow), or both (i.e., goals; red). White bars show the signal during rest.
Inferring Goals: Contrasts between New and Repeated/Old Information about Manipulations (G > O) and Objects (G > M)
| Area | x | y | z | Z | mm³ |
|---|---|---|---|---|---|
| New versus Repeated Manipulations (Contrast G > O) | | | | | |
| OFC (BA 11/10) | −24 | 45 | −3 | 4.80 | 2619 |
| PMv | −42 | 6 | 27 | 3.71 | 3132 |
| IFS pars opercularis (BA 44) | −50 | 18 | 21 | 3.36 | l.m. |
| IFG/IFS | −41 | 33 | 9 | 3.83 | 864 |
| PPA | 36 | −33 | −6 | 5.12 | 756 |
| New versus Repeated Objects (Contrast G > M) | | | | | |
| LOC | −33 | −66 | −3 | 3.31 | 1188 |
| | 33 | −81 | −3 | 3.72 | 3375 |
| | −33 | −48 | −9 | 4.33 | 7101 |
| | 33 | −48 | −9 | 3.58 | 2781 |
| Sensorimotor cortex | 33 | −18 | 54 | 3.83 | 6939 |
Finally, we calculated the interaction contrast G > O × Pretend > Real. In line with the signal change analyses, this contrast did not yield any significant activations, thereby corroborating that pretend and real actions were associated with the same activation pattern with respect to switching.
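As an illustration only, the interaction contrast G > O × Pretend > Real can be written as a weight vector over the six cells of the Type × Switch design. The sketch below is not the authors' analysis pipeline; the condition labels for real actions (OR, MR, GR) and the helper names are assumptions introduced here, mirroring the OP, MP, GP labels used for pretend actions.

```python
# Hypothetical condition labels: first letter = switch type
# (O = object, M = manipulation, G = both/goal), second letter = action type
# (R = real, P = pretend). OR/MR/GR are assumed by analogy with OP/MP/GP.
CONDITIONS = ["OR", "MR", "GR", "OP", "MP", "GP"]


def simple_contrast(positive, negative):
    """Weights for 'positive > negative': +1, -1, or 0 per condition."""
    return [(1 if c in positive else 0) - (1 if c in negative else 0)
            for c in CONDITIONS]


def interaction_contrast():
    """(G > O within pretend) minus (G > O within real)."""
    g_gt_o_pretend = simple_contrast({"GP"}, {"OP"})
    g_gt_o_real = simple_contrast({"GR"}, {"OR"})
    return [p - r for p, r in zip(g_gt_o_pretend, g_gt_o_real)]


def apply_contrast(weights, estimates):
    """Weighted sum of per-condition estimates (e.g., signal change values)."""
    return sum(w * estimates[c] for w, c in zip(weights, CONDITIONS))


weights = interaction_contrast()
print(dict(zip(CONDITIONS, weights)))
```

With these weights, a manipulation switch effect (G minus O) of equal size in real and pretend actions sums to zero, which corresponds to the null result reported for this contrast; a main effect of Type alone does not contribute to it.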
Inferring Goals by Analyzing New Objects
As stated at the outset, we did not expect common activations for real and pretend actions for trials presenting new objects versus repeated objects, as object information was expected to be largely ignored in pretend actions. Unexpectedly, however, for object switch versus object repetition trials (G > M), activation was found bilaterally in four subregions of the LOC (Figure 5, Table 3). To further explore this result, a signal change analysis was calculated, revealing a pattern that was comparable for all four analyzed areas: the signal change was comparably high for all conditions except manipulation switch trials in pretend action, which induced a lower signal. For left and right anterior LOC (aLOC) and right posterior LOC (pLOC), there was a main effect of Switch [left aLOC, F(2,34) = 7.272, p < .01; right aLOC, F(2,34) = 12.177, p < .001; right pLOC, F(2,34) = 6.802, p < .005] but also an interaction Switch × Type [left aLOC, F(2,34) = 4.008, p < .05; right aLOC, F(2,34) = 10.597, p < .001; tendency in right pLOC, F(2,34) = 3.187, p = .054] that was due to a higher signal in all pretend actions that contained an object switch [left aLOC: OP > MP (t18 = 4.215, p < .001), GP > MP (t18 = 3.933, p < .001); right aLOC: OP > MP (t18 = 6.815, p < .001), GP > MP (t18 = 4.149, p < .001); right pLOC: OP > MP (t18 = 5.403, p < .001), GP > MP (t18 = 4.106, p < .001)]. Left pLOC also showed a main effect of Switch [F(2,34) = 7.453, p < .005] that was due to a higher signal in all actions that contained an object switch [O > M (t18 = 3.558, p < .005), G > M (t18 = 3.104, p < .01)] and a main effect of Type due to a higher signal in real as compared with pretend action [F(2,34) = 18.327, p < .001] but no interaction Switch × Type.
Object switch. Contrast between trials showing new versus repeated objects (G > M) yielded extended activation in the LOC (aLOC = anterior, pLOC = posterior). Signal changes in the four local maxima show that repetition of objects caused attenuation only in pretend actions, whereas object information was processed in real actions no matter whether repeated (yellow bars) or new (red and blue bars).
DISCUSSION
The present fMRI study investigated goal inference from pretend action as compared with real action. The inspection of real and pretend actions was considered to differ with regard to the weighting of specific components of the action observation network, pointing toward an attentional focus on manipulation information for pretend relative to real actions and an attentional focus on object information for real relative to pretend actions. In contrast, goal inference as an internal simulation of the currently observed manipulations was expected to yield comparable activations for real and pretend actions in regions related to sensorimotor integration, working memory, and internal reward evaluation.
Observing Pretend versus Real Actions: Relying on Manipulation Information
Relative to the resting baseline, observation of real and pretend actions yielded highly similar brain responses in the typical action observation network, emphasizing commonalities rather than fundamental differences in the way we cope with real and pretend action. However, among the areas that were activated for observation of real and pretend actions, four areas showed a relatively enhanced response to pretend as compared with real actions: left PMv, left lPFC, left aIPS (extending into SMG), and right and left EBA/hMT, the latter extending into left pSTS as well (for anatomical connections, cf. Schmahmann et al., 2007). As the two frontal areas, left PMv and left lPFC, were also found to respond to goal switches, they will be considered separately below.
Enhanced activation in EBA/hMT and pSTS had been hypothesized for pretend versus real action observation due to their function in body, motion, and biological motion processing, respectively (Downing et al., 2007; Taylor et al., 2007; Peelen, Wiggett, & Downing, 2006; Puce & Perrett, 2003; Downing, Jiang, Shuman, & Kanwisher, 2001). Because the presence of biological motion as well as body parts was balanced between pretend and real actions, we suggest that the inspection of motion and body information was intensified during pretend actions, that is, when goal inference had to rely solely on manipulation information, whereas in real actions, object information could also be exploited for the same purpose.
However, the fact that the left aIPS/SMG was more active for pretend as compared with real action observation adds a very interesting facet, as this region has been reported to be relevant for the performance of pantomime (Johnson-Frey, 2004; Ohgami et al., 2004; Moll et al., 2000) and has been suggested to support the explicit retrieval of tool-related hand movements for different behavioral purposes (Imazu et al., 2007). Moreover, Hamilton and Grafton (2006) reported that activation in IPS is systematically attenuated by the repetition of reached objects (but not reaching trajectories) during action observation. Our findings particularly corroborate results from patients with left inferior parietal lesions demonstrating a strong relationship between the recognition and the imitation (performance) of object-related pantomime (Buxbaum, Kyle, et al., 2005), which can be considered to reflect deficits in internal models for planning object-related actions (Buxbaum, Johnson-Frey, & Bartlett-Williams, 2005). Along these lines, elevation of inferior parietal activation for observing pretend as compared with real action can be interpreted as a manifestation of higher demands on this internal modeling due to missing (external) object information.
Notably, inferior parietal activation comprised both SMG and aIPS in the present study. Macaque research implicates two fairly different processes in the putative homologues of SMG (macaque area PF) and aIPS (macaque area AIP; cf. Committeri et al., 2007; McGeoch, Brang, & Ramachandran, 2007). The former contains parietal mirror neurons and mediates between PMv and pSTS in a network for both action observation and action execution (Keysers & Perrett, 2004); the latter is suggested to provide the PMv with a pragmatic description of objects (Fagg & Arbib, 1998). Functionally, it is more plausible to interpret our findings along the lines of parietal mirror neurons and the analysis of observed action. However, additional involvement of aIPS could be explained in two ways. As object information pointing toward the currently valid goal was not available in pretend action, aIPS may reflect the imagery of tested classes of objects matching the currently observed manipulations, driven by top–down modulation from ventral premotor areas. Indeed, imagery is known to lead to higher BOLD responses than perception in many cases (e.g., Imazu et al., 2007). Alternatively, aIPS may reflect the suppression of currently invalid pragmatic object information stemming from the wildcard objects presented in the pretend action condition. Note in this context that one could suggest that the presence of inappropriate objects in the pretend condition might lead to activation of brain regions involved in dealing with incongruity. However, the two areas most often implicated in enhanced cognitive control during resolution of incongruity, that is, the ACC and the DLPF cortex (Carter & van Veen, 2007), were not part of the pretend versus real action contrast. Therefore, it appears that incongruity effects and the resulting increase in cognitive control did not play a significant role in the pretend action condition.
Observing Real versus Pretend Actions: Relying on Object Information
The contrast between real and pretend action observation was expected to stress the exploitation of object information. It revealed extended activity in LOC, an area known to play an important role in human object recognition. In the context of the present study, LOC's involvement is highly plausible as an area that represents the shapes of objects independent of low-level visual cues such as color, motion, or texture (Grill-Spector et al., 2001). As an object's shape rather than its color or texture determines hand posture and motion during action, recognition of object shape was of primary task relevance.
Inferring Goals: Inspecting New Manipulations
Using a switching protocol, we set out to more specifically investigate goal inference on the basis of manipulation information. Although this information was considered to be especially relevant for the understanding of pretend actions, it was expected to be exploited for goal inference in both real and pretend actions, and it was taken to be particularly elevated in trials in which this type of information was altered relative to the preceding trial. The G > O contrast revealed enhanced activation in four areas, two of which were part of the action observation network and also enhanced for pretend versus real action, namely, the left ventral premotor cortex (PMv), here extending into the pars opercularis of the inferior frontal gyrus (BA 44), and the left lPFC. In contrast, the left OFC (BA 11/10, OFC hereafter) and the right PPA were not primarily identified in the action observation network but exclusively found by contrasting G > O.
As to the functions attributed to the OFC, a recent review suggests that OFC holds information about the value of reward outcomes in working memory when we formulate action plans and predict and monitor expected outcomes (Wallis, 2007). To fully appreciate this interpretation with regard to the present findings, it is important to consider that, firstly, we found OFC (as well as PPA) only for manipulation switches but not for object switches (see below), and secondly, this activity was observed when contrasting trials in which both manipulation and objects were switched (goal switch) with trials in which only objects switched, thereby ruling out unspecific switching effects. In close keeping with Wallis (2007), we suggest that the OFC subserves the assessment of trade-offs when a scenario allows for alternative action goals and determines how well the actually observed outcome satisfies currently tested forward models. Concurrently with OFC, manipulation switches enhanced activity in the right PPA and in two areas of the action observation network: left lPFC (BA 9/46) and left PMv. These areas are interconnected: OFC has some connections to PM (Morecraft, Geula, & Mesulam, 1992) and intense connections with the lPFC (see Wallis, 2007), which in turn has connections with the PMv (Lu, Preston, & Strick, 1994; Barbas & Pandya, 1987; Matelli, Camarda, Glickstein, & Rizzolatti, 1986); moreover, PPA projects to both OFC (Barbas, 1988) and lPFC (Goldman-Rakic et al., 1984). How do these areas functionally interact in the context of goal switches?
It is largely agreed that activation of our motor system during action observation is due to an internal simulation of the observed action; that is, the system is activated as if we were performing the observed action ourselves (Jeannerod, 2001). Computationally, the notion of multiple forward models running in parallel has been used to explain the efficiency with which we engage in goal inference when observing actions (Miall, 2003; Wolpert & Flanagan, 2001). Putting these perspectives together, the picture emerging from the present findings is that OFC activation reflects the calculation of the value of a reward outcome of the currently tested forward models of the observed action.
Wallis (2007) elaborates that the lPFC uses the reward signal from the OFC to plan behavior toward obtaining the goal. In the context of the presently used action observation task, the lPFC hence may select currently potentially relevant goals to be subjected to a reward analysis by OFC. This also includes holding these alternative plans in working memory. Potentially relevant action goals are provided bottom–up by input from PMv, which runs sensorimotor simulations in connection with its parietal projection sites. On the basis of the reward signals from OFC, lPFC may in turn alter the selection of action goals currently tested (simulated) in PMv. Notably, the parietal projection site of the PMv, the aIPS, was missing in the G > O contrast. Thus, activity in aIPS was not significantly enhanced by the observation of new as compared with repeated manipulations, possibly because novelty of object information was controlled for in this contrast.
With respect to the functional contribution of the PPA in this context, an exciting explanation refers to this area's role in processing scene gist (Epstein, 2005). Just as for OFC, PPA was only seen for goal switch effects but not for action observation per se, indicating that OFC and PPA may provide a transient input, possibly being top–down in the case of OFC and bottom–up in that of PPA, to lPFC, thereby modulating its influence on PMv.
Inferring Goals: Inspecting New Objects
In contrast to the inspection of new versus repeated manipulations, which resulted in the same pattern of activations for real and pretend actions, the inspection of new versus repeated objects was found to modulate LOC exclusively for pretend actions (see Figure 5). The pattern of activations suggests that, whereas the processing of object shape was suppressed or canceled early in the case of object repetition in pretend actions, new manipulations triggered a reconsideration of object information, even of repeated objects, in the case of real actions.
“Theory of Mind” versus “Mirror Neuron System”
Contrasting the observation of pretend with real action, we found activation in areas that have been discussed as belonging to the so-called “mirror neuron system” (MNS) network (Rizzolatti & Craighero, 2004). In contrast, the only other study comparing the observation of pretend and real action (German et al., 2004) reported activations that relate to mental state attribution, which is also referred to as “theory of mind” (ToM) (cf. Gallagher & Frith, 2003). There is an ongoing debate on the relationship between MNS and ToM and their respective roles in understanding observed actions (cf. Saxe, 2005). Although we are not in the position to bridge the puzzling gap between the neural correlates of goal inference and mind reading in general, considerable differences between the experimental design of the German et al. (2004) study and ours have to be considered, which may account for the discrepant results. Firstly, the sight of whole persons may provoke ToM processes much more strongly than the sight of hands on objects. Secondly, because objects were entirely missing in the pretend actions employed by German et al., subjects may have been more engaged in active considerations and inferences about the class of objects that was pretended to be manipulated. Finally, short movies may have biased a direct matching of the observed manipulations onto one's own action repertoire, whereas in the case of longer observation times, additional cognitive processes may evolve, even automatically, that relate to the actor's intentions and mental states.
Inferring Goals from Pretense: General Concluding Remarks
The picture emerging from the present findings sheds new light on the mechanisms driving the inference of goals from observed pretend as well as observed real action. We found clear evidence for both commonalities and differences between the neural correlates of understanding pretend and real actions. Strikingly, our findings show that not all of the components of the action observation network are engaged in the inference of new goals in observed actions and that, conversely, not all areas that are engaged in the inference of a new action goal are an integral part of the action observation network. This dissociation may be due to the fact that goal inference processes are more subtle and phasic than the massive and tonic activity triggered by the observation of an ongoing action with all its facets of perceptual analysis and mnemonic demands.
The present results imply that the requirement to infer a new goal draws inter alia on a selection of those areas that subserve the analysis of manipulation rather than object information. The bias toward manipulation is probably linked to the fact that we operationalized goals as chains of manipulations of objects and not, as would certainly have been possible, as object targets (e.g., a disk, a cookie), spatial targets (e.g., left, right), or combinations of both (cf. Hamilton & Grafton, 2006). Our findings imply that goals, if operationalized this way, are more closely linked to how somebody is moving toward objects than to the objects themselves, possibly because a chain of hand postures and movements is less ambiguous with respect to the intended goal, whereas objects provide diverse options for goal-directed manipulations.
Acknowledgments
We cordially thank Anna Abraham, Uta Wolfensteller, and Kirsten Volz for their very helpful comments on the manuscript, Gaby Lohmann and Karsten Mueller for support in MRI statistics, Andrea Gast-Sandmann and Kerstin Flake for support in graphic layout and stimulus materials, and Marcel Muecke for experimental assistance.
Reprint requests should be sent to Ricarda I. Schubotz, Motor Cognition Group, Max Planck Institute for Neurological Research, Gleueler Str. 50, 50931 Köln, Germany, or via e-mail: [email protected].