Abstract

According to embodied theories of language, people understand a verb like throw, at least in part, by mentally simulating throwing. This implicit simulation is often assumed to be similar or identical to motor imagery. Here we used fMRI to test whether implicit simulations of actions during language understanding involve the same cortical motor regions as explicit motor imagery. Healthy participants were presented with verbs related to hand actions (e.g., to throw) and nonmanual actions (e.g., to kneel). They either read these verbs (lexical decision task) or actively imagined performing the actions named by the verbs (imagery task). Primary motor cortex showed effector-specific activation during imagery, but not during lexical decision. Parts of premotor cortex distinguished manual from nonmanual actions during both lexical decision and imagery, but there was no overlap or correlation between regions activated during the two tasks. These dissociations suggest that implicit simulation and explicit imagery cued by action verbs may involve different types of motor representations and that the construct of “mental simulation” should be distinguished from “mental imagery” in embodied theories of language.

INTRODUCTION

According to embodied theories of semantics, we use our motor system to understand language about actions. For instance, on reading “he throws the ball,” embodied accounts postulate that the reader mentally simulates this action, using some of the same motor areas that are activated when executing actual throwing (e.g., Pulvermuller, 2005). Implicit simulation during language understanding is often assumed to be the same as explicitly imagining linguistic content. As Gallese and Lakoff (2005, p. 456) put forward, “the same neural substrate used in imagining is used in understanding.” They argue that imagination is necessary to understand action-related sentences such as “Harry picked up the glass” and write that “if you cannot imagine picking up a glass or seeing someone picking up a glass, then you cannot understand that sentence” (ibid, p. 456). Here we aim to directly test and refine the relationship between imagining actions and understanding action language.

Several neuroimaging studies support the conjecture that motor areas play some role in understanding action verbs. For instance, Hauk, Johnsrude, and Pulvermuller (2004) found overlap in premotor cortex between movement of foot and fingers and during reading of foot- or hand-related action verbs (e.g., “kick”, “pick”). Likewise, areas in premotor cortex activated during observation of actions done with different effectors are also activated on reading of sentences describing these actions (Aziz-Zadeh, Wilson, Rizzolatti, & Iacoboni, 2006; see also Raposo, Moss, Stamatakis, & Tyler, 2009; Boulenger, Hauk, & Pulvermuller, 2009; Tettamanti et al., 2005; but see Postle, McMahon, Ashton, Meredith, & de Zubicaray, 2008; Sato, Mengarelli, Riggio, Gallese, & Buccino, 2008). From these and other findings, it has been concluded that understanding action language involves activating parts of premotor cortex in a somatotopic way, as is also observed during motor control (e.g., Woolsey, 1963). This is to be expected if understanding action language involves implicitly simulating an action (for reviews, see Aziz-Zadeh & Damasio, 2008; Kemmerer & Gonzalez-Castillo, 2008; Mahon & Caramazza, 2008; Willems & Hagoort, 2007; Pulvermuller, 2005).

In addition to supporting action word understanding, a host of studies implicate premotor cortex in supporting motor imagery of hand movements (e.g., Helmich, de Lange, Bloem, & Toni, 2007; de Lange, Helmich, & Toni, 2006; de Lange, Hagoort, & Toni, 2005; Cisek & Kalaska, 2004; Johnson et al., 2002; Gerardin et al., 2000; Bonda, Petrides, Frey, & Evans, 1995). Yet, the relationship between the premotor cortex correlates of motor imagery and action language understanding is not well understood.

In this study, we aimed to elucidate the relationship between motor imagery and action semantics by directly comparing neural activity during action verb understanding with activity during explicit mental imagery of actions cued by the same verbs. In one fMRI run, participants performed a lexical decision task on action verbs, and in a second run they actively imagined performing the actions described by these verbs. To gain specificity of neural responses and for reasons of experimental control (see Methods), we contrasted action verbs related to hand actions (e.g., to throw) with nonmanual action verbs (e.g., to kneel).

Gallese and Lakoff's (2005) conjecture makes the clear prediction that understanding an action verb and imagining performing that same action should rely on the same neural tissue, most notably premotor cortex. This finding would be in line with the idea that through Hebbian learning, cell assemblies of neurons firing together during execution and observation of actions come to constitute the semantic representation of an action verb (Pulvermuller, 2005). Alternatively, it is possible that distinct representations in motor cortex support action verb understanding and explicit motor imagery. This finding would require a refinement to theories of embodied semantics, suggesting that activation of the motor system during action verb understanding should be distinguished from motor imagery.

Before we move on to describing the experiment, we will first clarify what we mean by simulation and by imagery. Implicit motor simulations are often characterized as partial reenactments of prior actions (e.g., Barsalou, 1999, 2009). However, the computational function that such reenactments could serve is not clear. When we use the term simulation in this article, we do not refer to a reenactment of prior experiences, which seem functionally unmotivated. Rather, we posit that motor simulations are preenactments of potential future experiences. A word like grasp can serve as a cue to activate neural circuits involved in partial preparation for grasping (for compatible proposals, see Barsalou, 2009; Zwaan, 2004). This schematic, unconscious, prospective activation of effector-specific regions in premotor cortex presumably facilitates further action planning if subsequent cues call for grasping to be executed or to be imagined explicitly.

Motor imagery, by contrast, can be understood as covert enactment of an action. Like overt motor execution, motor imagery may entail the generation of an action plan (inverse model) as well as a prediction of the action's sensory consequences (forward model) (e.g., Grush, 2004; Wolpert & Ghahramani, 2000). The generation of the forward model can be described as a kind of simulation, but this is not the way we use the term here.

METHODS

Subjects

We tested 20 healthy participants (14 women; mean age = 22.7 years, range = 19–28 years) with no known history of neurological problems, dyslexia, or other language-related problems or hearing complaints and with normal or corrected-to-normal vision. All participants were right-handed (Oldfield, 1971; mean Edinburgh Handedness Inventory score = 97, range = 82–100) and gave written informed consent in accordance with the declaration of Helsinki. The study was approved by the local ethics committee.

Materials

Stimuli were 96 Dutch verbs expressing concrete actions. Half of these were related to manual actions (man), and half were not related to manual actions (nonman). This distinction was pretested with a larger number of verbs in a group of raters who did not participate in the fMRI experiment (n = 16). For each verb, raters scored how much they associated that action with their hand(s) and, if applicable, whether they preferred to act out the action with their left, right, or both hands. Man words were significantly more associated with hand actions than nonman words, t(94) = 23.60, p < .001. On average, 79% of raters indicated that they tend to perform the action with their dominant hand (SD = 11.8%, median = 81%, mode = 88%), that is, unimanually. Man and nonman word lists did not differ in imageability (assessed by the same group of raters), t(94) < 1, number of phonemes, t(94) < 1, lexical frequency (taken from the CELEX database; Baayen, Piepenbrock, & Rijn, 1993), t(94) < 1, and number of letters, t(94) = 1.51, p = .13. From the materials that were rejected on the basis of the pretest, 16 filler items were created. In addition, 16 phonotactically legal pseudowords were created, all with the suffix typical of the regular infinitive form in Dutch (“-en”).

Experimental Procedure

Stimuli were presented using Presentation software (www.nbs.com, version 10.2) through a projector from outside the scanner room onto a screen at the back of the scanner bore and were visible to the participants through a mirror attached to the head coil. There were two separate task runs: lexical decision (LD) and imagery (IM) (Figure 1). In the LD run, participants were instructed to indicate as quickly and accurately as possible whether a word was an existing word or not on 25% of the trials (fillers and nonwords). After presentation of fillers and pseudowords, participants saw a response screen with the question whether the previous word was an existing word with answer options “yes” and “no” on the left or right side of the screen, which could be indicated by pressing a button with the left or right index finger. Response side was nonpredictably balanced across trials to prevent a biased motor response to the left or right hand. Participants had 1500 msec to respond and got feedback on the screen when they were too slow. A stimulus list of 128 stimuli (48 man + 48 nonman + 16 fillers + 16 pseudowords) was created and pseudorandomized with the constraint that the same condition was not repeated more than three times in a row. A mirrored version of this list was presented to half of the participants. Participants were familiarized with the procedure by means of 10 practice items containing different words than used in the remainder of the experiment.
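The pseudorandomization constraint described above (no condition repeated more than three times in a row) can be illustrated with a simple rejection-sampling sketch. This is not the procedure actually used to build the lists, which is not described in detail; the function names, the placeholder item labels, the seed, and the reading of "mirrored" as "reversed order" are all assumptions made for illustration.

```python
import random

def max_run_length(labels):
    """Length of the longest run of identical consecutive labels."""
    longest = run = 0
    previous = object()  # sentinel that equals no label
    for label in labels:
        run = run + 1 if label == previous else 1
        previous = label
        longest = max(longest, run)
    return longest

def pseudorandomize(items, max_repeats=3, seed=0, max_tries=100_000):
    """Reshuffle (condition, word) pairs until no condition occurs
    more than `max_repeats` times in a row (rejection sampling)."""
    rng = random.Random(seed)
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if max_run_length([cond for cond, _ in order]) <= max_repeats:
            return order
    raise RuntimeError("no valid order found")

# Hypothetical placeholder items; the real stimuli were Dutch verbs.
stimuli = (
    [("man", f"man_{i:02d}") for i in range(48)]
    + [("nonman", f"nonman_{i:02d}") for i in range(48)]
    + [("filler", f"filler_{i:02d}") for i in range(16)]
    + [("pseudo", f"pseudo_{i:02d}") for i in range(16)]
)

ld_list = pseudorandomize(stimuli)
# Assumption: "mirrored" is read here as the same list in reversed order.
mirrored_list = ld_list[::-1]
```

Reversing a list leaves all run lengths unchanged, so under this reading the mirrored list satisfies the same constraint as the original.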

Figure 1. 

Example of a trial in the LD task (A) and in the imagery (IM) task (B). (A) In the LD run, words were presented for 1500 msec, followed by a variable intertrial interval (ITI; between 2 and 6 sec, mean = 4 sec). On 25% of the trials, an LD response screen was shown and participants had to indicate whether the immediately preceding word was an existing word or not by pressing the left or the right button. Response side was unpredictably balanced between left and right so that no response could be prepared. A fixation cross indicated start of a new trial. (B) In the imagery run, man and nonman words were presented for 1500 msec. After reading the word, participants closed their eyes and imagined performing the action and opened their eyes to indicate that they were ready. Opening and closing of the eyes was monitored with an infrared eye-tracker. After a variable ITI (2–6 sec, mean = 4 sec), a fixation cross indicated start of a new trial. All materials were in Dutch. In Dutch, the infinitive form is indicated by a nonseparable suffix (“-en”), which means that only one word was presented per trial (and not two as in the English example).

In the IM run, the same words (except for filler and nonwords, which means that there were 96 trials) were presented, and participants were instructed to read the word, close their eyes, imagine performing the action, and open their eyes to indicate that they had finished motor imagery. Closing and opening of the eyes was monitored by an infrared IviewX eyetracker (www.smi.de) with custom-built shielding and coded on-line by one of the experimenters. We used opening and closing of the eyes to be able to measure imagining time on each trial while at the same time avoiding hand action interference from button presses. Performing motor imagery with eyes closed probably entails similar processes as motor imagery with eyes open (Heremans, Helsen, & Feys, 2008) and has been successfully used before in neuroimaging studies (Bakker et al., 2008; Szameitat, Shen, & Sterr, 2007a, 2007b). A stimulus list of 96 stimuli (man and nonman words) was created, pseudorandomized with the constraint that the same condition was not repeated more than three times in a row. A mirrored version of this list was presented to half of the participants. Participants were familiarized with the procedure by means of 10 practice items containing different words than used in the remainder of the experiment.

Stimuli were presented for 1500 msec, and stimulus onset was effectively jittered with respect to onset of volume acquisition by varying the intertrial interval between 2 and 6 sec (mean = 4 sec) in steps of 250 msec (Dale, 1999) in both runs. A fixation cross (250 msec) indicated the start of a new trial. The LD run always preceded the IM run to prevent a bias for participants to engage in motor imagery during the LD run.
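The jittering scheme in the preceding paragraph can be sketched as follows. The grid of intertrial intervals (2 to 6 sec in 250-msec steps) and the 250-msec fixation, 1500-msec stimulus, and 2060-msec repetition time are taken from the text. The uniform sampling, the seed, and the simplified trial structure (response screens are ignored) are assumptions, because the text reports only the range, step size, and mean.

```python
import random

TR_MS = 2060        # repetition time reported for the EPI acquisition
FIXATION_MS = 250   # fixation cross at trial start
STIMULUS_MS = 1500  # word presentation
ITI_GRID_MS = list(range(2000, 6001, 250))  # 2-6 s in 250-ms steps

def draw_itis(n_trials, seed=0):
    """Assumption: intervals are drawn uniformly from the grid."""
    rng = random.Random(seed)
    return [rng.choice(ITI_GRID_MS) for _ in range(n_trials)]

def stimulus_onsets(itis):
    """Stimulus onset times (ms) for a simplified trial sequence:
    fixation, then stimulus, then intertrial interval."""
    onsets, t = [], 0
    for iti in itis:
        t += FIXATION_MS
        onsets.append(t)
        t += STIMULUS_MS + iti
    return onsets

itis = draw_itis(96)
onsets = stimulus_onsets(itis)
# Onset position within the volume acquisition; many distinct values
# indicate that stimuli are sampled at varying points of the TR.
phases = sorted({onset % TR_MS for onset in onsets})
grid_mean_ms = sum(ITI_GRID_MS) / len(ITI_GRID_MS)
```

The grid is symmetric around 4000 msec, so uniform sampling gives the reported mean of 4 sec in expectation, though not necessarily exactly in any single list.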

Finally, at the end of the session, participants engaged in an action execution localizer in which they performed simple hand movements (opening and closing of the hand) with either the left or the right hand. The localizer was a blocked design, and participants were cued to perform actions with the left or the right hand by means of the words “left” or “right” presented on the screen. Each block lasted 15 sec and there were eight blocks per condition. These action execution blocks were intermingled with five rest blocks of the same duration in which participants did not execute hand actions. Compliance with the task was checked visually from outside the scanner room.

Data Acquisition and Analysis

EPIs covering the whole brain were acquired with an eight-channel head coil on a Siemens MR system with 3-T magnetic field strength (repetition time = 2060 msec; echo time = 30 msec; flip angle = 85°, 31 transversal slices; voxel size = 3.5 × 3.5 × 3 mm, 0.5-mm gap between slices). Data analysis was done using SPM5 (http://www.fil.ion.ucl.ac.uk/spm/software/spm5/). Preprocessing involved realignment through rigid body registration to correct for head motion, slice timing correction to the onset of the first slice, normalization to Montreal Neurological Institute (MNI) space, interpolation of voxel sizes to 2 × 2 × 2 mm, and spatial smoothing (8-mm FWHM kernel). First-level analysis involved a multiple regression analysis with regressors describing the expected hemodynamic responses during observation of man words and nonman words as well as filler words and pseudowords (fillers and pseudowords in the LD run only). Responses (button presses) were modeled separately as stick functions. Stimuli in the LD run were modeled with 1500-msec duration, and in the IM run, the actual imagining times were used. MR disturbances due to small head movements were accounted for by a series of nuisance regressors, namely, linear and exponential changes in the scan-by-scan estimated head motion, scan-by-scan average signals from outside the brain, white matter, and cerebrospinal fluid (Verhagen, Grol, Dijkerman, & Toni, 2006). Stimuli in the action execution localizer were modeled as blocks of 15 sec. The same nuisance regressors as described above were included.

A second-level whole-brain group analysis with subjects as a random factor (“random effects analysis”) was carried out. First, we tested which regions were activated by man as well as nonman words during each task in isolation. This was done by means of conjunction analyses testing the conjunction null as defined by Nichols, Brett, Andersson, Wager, and Poline (2005), testing for LDman > 0 ∩ LDnonman > 0 and for IMman > 0 ∩ IMnonman > 0. Second, we looked for regions which were more strongly activated to the man > nonman comparison in either task (i.e., LDman > nonman/IMman > nonman). This is a much more specific analysis, which asks whether there are areas during LD or IM that are sensitive to the effector with which an action is typically associated. Finally, the crucial analysis involved looking for regions sensitive to man > nonman comparison in both tasks by doing a conjunction analysis (LDman > nonman ∩ IMman > nonman), again testing the conjunction null hypothesis (Nichols et al., 2005). Correction for multiple comparisons was applied by thresholding group maps at p < .005 uncorrected and subsequently taking the cluster extent into account by using the theory of Gaussian random fields (Friston, Holmes, Poline, Price, & Frith, 1996) to correct maps at p < .05 corrected for multiple comparisons (Poline, Worsley, Evans, & Friston, 1997). Subsequently, in regions activated in the whole-brain analysis to man > nonman in the one task, it was tested whether a comparable effect was present in the other task. We took the mean parameter estimates from areas activated to LDman > nonman in the whole-brain analysis and tested whether there was an IMman > nonman effect in these areas and vice versa.
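The logic of testing the conjunction null can be illustrated with a minimum-statistic sketch: a voxel counts as jointly activated only if the smaller of its two t values exceeds the threshold that would be applied to a single map. The t values and the threshold below are invented for illustration, and the sketch leaves out the cluster-extent correction described above.

```python
def conjunction_min_stat(t_map_a, t_map_b, threshold):
    """Minimum-statistic conjunction: True where BOTH t values
    exceed the single-map threshold."""
    min_t = [min(a, b) for a, b in zip(t_map_a, t_map_b)]
    return [t > threshold for t in min_t]

# Hypothetical t values for four voxels (not data from this study).
t_ld_man = [4.1, 3.5, 0.2, 5.0]
t_ld_nonman = [3.9, 1.0, 4.4, 2.9]
T_THRESHOLD = 2.86  # illustrative voxel-level threshold

conjunction = conjunction_min_stat(t_ld_man, t_ld_nonman, T_THRESHOLD)
```

In this toy example only the first and last voxels survive, because the second and third exceed the threshold in one map only.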

For the main analysis, we created subject-specific ROIs in which we selected voxels from cytoarchitectonically defined left Brodmann's area (BA) 6 (i.e., premotor cortex; Eickhoff et al., 2005) and left BA 4 (primary motor cortex, combining maps 4a and 4p; Geyer et al., 1996). We chose BA 6 and BA 4 because these have been implicated in action verb reading as well as in motor imagery (for reviews, see Munzert, Lorey, & Zentgraf, 2009; Willems & Hagoort, 2007). For each participant, voxels that were sensitive to the man > nonman contrast (p < .05 uncorrected) separately for the IM run and for the LD run were selected using the Marsbar toolbox (Brett, Anton, Valabregue, & Poline, 2002). Subsequently, we tested whether a man > nonman effect was also present in the data from the other run. The rationale for this analysis was that we selected for every subject the voxels that were most sensitive to the man > nonman contrast in one task and subsequently tested whether there was a similar effect in the other task. This is to be expected if LD and IM lead to overlapping neural correlates.

In another subject-specific ROI analysis, we tested for a man > nonman effect in left BA 6/BA 4 for the LD and IM run separately. For this analysis, we used a split-half approach, splitting the data in odd- and even-numbered trials.1 First we created subject-specific 4-mm spherical ROIs around the maximally activated voxel in left BA 6/BA 4 in response to man words (thresholded at p < .001). This ROI creation was based on half of the data (odd trials). Second we extracted contrast values for the man > nonman contrast from these ROIs, using the other half of the data (even trials).2 Man > nonman contrast values were extracted for each participant, and group statistics were performed by means of one-sample t test on these contrast values. With this analysis, it was ensured that ROI creation involved different data than the data in which we subsequently tested for a man > nonman effect. This procedure was repeated for the LD and IM runs separately. The rationale for this analysis was to test whether BA 6 and BA 4 were sensitive to the man > nonman comparison in each run in isolation. We have employed this subject-specific ROIs procedure before (Willems, Hagoort, & Casasanto, 2010) and found it to be more sensitive as compared with standard whole-brain analysis (see also Aziz-Zadeh et al., 2006).
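The split-half logic can be sketched for a single participant as follows. The sketch defines the ROI from one half of the data and reads out the contrast from the other half, so that selection and testing use independent data. The dictionary representation of voxel maps, the function names, and the toy values are assumptions. The sketch also treats 4 mm as the sphere radius and omits the p < .001 threshold and the anatomical mask.

```python
import math
import statistics

VOXEL_MM = 2.0   # voxel size after interpolation
RADIUS_MM = 4.0  # assumption: "4-mm sphere" is read as a 4-mm radius

def sphere_roi(peak, coords):
    """Voxels whose centre lies within RADIUS_MM of the peak voxel."""
    peak_mm = [VOXEL_MM * v for v in peak]
    return [
        c for c in coords
        if math.dist([VOXEL_MM * v for v in c], peak_mm) <= RADIUS_MM
    ]

def split_half_contrast(odd_man, even_contrast):
    """odd_man and even_contrast map voxel index (i, j, k) to a value.
    The ROI is defined on odd trials; the effect is read from even trials."""
    peak = max(odd_man, key=odd_man.get)   # maximally activated voxel
    roi = sphere_roi(peak, odd_man.keys())
    return statistics.mean(even_contrast[v] for v in roi)

# Toy 5 x 5 x 5 grid (not real data): the response to man words in the
# odd trials peaks at the centre voxel; the even-trial contrast is flat.
coords = [(i, j, k) for i in range(5) for j in range(5) for k in range(5)]
odd_man = {c: -math.dist(c, (2, 2, 2)) for c in coords}
even_contrast = {c: 1.0 for c in coords}

roi_value = split_half_contrast(odd_man, even_contrast)
```

One such value per participant would then enter the group-level one-sample t test.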

Finally, we performed multivoxel pattern analysis (Downing, Chan, Peelen, Dodds, & Kanwisher, 2006; Peelen, Wiggett, & Downing, 2006) on the voxels from left BA 6 and from left BA 4 separately. In multivoxel pattern analysis, the pattern of responses across voxels in a given area is taken into account instead of statistically thresholding voxels. The rationale of this analysis is that if two conditions lead to a similar spatial pattern of responses in a given region, the activations across voxels in that region should be correlated between the two conditions. Imagine all voxels from left BA 6 as a vector in which each value represents one voxel's contrast value on the man > nonman contrast. We constructed two such vectors, one for LDman > nonman and one for IMman > nonman. Subsequently, the correlation coefficient between these two vectors was computed for each participant separately. The correlation coefficients were converted to Fisher's z to comply with the normality assumption (Kleinbaum, Kupper, Muller, & Nizam, 1998) and tested for a difference from mean zero in a one-sample t test (for a comparable approach, see Downing et al., 2006; Peelen et al., 2006). If man > nonman during LD and during IM lead to similar response patterns in BA 6/BA 4, we expect to find high correlations in this analysis.
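The pattern-correlation steps described above (correlate two voxel vectors per participant, convert to Fisher's z, test against zero) can be sketched with the standard library alone. The simulated vectors are random numbers standing in for contrast values, and the vector length, seed, and function names are assumptions. No result of the study is reproduced here.

```python
import math
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation between two equally long vectors."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher's r-to-z transform, z = atanh(r)."""
    return math.atanh(r)

def one_sample_t(values):
    """t statistic for H0: mean = 0, with len(values) - 1 df."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))

# Simulated data: 20 participants, 50 voxels each, with independent
# patterns for the two tasks (stand-ins for LD and IM contrast values).
rng = random.Random(0)
z_values = []
for _ in range(20):
    ld_pattern = [rng.gauss(0, 1) for _ in range(50)]
    im_pattern = [rng.gauss(0, 1) for _ in range(50)]
    z_values.append(fisher_z(pearson_r(ld_pattern, im_pattern)))

t_group = one_sample_t(z_values)  # compared against a t distribution, 19 df
```

Because the simulated patterns are independent, the z values scatter around zero, which is the outcome expected when the two tasks do not share a spatial response pattern.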

Multivoxel pattern analysis also allowed us to investigate the internal consistency of patterns of activation within one run. Using a split-half approach, we investigated whether the pattern of response in BA 6/BA 4 for man > nonman was the same for each half of the data in each task setting. Thus, we correlated patterns of voxels in BA 6/BA 4 LDman > nonman from odd-numbered trials with the activation pattern during LDman > nonman from even-numbered trials. This was similarly done for the IM data. If the pattern of responses in BA 6/BA 4 is robust and stable, we expect high correlations between the man > nonman contrast values from the one half of the data as compared with the other half of the data. We chose not to perform pattern correlation analysis on unsmoothed data because the spatial normalization procedure is inherently imperfect across sessions. The danger with unsmoothed data is that correlations are artificially lower in across-session comparisons as compared with within-session comparisons due to imperfect normalization. Spatial smoothing essentially eliminates that problem, and we therefore conducted the correlation analysis on spatially smoothed data.

We never compared a single condition directly between the two tasks (e.g., LDman > IMman), given the interpretational problems arising from a direct comparison between tasks with different trial durations and occurring in different scanning runs (McGonigle et al., 2000). Furthermore, given the heterogeneity of effectors involved in the nonman trials, we did not contrast nonman > man.

The results from the action execution localizer task were used to investigate whether ROIs defined in the LD or IM data were also activated during actual hand action execution. To this end, we extracted contrast values for the right hand > rest comparison from the ROIs described above and statistically tested them in a one-sample, one-sided t test against mean zero.

RESULTS

Behavioral

Lexical Decision

Participants answered correctly on the large majority of LD catch trials (mean = 95.8%, SD = 3.5%, range = 87.5–100%). There were few incorrect responses (mean = 3.9%, SD = 3.64%, range = 0–9.4%) and misses (mean = 0.3%, SD = 0.99%, range = 0–3.1%).

Imagery

Participants on average took 5.6 sec (SD = 1.79) to imagine doing the man actions compared with 5.5 sec (SD = 1.67) to imagine the nonman actions. This difference was not statistically significant, t(19) < 1.

Neural

For technical reasons, in one subject no action execution localizer was measured. Moreover, because of excessive head motion, the action execution data from one other participant were not analyzed. This means that 18 data sets entered the analysis for the action execution localizer. The analysis of LD and IM data involved data from all 20 participants. Head movement never exceeded 2 mm or 2° in any rotation or translation in any of the runs that were included in the analysis.

Whole-brain Analysis

We performed exploratory whole-brain analysis, testing for task-specific activations as well as for overlap in response patterns between the LD and the IM tasks. First, activation during reading of man and nonman words was compared with baseline in the LD run (LDman > 0 ∩ LDnonman > 0). This comparison led to widespread overlapping activations in bilateral precentral sulci and inferior frontal gyri, bilateral superior and inferior parietal sulci, bilateral superior and middle temporal sulci, bilateral inferior occipital and calcarine sulci, and left anterior cingulate sulcus and left hippocampus (Figure 2A; Table 1). A similar activation pattern was observed for this analysis in the IM data (IMman > 0 ∩ IMnonman > 0), encompassing bilateral inferior frontal gyri, bilateral precentral sulci, bilateral central sulci, bilateral anterior cingulate sulci, bilateral calcarine/inferior occipital sulci, bilateral middle and superior temporal gyri, and bilateral cerebellum (Figure 2B; Table 1).

Figure 2. 

Overlapping activation to man and nonman words during LD (A) and motor imagery (B). Displayed are the conjunction analyses (Nichols et al., 2005) LDman > 0 ∩ LDnonman > 0 (in red, A) and IMman > 0 ∩ IMnonman > 0 (in green, B). Reading of all word types led to strong bilateral occipital cortex activation as well as bilateral (but more left-lateralized) primary and premotor cortex activation. Moreover, for all conditions inferior frontal cortex was activated bilaterally.

Table 1. 

Overlapping Regions Activated to Presentation of Man as Well as to Nonman Words in the LD Run and in the IM Run

| Comparison | Region | x y z | Tmax | nr Voxels |
|---|---|---|---|---|
| LDman > 0 ∩ LDnonman > 0 | L precentral sulcus/inferior frontal gyrus | −52 −6 48 | 7.11 | 3708 |
| | | −36 −2 60 | 5.64 | |
| | L insula | −30 24 | 6.04 | |
| | R precentral sulcus/inferior frontal gyrus | 44 56 | 5.19 | 977 |
| | | 36 48 | 4.79 | |
| | | 46 32 | 4.73 | |
| | L superior/inferior parietal sulcus | −28 −68 30 | 6.59 | 1989 |
| | | −34 −56 52 | 6.11 | |
| | | −42 −32 44 | 5.39 | |
| | R superior/inferior parietal sulcus | 30 −60 44 | 6.45 | 1114 |
| | | 30 −64 32 | 6.29 | |
| | R superior temporal gyrus | 46 −32 | 6.02 | 717 |
| | L inferior occipital/fusiform gyrus/middle temporal sulcus | −22 −96 −8 | 16.93 | 5153 |
| | | −40 −72 −14 | 16.59 | |
| | | −36 −88 −12 | 14.64 | |
| | L calcarine sulcus | −18 −72 10 | 4.42 | |
| | R inferior occipital/fusiform gyrus | 30 −90 −12 | 19.28 | 4170 |
| | | 44 −72 −14 | 12.06 | |
| | | 46 −58 −16 | 10.76 | |
| | R calcarine sulcus | 20 −68 12 | 3.81 | |
| | L anterior cingulate sulcus | −6 60 | 7.48 | 2713 |
| | | −4 10 54 | 7.45 | |
| | | −8 28 34 | 4.57 | |
| | L hippocampus | −18 −28 −2 | 7.35 | 446 |
| | | −28 −12 −4 | 3.03 | |
| IMman > 0 ∩ IMnonman > 0 | L inferior frontal gyrus/precentral sulcus | −50 16 −4 | 8.81 | 23,650 |
| | | −44 −8 56 | 7.28 | |
| | R inferior frontal gyrus/precentral sulcus | 50 16 −2 | 7.33 | |
| | | 54 −2 48 | 6.85 | |
| | L central sulcus | −44 −36 58 | 4.78 | |
| | R central sulcus | 46 −30 56 | 2.90 | |
| | R anterior cingulate sulcus | 66 | 10.22 | |
| | L anterior cingulate sulcus | −4 −4 70 | 11.87 | |
| | R calcarine sulcus | 24 −58 | 9.22 | 3256 |
| | | 20 −66 12 | 8.38 | |
| | R middle temporal gyrus | 54 −56 | 4.74 | |
| | L calcarine sulcus | −22 −62 | 9.10 | 1881 |
| | | −10 −72 18 | 6.01 | |
| | L middle temporal gyrus | −52 −62 | 4.22 | |
| | Right cerebellum | 26 −62 −22 | 5.59 | |
| | Left cerebellum | −22 −60 −20 | 3.1 | |

Reported are a description of the activated region, the coordinates of the local maxima in MNI space, the t value of the maximally activated voxel in a cluster, and the number of 2 × 2 × 2 mm voxels of the activated cluster. Results are corrected for multiple comparisons at p < .05. Only a limited number of peak voxels per cluster are reported (e.g., in occipital cortex; see Figure 2).

Second, comparing the man > nonman conditions in the LD task (LDman > nonman) led to increased activation in left superior frontal sulcus (Figure 3; Table 2). The same comparison in the IM task (IMman > nonman) revealed increased activation levels in left dorsal precentral sulcus stretching into middle frontal sulcus, left central and postcentral sulcus, and left inferior temporal sulcus (Figure 3; Table 2). To confirm the specificity of the response in each of these areas, we computed the man > nonman contrast for the other task in the areas activated in the whole-brain analysis to man > nonman either in the LD task or in the IM task. Put differently, in the areas showing IMman > nonman in the whole-brain analysis, we tested whether an LDman > nonman effect was similarly present. Similarly, in the one area showing an LDman > nonman effect in the whole-brain analysis, we tested whether a man > nonman effect was also present during the IM task. The results confirm that these areas are not sensitive to the man > nonman contrast from the other task (see parameter estimates in Figure 3). That is, if an area was sensitive to man > nonman in the IM task, it did not show a man > nonman effect in the LD task (Figure 3). Of the areas showing a man > nonman effect in the whole-brain analysis (Figure 3), all but the left superior frontal sulcus activation cluster were significantly activated during action execution: LD area superior frontal sulcus, t(17) = −1.89, p = .074; IM area dorsal precentral sulcus, t(17) = 2.39, p = .028; IM area left central/postcentral sulcus, t(17) = 6.85, p < .001; IM area left inferior temporal sulcus, t(17) = 2.86, p = .010.

Figure 3. 

Results of whole-brain analysis. Results are displayed on a rendered image. Displayed are the LDman > nonman (in yellow) and the IMman > nonman (in blue) contrasts. As can be seen in the figure, the two contrast maps did not overlap. This was confirmed by a conjunction analysis as well as by informal inspection of both contrast maps at p < .01 uncorrected. The bar graphs show mean responses (beta weights expressed as percent signal change) for the LDman > nonman (white bars) and the IMman > nonman (black bars) contrasts in each of the areas activated in the whole-brain analysis. Note that, to avoid circularity, we only tested man > nonman in the task that was not used to define the activation cluster. That is, if an area shows an IMman > nonman effect in the whole-brain analysis, we only tested whether there was a similar man > nonman effect in the LD task and vice versa. We do display the parameter estimates from both task runs for the sake of clarity and ease of reading. Error bars represent SEM. ns = not significant at the p < .05 level.

Table 2. Results from Whole-brain Analysis Showing Areas More Strongly Activated to Manual (man) as Compared with Nonmanual (nonman) Action Verbs, in the LD Task (LD) or in the IM Task (IM)


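| Contrast | Region | x | y | z | Tmax | nr Voxels |
|---|---|---|---|---|---|---|
| LDman > nonman | Left superior frontal sulcus | −18 | 16 | 50 | 4.12 | 624 |
| | | −8 | 40 | 50 | | |
| | | −14 | 34 | 54 | | |
| IMman > nonman | Left dorsal precentral sulcus | −26 | −8 | 54 | 4.77 | 922 |
| | | −26 | −8 | 68 | | |
| | Left central sulcus/postcentral sulcus | −52 | −26 | 38 | 4.37 | 2169 |
| | | −16 | −72 | 54 | | |
| | | −32 | −32 | 40 | | |
| | Left inferior/middle temporal sulcus | −48 | −64 | 2 | 4.96 | 1467 |
| | | −42 | −50 | −16 | | |
| | | −32 | −62 | 18 | | |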

Reported are a description of the activated region, the coordinates of the local maxima in MNI space, the t value of the maximally activated voxel in a cluster, and the number of 2 × 2 × 2 mm voxels in the activated cluster. Up to three local maxima that are more than 8 mm apart are reported per cluster. Results are corrected for multiple comparisons at p < .05.

Finally, there were no clusters showing overlapping responses across the two tasks (LDman > nonman ∩ IMman > nonman), even at a very lenient statistical threshold (p < .01 uncorrected).

Subject-specific Regions of Interest Analysis

Our main analyses compared effector-specific activation during the two tasks (i.e., LDman > nonman and IMman > nonman) in subject-specific ROIs in left BA 6 and left BA 4. Consistent with the whole-brain analysis, these ROI analyses also showed no overlap in effector-specific response patterns across the two tasks. In this analysis, subject-specific ROIs consisted of voxels sensitive to man > nonman in the one task session, and we subsequently tested for a man > nonman effect in the other task session. Voxels were thresholded at p < .05 uncorrected to increase chances of finding overlap between LDman > nonman and IMman > nonman. If IM and LD lead to overlapping neural correlates, we should observe IMman > nonman effects in ROIs on the basis of LDman > nonman and vice versa. Note that creating ROIs on the basis of IMman > nonman and subsequently testing for this same effect (IMman > nonman) is a biased measure leading to significant but uninformative results due to “overfitting” (Kriegeskorte, Simmons, Bellgowan, & Baker, 2009). We hence do not report the results of such comparisons.

The left BA 6 ROIs taken from the IM task (subject-specific voxels sensitive to IMman > nonman at p < .05 uncorrected) were not sensitive to the LDman > nonman contrast, t(19) = −1.83, p = .082 (note the negative t value). Conversely, taking the BA 6 ROIs from the voxels activated in the LDman > nonman contrast revealed no IMman > nonman effect (t < 1). A similar pattern of responses was observed in subject-specific ROIs in left BA 4. The ROIs taken from the IMman > nonman contrast showed no LDman > nonman effect, t(19) = −1.62, p = .12 (note the negative t value). The ROIs taken from the LDman > nonman contrast revealed no IMman > nonman effect, t(19) < 1.
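The logic of this cross-task test can be made concrete with a small sketch. The Python code below is an illustration only, not the analysis code used in the study: the function name, the list-based data layout, and the numeric t threshold standing in for p < .05 uncorrected are all assumptions.

```python
def cross_task_roi_effect(define_t, test_contrast, t_threshold):
    """Mean contrast in the test task within an ROI defined by the other task.

    define_t      : per-voxel t values for man > nonman in the defining task
    test_contrast : per-voxel man > nonman contrast values in the OTHER task
    t_threshold   : hypothetical cutoff standing in for p < .05 uncorrected
    Returns None when no voxel passes the threshold (no ROI for this subject).
    """
    # The ROI is selected from one task only, so the test on the other
    # task is independent of the selection step (no circularity).
    roi = [i for i, t in enumerate(define_t) if t > t_threshold]
    if not roi:
        return None
    return sum(test_contrast[i] for i in roi) / len(roi)


# Toy data for one subject and five voxels (made-up numbers).
im_t = [2.4, 0.3, 1.9, -0.5, 1.1]
ld_contrast = [-0.2, 0.8, 0.0, 0.4, 0.6]
effect = cross_task_roi_effect(im_t, ld_contrast, t_threshold=1.7)
```

In such a scheme, the per-subject values would then be entered into a group-level one-sample t test against zero; subjects without any suprathreshold voxel contribute no value.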

To gain better insight into the differential localization of parts of left BA 6 and left BA 4 sensitive to LDman > nonman and IMman > nonman, we extracted the coordinates of the maximally activated voxel from each subject-specific ROI (Figure 4). The following are the mean coordinates: for the BA 6 LD ROIs, MNI [−22 −5 56], SD = [19 16 13]; for the BA 6 IM ROIs, [−34 −5 52], SD = [17 10 16]; for the BA 4 LD ROIs, [−20 −29 58], SD = [17 13 14]; and for the BA 4 IM ROIs, [−30 −23 50], SD = [18 17 16]. In both BA 6 and BA 4, the LD maxima tended to be located more medially than the IM maxima, although there was considerable variability in the locations of maxima across subjects (see also Fernandino & Iacoboni, in press; Kemmerer & Gonzalez-Castillo, 2008; Aziz-Zadeh et al., 2006). We also computed the percentage overlap between LDman > nonman and IMman > nonman ROIs. We took the voxels for each subject at a threshold of p < .05 uncorrected for LDman > nonman and for IMman > nonman and computed the percentage of voxels represented in both ROIs. The results show that overlap was nearly absent: For the left BA 6, the mean percentage of voxels overlapping in a given participant was 1.26% (SD = 3.3%, median = 0%, range = 0–13.4%). For the left BA 4, the mean percentage of overlapping voxels was 1.13% (SD = 3.8%, median = 0%, range = 0–16.5%).
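As a hedged illustration of the overlap measure, the sketch below computes the percentage of voxels shared by two thresholded voxel sets for one participant. The text does not state the denominator, so the use of the union of both ROIs is an assumption, as are the function name and the representation of voxels as index tuples.

```python
def percent_overlap(ld_voxels, im_voxels):
    """Percentage of voxels present in both ROIs.

    Each argument is an iterable of (i, j, k) voxel indices that passed
    the threshold in one task. ASSUMPTION: the percentage is taken
    relative to the union of the two ROIs.
    """
    ld, im = set(ld_voxels), set(im_voxels)
    union = ld | im
    if not union:
        return 0.0  # neither task produced suprathreshold voxels
    return 100.0 * len(ld & im) / len(union)


# Toy example: two small ROIs sharing a single voxel.
ld_roi = [(10, 20, 30), (11, 20, 30)]
im_roi = [(11, 20, 30), (12, 20, 30)]
overlap = percent_overlap(ld_roi, im_roi)
```

With a different denominator (e.g., the smaller of the two ROIs), the numbers would change but a value of 0% would still indicate completely disjoint voxel sets.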

Figure 4. Local maxima for subject-specific ROIs in left BA 4 (upper panel) and in left BA 6 (lower panel). Displayed is the maximally activated voxel for each participant for the LDman > nonman comparison (white circles) and for the IMman > nonman comparison (filled circles). The two local maxima of each participant are connected with a line. Participants who did not have a local maximum for the LDman > nonman or for the IMman > nonman comparison are represented as isolated (nonconnected) dots. This was the case for two participants (IMman > nonman) and four participants (LDman > nonman) in BA 4 and for two participants (LDman > nonman) and one participant (IMman > nonman) in BA 6. The mean coordinates are indicated by the bigger circles. The LD maximally activated voxels were more medial than the IM maximally activated voxels, but note the large spread around the mean coordinates. Axes represent x-coordinates (x-axis) and z-coordinates (y-axis) in MNI space. Mean coordinates in BA 6: LD [−22 −5 56], SD [19 16 13]; IM [−34 −5 52], SD [17 10 16]. Mean coordinates in BA 4: LD [−20 −29 58], SD [17 13 14]; IM [−30 −23 50], SD [18 17 16].

We also tested whether there were man > nonman effects within each task in left BA 6/4. We did so by means of subject-specific ROIs defined as spherical 4-mm ROIs around the voxel maximally activated by man words (thresholded at p < .001). Recall that ROI construction was based on one half of the data, and subsequent testing was done on the other half of the data (see Methods section). This analysis was included to determine whether there were man > nonman effects in BA 6/4 in the two runs separately. That is, here we do not look at overlapping neural correlates, but ask whether there is an LDman > nonman or IMman > nonman effect in BA 6/4 at all. We have observed before that testing this in subject-specific ROIs is much more sensitive than in whole-brain analysis, given the relatively large spread of activations across participants (Willems et al., 2010; see also Aziz-Zadeh et al., 2006). The results show that there is an LDman > nonman effect in the subject-specific ROIs on the basis of one half of the data from the LD run in BA 6, t(19) = 2.80, p = .011. There was no IMman > nonman effect in these ROIs, t(19) < 1. Similarly, there was an IMman > nonman effect in BA 6 in the ROIs on the basis of one half of the data from the IM run, t(19) = 2.38, p = .028, but there was no LDman > nonman effect in these ROIs, t(19) = 1.023, p = .319. In BA 4, there was an IMman > nonman effect in the subject-specific IM ROIs, t(19) = 2.97, p = .008, but there was no LDman > nonman effect in these ROIs (t < 1). There was no LDman > nonman effect in the BA 4 ROIs from the LD run, t(19) < 1, and a marginally significant negative effect for IMman > nonman, t(19) = −1.89, p = .073. This is in line with previous studies showing premotor but not primary motor activation during action language understanding (Tomasino, Werner, Weiss, & Fink, 2007; Tettamanti et al., 2005).
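The two ingredients of this analysis, an odd/even split of the trials and a spherical ROI around a peak voxel, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually implemented: we assume that "4-mm" refers to the sphere's radius, that voxels are 2 × 2 × 2 mm (as in Table 2), and that a voxel belongs to the sphere when its centre lies within the radius. All function names are hypothetical.

```python
def split_odd_even(trials):
    """Split an ordered trial list into odd- and even-numbered trials.

    Trials are numbered from 1, so the first trial counts as odd.
    """
    return trials[0::2], trials[1::2]


def sphere_voxels(peak, radius_mm=4.0, voxel_mm=2.0):
    """Voxel indices whose centres lie within radius_mm of the peak voxel.

    ASSUMPTIONS: radius_mm is a radius (not a diameter) and voxels are
    isotropic with edge length voxel_mm.
    """
    reach = int(radius_mm // voxel_mm)  # max offset in voxels along one axis
    px, py, pz = peak
    voxels = []
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            for dz in range(-reach, reach + 1):
                # Compare squared distances to avoid a square root.
                dist_sq_mm = (voxel_mm ** 2) * (dx * dx + dy * dy + dz * dz)
                if dist_sq_mm <= radius_mm ** 2:
                    voxels.append((px + dx, py + dy, pz + dz))
    return voxels


# ROI defined on one half of the data, tested on the other half.
define_half, test_half = split_odd_even(list(range(1, 11)))
roi = sphere_voxels(peak=(30, 40, 35))
```

Defining the peak on `define_half` and evaluating the contrast within `roi` on `test_half` keeps selection and testing independent.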

All the subject-specific ROIs described above were also activated above baseline during action execution: LD ROIs BA 4, t(17) = 2.04, p = .026; LD ROIs BA 6, t(17) = 3.03, p = .004; IM ROIs BA 4, t(17) = 1.75, p = .048; IM ROIs BA 6, t(17) = 1.83, p = .042.

Multivoxel Pattern Analysis

To test for overlap in the pattern of response in BA 6/BA 4 using a technique that is not susceptible to statistical thresholding effects, we performed multivoxel pattern analysis. In this analysis, the man > nonman contrast values for each voxel from a given area (left BA 6/left BA 4) during LD and during IM were taken and a correlation coefficient was computed between them. That is, we took two vectors representing contrast values from voxels from BA 6 or BA 4 and correlated values during LDman > nonman with the values during IMman > nonman. This was done for each participant separately, yielding 20 correlation coefficients, which were converted to Fisher's z and tested in a one-sample t test against mean zero. There was no correlation between patterns of activation to the man > nonman contrast in the LD and the IM tasks, neither in left BA 6, t(19) < 1, nor in left BA 4, t(19) < 1. This type of analysis also allowed us to do an additional check on the stability of man > nonman differences within each session separately. We correlated the man > nonman contrast values from one half of the LD session with those of the other half of the LD session. The same was done for the IM data (IMman > nonman from the odd-numbered trials was correlated with IMman > nonman from the even-numbered trials). These correlations were significant in BA 6 (LD, t(19) = 2.61, p = .017; IM, t(19) = 3.15, p = .005) as well as for IM in BA 4, t(19) = 2.73, p = .01, and marginally so for LD in BA 4, t(19) = 1.96, p = .065. The latter may come as a surprise given that in the subject-specific ROI analysis we did not observe a man > nonman effect in the LD run. However, we want to stress that these within-session correlations might be inflated because of the high temporal correlation in the data, and we therefore refrain from drawing strong conclusions based on them.
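The steps of the pattern analysis (per-participant correlation, Fisher's z transform, one-sample t test against zero) can be sketched in a few lines. The code is a minimal illustration and not the analysis code used here; the function names and the list-of-lists data layout are assumptions, and the p value (obtained from a t distribution with n − 1 degrees of freedom) is left out to keep the sketch free of external libraries.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def group_pattern_test(ld_maps, im_maps):
    """One-sample t test (against zero) on Fisher-z transformed correlations.

    ld_maps, im_maps : one list of per-voxel man > nonman contrast values
                       per participant, with voxels in the same order.
    Returns (t, df). Correlations of exactly +1 or -1 are not handled,
    because their Fisher z is infinite.
    """
    zs = [math.atanh(pearson_r(ld, im)) for ld, im in zip(ld_maps, im_maps)]
    n = len(zs)
    mean = sum(zs) / n
    var = sum((z - mean) ** 2 for z in zs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1


# Toy data: two participants with opposite cross-task correlations.
ld = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
im = [[1.0, 3.0, 2.0, 4.0], [4.0, 2.0, 3.0, 1.0]]
t_value, df = group_pattern_test(ld, im)
```

Because every voxel in the area enters the correlation, no voxel is excluded by a statistical threshold, which is the property that motivates this analysis.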

In summary, we observed that reading of action verbs as well as explicit imagination led to activation in motor areas compared with baseline. However, the parts of both primary motor and premotor cortex that distinguished manual from nonmanual action verbs during one task were not activated during the other, even in subject-specific ROIs that were constructed to maximize potential overlap between tasks. In unbiased subject-specific ROIs, primary motor cortex showed effector-specific activation during IM, but not during LD. Premotor cortex showed effector-specific activation during both tasks, but the areas activated during LD did not overlap with areas activated during IM. No overlap was observed even when we employed MVPA, which is not susceptible to artifacts due to statistical thresholding. The more exploratory whole-brain analysis also showed no overlap between man > nonman in the two tasks. We found left dorsal premotor cortex, left primary motor cortex and left inferior temporal cortex to be sensitive to IMman > nonman (but not to LDman > nonman), whereas an area in left superior frontal sulcus was sensitive to LDman > nonman (but not to IMman > nonman).

DISCUSSION

In this study, we investigated whether understanding action verbs involves the same tissues in cortical motor regions as explicit motor imagery. Left premotor cortex (BA 6) showed effector-specific activation (i.e., stronger responses to manual compared with nonmanual verbs) during both the LD and the IM tasks. Crucially, there was no overlap in the effector-specific response patterns in subject-specific ROIs in premotor cortex across the two tasks. More precisely, portions of BA 6 and BA 4 that were defined on the basis of effector-specific activity during the IM task showed no such activity during LD. Conversely, BA 6/BA 4 ROIs based on effector-specific activity during the LD task showed no effector-specific activity during IM. This lack of overlap cannot be attributed to thresholding effects because multivoxel pattern analysis on unthresholded contrast maps showed that there was no correlation between effector-specific responses across tasks in BA 4/6. Rather, these double dissociations show that implicit motor simulation and explicit motor imagery do not necessarily engage the same neural tissues in premotor and primary motor cortices and by inference may not involve the same cognitive processes.

A double dissociation between action verb understanding and mental imagery of actions was also found in the exploratory whole-brain analysis. There were no regions that showed effector-specific activation in both tasks. That is, there was no overlap between regions activated significantly in the man > nonman contrast during both LD and IM. Rather, a region of left dorsal premotor cortex distinguished between manual and nonmanual verbs during motor imagery but not during LD. Conversely, an area in left superior frontal sulcus distinguished manual and nonmanual verbs during LD but not during motor imagery (Figure 3). It is not clear why this region of superior frontal sulcus should show effector-specific activation during LD (see Note 3). For the present purposes, the findings from the whole-brain analysis underscore the dissociation between the neural substrates of action verb understanding and mental imagery of actions.

Is it possible that the LD task only evoked processing of the verbs at a presemantic level and hence did not activate representations of action verb meaning? We cannot definitely rule out this possibility, but it is unlikely to be an adequate alternative explanation of these data for several reasons. First, previous research indicates that LD leads to processing up to the semantic level, as indexed by modulations of the N400 component (e.g., Relander, Rama, & Kujala, 2009; Chwilla, Brown, & Hagoort, 1995), RT studies (see Neely, 1991), and overlapping neural correlates between more explicit semantic tasks and the LD task (Ruff, Blumstein, Myers, & Hutchison, 2008). Second, the pseudowords were all phonotactically legal and all ended in the suffix indicating the infinitive in Dutch, which necessitates full reading of the verb to be able to perform the task. Finally, it would be hard to explain the effector-specific activations we observed in premotor cortex if the action verbs were not processed beyond a presemantic level.

According to the version of embodied semantics proposed by Gallese and Lakoff (2005), the neural correlates of motor imagery and action semantics should be identical, or at least overlapping (see also Pulvermuller, 2005). Yet, the present data provide no support for this proposal, despite showing that both motor imagery and action verb semantics engage premotor cortex. Some researchers have stated that they use the terms “mental simulation” and “mental imagery” synonymously (e.g., Bergen, Lindsay, Matlock, & Narayanan, 2007, p. 735). But our results urge caution in equating these constructs and suggest that theories of embodied semantics should distinguish implicit mental simulation during language processing from explicit mental imagery.

Possible Relationships between Simulation and Imagery

How does implicit simulation differ from explicit imagery? Here we explore three possibilities. First, simulation could simply be an unconscious version of mental imagery. Whereas language understanding is usually fast and effortless, constructing conscious mental images is comparatively slow and effortful (Kosslyn & Ochsner, 1994; Farah, 1989). Hence, perhaps implicit simulation comprises a subset of the neurocognitive processes involved in explicit imagery (i.e., imagery = simulation + consciousness). In principle, this view could be easily reconciled with Gallese and Lakoff's (2005) proposal. When they wrote that the neural substrates of language understanding and imagination were “the same” (2005, p. 456), presumably they were referring to the motor correlates of these processes. Yet, this possibility is difficult to reconcile with the present data. If simulation were a proper subset of imagery, we would expect to see overlapping activation in motor areas during the LD and IM tasks. In fact, we found that the parts of premotor cortex activated during LD and IM were mutually exclusive.

On a second possibility, perhaps implicit simulation and explicit imagery are at opposite ends of a continuum of richness or detail. In order for mental simulations to occur rapidly enough to support on-line language processing, they must be highly schematic. Details can be filled in if the context encourages elaborating on the initial simulation and if time permits. On this account, motor representations that constitute simulation and imagery differ in amount of detail, but not in kind. Yet, this is also inconsistent with the present data. The neural correlates of two processes that only differ in amount should be partially overlapping, or at least correlated, contrary to our findings. Although simulations and images may indeed differ in richness or detail, this difference cannot account for the double dissociation between their neural substrates. (N.B., The present data should not be interpreted as suggesting that the neural substrates of simulation and imagery can never be overlapping, a point we return to below.)

On a third possibility, perhaps implicit simulation during language understanding and explicit imagery rely on different cerebral structures because they serve different functions at a computational level (Marr, 1982). A core component of implicit simulation during language processing is prediction. Myriad studies using behavioral and neural measures have demonstrated language users' forward-looking orientation. Comprehenders use incoming linguistic and extralinguistic information, rapidly and often unconsciously, to anticipate words, sounds, semantic associates, syntactic structures, discourse referents, and changes in the extralinguistic environment that are likely to be relevant (e.g., DeLong, Urbach, & Kutas, 2005; Van Berkum, Brown, Zwitserlood, Kooijman, & Hagoort, 2005; for a review, see Van Berkum, in press). Presumably, prediction during language comprehension is not motivated solely by the need to comprehend language per se. Rather, language is a tool that helps its users to interact with their physical and social environments. As such, implicit motor simulation during action verb understanding (termed “presonance” by Zwaan & Kaschak, 2008) may serve predictive functions: preparing the language user for likely actions, linguistic or extralinguistic, on a brief time scale that is relevant for using language and planning bodily actions (for a discussion, see Zwaan & Kaschak, 2008; Zwaan, 2004).

By contrast with implicit simulation during language processing, explicit mental imagery is fundamentally reflective. Explicit imagery cued by words necessarily occurs after a word has been at least partially understood; we must know what to imagine before we can start imagining it consciously. The computational functions of imagery that have been proposed emphasize its utility for retrospective tasks (Pinker, 1984), such as recovering information learned implicitly via perception (e.g., you may not know how many windows your house has, but you can recover this information from your perceptual experiences by mentally scanning its exterior) or confirming initial perceptual guesses during motor imagery (e.g., de Lange et al., 2005; Parsons, 1994). Of course mental imagery can be used prospectively, as when an athlete mentally rehearses a sequence of movements before executing them, but even in such examples, the imager prepares for a future event via gradual, effortful mental reenactment of past experiences.

If implicit motor simulation is predictive, then understanding action words should preferentially engage regions involved in motor planning. If conscious motor imagery is reflective (i.e., a covert reenactment of prior actions), then imagining actions should engage not only regions involved in motor planning but also regions involved in motor execution. Consistent with these proposals, we find effector-specific activation during LD in premotor cortex but not primary motor cortex (see also Aziz-Zadeh et al., 2006; Tettamanti et al., 2005; but see Pulvermuller, 2005). By contrast, we find effector-specific activation during imagery in both premotor and primary motor cortices (Tomasino, Fink, Sparing, Dafotakis, & Weiss, 2008; Tomasino et al., 2007; see also Papeo, Vallesi, Isaja, & Rumiati, 2009; see Note 4).

The proposal that simulation and imagery are partially distinct processes with different computational goals predicts a dissociation in the motor system, and it is the only proposal we are aware of that can predict the double dissociation we observe in premotor cortex. However, it does not necessarily entail a double dissociation. Why might simulations and images cued by the same verbs have different premotor representations? Assuming that participants had to understand each word before they could begin to imagine the action it referred to, the words presented in the “imagery” condition may have first cued implicit simulations (partly constitutive of understanding), followed by explicit images. Initial premotor activation in the imagery condition may have corresponded closely to the activation observed for LD. Although this prospective activation is specified at the level of the effector, it is likely to be highly schematic. This schematicity is important for two reasons. First, simulation must be fast enough to support on-line language processing. Second, simulations cued by language must be underspecified enough to flexibly accommodate an incoming message: Very different action plans would be necessary if the word grasp were followed by “…the barbell” as opposed to “…the needle”.

A different level of specification is necessary to create a mental image cued by language. If we vividly imagine the action corresponding to the verb throw, it is necessary to decide whether to imagine throwing a baseball or a Frisbee because these require different grips and different arm motions. As simulation ends and imagery begins, the premotor representation cued by the appearance of the word throw is changed, perhaps due to the specification of action plans within premotor cortex. Such a change should be observable given neuroimaging methods with sufficient spatio-temporal resolution. Given the temporal resolution of fMRI, however, any transient activation corresponding to implicit simulation at the beginning of an “imagery” trial is obscured by activation corresponding to the more sustained process of creating and monitoring an explicit mental image.

As should now be clear, our proposal does not imply that simulation during understanding language precludes explicit imagery. On the contrary, at times language encourages explicit imagery, as when we appreciate a vivid description of scenery or reflect on poetry (e.g., Just, Newman, Keller, McEleney, & Carpenter, 2004). We may engage mental imagery more in these contexts than during mundane language understanding or when reading isolated verbs in an LD task. Again, we emphasize that nothing we propose here implies that the neural correlates of language understanding and explicit motor imagery can never be overlapping. Rather, our data show that they do not necessarily overlap, contrary to the predictions derived from some theories of embodied language understanding.

Constraining Interpretation of Previous Experimental Results

The present study addressed two concerns raised by Postle et al. (2008), which have complicated interpretation of previous experimental results. First, on a skeptical interpretation of the original studies to show effector-specific activation of motor areas during verb processing (e.g., Tettamanti et al., 2005; Hauk et al., 2004), it is possible that observed activation was due to explicit imagery rather than action verb semantics per se (see also Willems & Hagoort, 2007). Although participants in these studies were not instructed to form explicit mental images in response to the stimuli, they were not prevented from forming them (perhaps to pass the time between stimuli in the scanner). By comparing effector-specific activation across tasks (LD vs. IM), we explicitly controlled for spurious activation due to explicit imagery during LD.

Second, Postle et al. (2008) did not find effector-specific activation in premotor cortex during action verb processing, in contrast to earlier studies. They suggested that perhaps earlier positive results were artifacts of differences in imageability between critical and control stimuli. Indeed, some previous studies compared action verbs to abstract language as a high-level control (Tettamanti et al., 2005) or to hash marks as a lower-level control (Hauk et al., 2004). Given that concrete action verbs are arguably more imageable than abstract words and that this is known to affect activations in (among other regions) premotor cortex (e.g., D'Esposito et al., 1997), it is possible that effects in earlier studies were mainly driven by increased imagery to concrete action language as compared with more abstract language. Yet, in the present study, we find effects in the premotor ROI during LD on man > nonman words despite having equated the different verb types for imageability among other standard psycholinguistic variables.

Conclusion

Understanding manual action verbs and forming mental images of the actions they name both produce effector-specific activation in regions of premotor cortex. Yet, parts of premotor cortex involved in these processes were found to be mutually exclusive: activation in the two tasks was neither overlapping nor correlated. These dissociations are inconsistent with the proposal that the neural substrates of implicit mental simulation during language processing and explicit mental imagery are the same and also inconsistent with the possibility that simulation and imagery merely differ in degree of conscious awareness or level of detail. Rather, these data are most consistent with the possibility that simulation and imagery serve different functions at a computational level, simulation being strongly predictive and imagery being largely reflective. Given the observed neural dissociations and the proposed computational-level distinctions, the constructs of mental simulation and mental imagery should be distinguished in theories of embodied semantics.

Acknowledgments

This study was supported by the European Union Joint-Action Science and Technology Project (IST-FP6-003747) as well as by grants from the Netherlands Organisation for Scientific Research (NWO Rubicon 446-08-008) and the Niels Stensen Foundation. The authors thank Martin Laverman, Jacqueline de Nooijer, Daan van Rooij, and Paul Gaalman for assistance, and Rick Helmich and two anonymous reviewers for their helpful comments.

Reprint requests should be sent to Roel Willems, Helen Wills Neuroscience Institute, University of California Berkeley, 132 Barker Hall, Berkeley, CA 94720-3190, or via e-mail: roelwillems@berkeley.edu.

Notes

1. As a control analysis, we also split the data in four bins of 12 trials each, comparing data from Bins 1 and 3 with data from Bins 2 and 4. The results confirm the odd–even split-half analysis, and we do not report the results of the four-bin analysis.

2. The same results were obtained when ROIs were based on the even-numbered trials and testing was done on the odd-numbered trials.

3. This area of superior frontal sulcus has been implicated previously in working memory maintenance (Passingham & Rowe, 2002; Rowe, Toni, Josephs, Frackowiak, & Passingham, 2000). Notably, Hauk et al. (2004) observed activation in middle frontal gyrus to reading of hand action verbs compared with abstract verbs. This activation was more lateral than, but in the same vicinity as, the activation found here in the LDman > nonman comparison. More research is needed to reveal what underlies these activations. It is interesting to note that this was the only region from the whole-brain analysis that was not activated during the action execution localizer.

4. Potentially, this distinction could help to explain conflicting findings of primary motor cortex involvement during motor imagery (see Munzert, Lorey, & Zentgraf, 2009; de Lange, Roelofs, & Toni, 2008; Jeannerod, 2006), an issue which is beyond the scope of the present article.

REFERENCES

Aziz-Zadeh, L., & Damasio, A. (2008). Embodied semantics for actions: Findings from functional brain imaging. Journal of Physiology (Paris), 102, 35–39.

Aziz-Zadeh, L., Wilson, S. M., Rizzolatti, G., & Iacoboni, M. (2006). Congruent embodied representations for visually presented actions and linguistic phrases describing actions. Current Biology, 16, 1818–1823.

Baayen, R. H., Piepenbrock, R., & Rijn, H. v. (1993). The CELEX lexical database. Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania.

Bakker, M., De Lange, F. P., Helmich, R. C., Scheeringa, R., Bloem, B. R., & Toni, I. (2008). Cerebral correlates of motor imagery of normal and precision gait. Neuroimage, 41, 998–1010.

Barsalou, L. W. (1999). Perceptual symbol systems. Behavioural and Brain Sciences, 22, 577–609.

Barsalou, L. W. (2009). Simulation, situated conceptualization, and prediction. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 364, 1281–1289.

Bergen, B. K., Lindsay, S., Matlock, T., & Narayanan, S. (2007). Spatial and linguistic aspects of visual imagery in sentence comprehension. Cognitive Science, 31, 733–764.

Bonda, E., Petrides, M., Frey, S., & Evans, A. (1995). Neural correlates of mental transformations of the body-in-space. Proceedings of the National Academy of Sciences, U.S.A., 92, 11180–11184.

Boulenger, V., Hauk, O., & Pulvermuller, F. (2009). Grasping ideas with the motor system: Semantic somatotopy in idiom comprehension. Cerebral Cortex, 19, 1905–1914.

Brett, M., Anton, J.-L., Valabregue, R., & Poline, J.-B. (2002). Region of interest analysis using an SPM toolbox. Neuroimage, 16, 497.

Chwilla, D. J., Brown, C. M., & Hagoort, P. (1995). The N400 as a function of the level of processing. Psychophysiology, 32, 274–285.

Cisek, P., & Kalaska, J. F. (2004). Neural correlates of mental rehearsal in dorsal premotor cortex. Nature, 431, 993–996.

Dale, A. M. (1999). Optimal experimental design for event-related fMRI. Human Brain Mapping, 8, 109–114.

de Lange, F. P., Hagoort, P., & Toni, I. (2005). Neural topography and content of movement representations. Journal of Cognitive Neuroscience, 17, 97–112.

de Lange, F. P., Helmich, R. C., & Toni, I. (2006). Posture influences motor imagery: An fMRI study. Neuroimage, 33, 609–617.

de Lange, F. P., Roelofs, K., & Toni, I. (2008). Motor imagery: A window into the mechanisms and alterations of the motor system. Cortex, 44, 494–506.

DeLong, K. A., Urbach, T. P., & Kutas, M. (2005). Probabilistic word pre-activation during language comprehension inferred from electrical brain activity. Nature Neuroscience, 8, 1117–1121.

D'Esposito, M., Detre, J. A., Aguirre, G. K., Stallcup, M., Alsop, D. C., Tippet, L. J., et al. (1997). A functional MRI study of mental image generation. Neuropsychologia, 35, 725–730.

Downing, P. E., Chan, A. W., Peelen, M. V., Dodds, C. M., & Kanwisher, N. (2006). Domain specificity in visual cortex. Cerebral Cortex, 16, 1453–1461.

Eickhoff, S. B., Stephan, K. E., Mohlberg, H., Grefkes, C., Fink, G. R., Amunts, K., et al. (2005). A new SPM toolbox for combining probabilistic cytoarchitectonic maps and functional imaging data. Neuroimage, 25, 1325–1335.

Farah
,
M. J.
(
1989
).
The neural basis of mental imagery.
Trends in Neurosciences
,
12
,
395
399
.
Fernandino
,
L.
, &
Iacoboni
,
M.
(
in press
).
Are cortical motor maps based on body parts or coordinated actions? Implications for embodied semantics.
Brain Language
.
Friston
,
K. J.
,
Holmes
,
A.
,
Poline
,
J. B.
,
Price
,
C. J.
, &
Frith
,
C. D.
(
1996
).
Detecting activations in PET and fMRI: Levels of inference and power.
Neuroimage
,
4
,
223
235
.
Gallese
,
V.
, &
Lakoff
,
G.
(
2005
).
The brain's concepts: The role of the sensory-motor system in conceptual knowledge.
Cognitive Neuropsychology
,
22
,
455
479
.
Gerardin
,
E.
,
Sirigu
,
A.
,
Lehericy
,
S.
,
Poline
,
J. B.
,
Gaymard
,
B.
,
Marsault
,
C.
,
et al
(
2000
).
Partially overlapping neural networks for real and imagined hand movements.
Cerebral Cortex
,
10
,
1093
1104
.
Geyer
,
S.
,
Ledberg
,
A.
,
Schleicher
,
A.
,
Kinomura
,
S.
,
Schormann
,
T.
,
Burgel
,
U.
,
et al
(
1996
).
Two different areas within the primary motor cortex of man.
Nature
,
382
,
805
807
.
Grush
,
R.
(
2004
).
The emulation theory of representation: Motor control, imagery, and perception.
Behavioral and Brain Sciences
,
27
,
377
396
; discussion 396-442.
Hauk
,
O.
,
Johnsrude
,
I.
, &
Pulvermuller
,
F.
(
2004
).
Somatotopic representation of action words in human motor and premotor cortex.
Neuron
,
41
,
301
307
.
Helmich
,
R. C.
,
de Lange
,
F. P.
,
Bloem
,
B. R.
, &
Toni
,
I.
(
2007
).
Cerebral compensation during motor imagery in Parkinson's disease.
Neuropsychologia
,
45
,
2201
2215
.
Heremans
,
E.
,
Helsen
,
W. F.
, &
Feys
,
P.
(
2008
).
The eyes as a mirror of our thoughts: Quantification of motor imagery of goal-directed movements through eye movement registration.
Behavioural Brain Research
,
187
,
351
360
.
Jeannerod
,
M.
(
2006
).
Motor cognition.
Oxford
:
Oxford University Press
.
Johnson
,
S. H.
,
Rotte
,
M.
,
Grafton
,
S. T.
,
Hinrichs
,
H.
,
Gazzaniga
,
M. S.
, &
Heinze
,
H. J.
(
2002
).
Selective activation of a parietofrontal circuit during implicitly imagined prehension.
Neuroimage
,
17
,
1693
1704
.
Just
,
M. A.
,
Newman
,
S. D.
,
Keller
,
T. A.
,
McEleney
,
A.
, &
Carpenter
,
P. A.
(
2004
).
Imagery in sentence comprehension: An fMRI study.
Neuroimage
,
21
,
112
124
.
Kemmerer
,
D.
, &
Gonzalez-Castillo
,
J.
(
2008
).
The two-level theory of verb meaning: An approach to integrating the semantics of action with the mirror neuron system.
Brain and Language
.
Kleinbaum
,
D. G.
,
Kupper
,
L. L.
,
Muller
,
K. E.
, &
Nizam
,
A.
(
1998
).
Applied regression analysis and other multivariable methods.
Pacific Grove, CA
:
Brooks/Cole
.
Kosslyn
,
S. M.
, &
Ochsner
,
K. N.
(
1994
).
In search of occipital activation during visual mental imagery.
Trends in Neurosciences
,
17
,
290
292
.
Kriegeskorte
,
N.
,
Simmons
,
W. K.
,
Bellgowan
,
P. S.
, &
Baker
,
C. I.
(
2009
).
Circular analysis in systems neuroscience: The dangers of double dipping.
Nature Neuroscience
,
12
,
535
540
.
Mahon
,
B. Z.
, &
Caramazza
,
A.
(
2008
).
A critical look at the embodied cognition hypothesis and a new proposal for grounding conceptual content.
Journal Physiology (Paris)
,
102
,
59
70
.
Marr
,
D.
(
1982
).
Vision.
San Francisco
:
W.H. Freeman
.
McGonigle
,
D. J.
,
Howseman
,
A. M.
,
Athwal
,
B. S.
,
Friston
,
K. J.
,
Frackowiak
,
R. S.
, &
Holmes
,
A. P.
(
2000
).
Variability in fMRI: An examination of intersession differences.
Neuroimage
,
11
,
708
734
.
Munzert
,
J.
,
Lorey
,
B.
, &
Zentgraf
,
K.
(
2009
).
Cognitive motor processes: The role of motor imagery in the study of motor representations.
Brain Research Reviews
,
60
,
306
326
.
Neely
,
J. H.
(
1991
).
Semantic priming effects in visual word recognition: A selective review of current findings and theories.
In D. Besner & G. W. Humphreys (Eds.),
Basic processes in reading: Visual word recognition
(pp.
264
336
).
Hillsdale, NJ
:
Lawrence Erlbaum
.
Nichols
,
T.
,
Brett
,
M.
,
Andersson
,
J.
,
Wager
,
T.
, &
Poline
,
J. B.
(
2005
).
Valid conjunction inference with the minimum statistic.
Neuroimage
,
25
,
653
660
.
Oldfield
,
R. C.
(
1971
).
The assessment and analysis of handedness: The Edinburgh inventory.
Neuropsychologia
,
9
,
97
113
.
Papeo
,
L.
,
Vallesi
,
A.
,
Isaja
,
A.
, &
Rumiati
,
R. I.
(
2009
).
Effects of TMS on different stages of motor and non-motor verb processing in the primary motor cortex.
PLoS ONE
,
4
,
e4508
.
Parsons
,
L. M.
(
1994
).
Temporal and kinematic properties of motor behavior reflected in mentally simulated action.
Journal of Experimental Psychology: Human Perception and Performance
,
20
,
709
730
.
Passingham
,
R. E.
, &
Rowe
,
J.
(
2002
).
Dorsal prefrontal cortex: Maintenance in memory of attentional selection?
In D. T. Stuss & R. T. Knight (Eds.),
Principles of frontal lobe function
(pp.
221
232
).
Oxford
:
Oxford University Press
.
Peelen
,
M. V.
,
Wiggett
,
A. J.
, &
Downing
,
P. E.
(
2006
).
Patterns of fMRI activity dissociate overlapping functional brain areas that respond to biological motion.
Neuron
,
49
,
815
822
.
Pinker
,
S.
(
1984
).
Visual cognition: An introduction.
Cognition
,
18
,
1
63
.
Poline
,
J. B.
,
Worsley
,
K. J.
,
Evans
,
A. C.
, &
Friston
,
K. J.
(
1997
).
Combining spatial extent and peak intensity to test for activations in functional imaging.
Neuroimage
,
5
,
83
96
.
Postle
,
N.
,
McMahon
,
K. L.
,
Ashton
,
R.
,
Meredith
,
M.
, &
de Zubicaray
,
G. I.
(
2008
).
Action word meaning representations in cytoarchitectonically defined primary and premotor cortices.
Neuroimage
,
43
,
634
644
.
Pulvermuller
,
F.
(
2005
).
Brain mechanisms linking language and action.
Nature Reviews Neuroscience
,
6
,
576
582
.
Raposo
,
A.
,
Moss
,
H. E.
,
Stamatakis
,
E. A.
, &
Tyler
,
L. K.
(
2009
).
Modulation of motor and premotor cortices by actions, action words and action sentences.
Neuropsychologia
,
47
,
388
396
.
Relander
,
K.
,
Rama
,
P.
, &
Kujala
,
T.
(
2009
).
Word semantics is processed even without attentional effort.
Journal of Cognitive Neuroscience
,
21
,
1511
1522
.
Rowe
,
J. B.
,
Toni
,
I.
,
Josephs
,
O.
,
Frackowiak
,
R. S.
, &
Passingham
,
R. E.
(
2000
).
The prefrontal cortex: Response selection or maintenance within working memory?
Science
,
288
,
1656
1660
.
Ruff
,
I.
,
Blumstein
,
S. E.
,
Myers
,
E. B.
, &
Hutchison
,
E.
(
2008
).
Recruitment of anterior and posterior structures in lexical-semantic processing: An fMRI study comparing implicit and explicit tasks.
Brain and Language
,
105
,
41
49
.
Sato
,
M.
,
Mengarelli
,
M.
,
Riggio
,
L.
,
Gallese
,
V.
, &
Buccino
,
G.
(
2008
).
Task related modulation of the motor system during language processing.
Brain and Language
,
105
,
83
90
.
Szameitat
,
A. J.
,
Shen
,
S.
, &
Sterr
,
A.
(
2007a
).
Effector-dependent activity in the left dorsal premotor cortex in motor imagery.
European Journal of Neuroscience
,
26
,
3303
3308
.
Szameitat
,
A. J.
,
Shen
,
S.
, &
Sterr
,
A.
(
2007b
).
Motor imagery of complex everyday movements. An fMRI study.
Neuroimage
,
34
,
702
713
.
Tettamanti
,
M.
,
Buccino
,
G.
,
Saccuman
,
M. C.
,
Gallese
,
V.
,
Danna
,
M.
,
Scifo
,
P.
,
et al
(
2005
).
Listening to action-related sentences activates fronto-parietal motor circuits.
Journal of Cognitive Neuroscience
,
17
,
273
281
.
Tomasino
,
B.
,
Fink
,
G. R.
,
Sparing
,
R.
,
Dafotakis
,
M.
, &
Weiss
,
P. H.
(
2008
).
Action verbs and the primary motor cortex: A comparative TMS study of silent reading, frequency judgments, and motor imagery.
Neuropsychologia
,
46
,
1915
1926
.
Tomasino
,
B.
,
Werner
,
C. J.
,
Weiss
,
P. H.
, &
Fink
,
G. R.
(
2007
).
Stimulus properties matter more than perspective: An fMRI study of mental imagery and silent reading of action phrases.
Neuroimage
,
36(Suppl. 2)
,
T128
T141
.
Van Berkum
,
J. J. A.
(
in press
).
The electrophysiology of discourse and conversation.
In M. Spivey, M. Joanisse, & K. McRae (Eds.),
The Cambridge handbook of psycholinguistics.
Cambridge
:
Cambridge University Press
.
Van Berkum
,
J. J. A.
,
Brown
,
C. M.
,
Zwitserlood
,
P.
,
Kooijman
,
V.
, &
Hagoort
,
P.
(
2005
).
Anticipating upcoming words in discourse: Evidence from ERPs and reading times.
Journal of Experimental Psychology: Learning, Memory, and Cognition
,
31
,
443
467
.
Verhagen
,
L.
,
Grol
,
M.
,
Dijkerman
,
H.
, &
Toni
,
I.
(
2006
).
Studying visually-guided reach-to-grasp movements in an MR-environment.
Neuroimage
,
31
,
S45
.
Willems
,
R. M.
, &
Hagoort
,
P.
(
2007
).
Neural evidence for the interplay between language, gesture, and action: A review.
Brain and Language
,
101
,
278
289
.
Willems
,
R. M.
,
Hagoort
,
P.
, &
Casasanto
,
D.
(
2010
).
Body-specific representations of action verbs: Neural evidence from right- and left-handers.
Psychological Science
,
21
,
67
74
.
Wolpert
,
D. M.
, &
Ghahramani
,
Z.
(
2000
).
Computational principles of movement neuroscience.
Nature Neuroscience
,
3(Suppl.)
,
1212
1217
.
Woolsey
,
C. N.
(
1963
).
Comparative studies on localization in precentral and supplementary motor areas.
International Journal of Neurology
,
4
,
13
20
.
Zwaan
,
R. A.
(
2004
).
The immersed experiencer: Toward an embodied theory of language comprehension.
In B. H. Ross (Ed.),
The psychology of learning and motivation
(
Vol. 44
).
New York
:
Academic Press
.
Zwaan
,
R. A.
, &
Kaschak
,
M. P.
(
2008
).
Language in the brain, body, and world.
In M. Robbins & M. Aydede (Eds.),
Cambridge handbook of situated cognition.
Cambridge
:
Cambridge University Press
.