Perception and action are classically thought to be supported by functionally and neuroanatomically distinct mechanisms. However, recent behavioral studies using an action priming paradigm challenged this view and showed that action representations can facilitate object recognition. This study determined whether action representations influence object recognition during early visual processing stages, that is, within the first 150 msec. To this end, the time course of brain activation underlying such action priming effects was examined by recording ERPs. Subjects were sequentially presented with two manipulable objects (e.g., tools), which had to be named. In the congruent condition, both objects afforded similar actions, whereas in the incongruent condition they afforded dissimilar actions. In order to test the influence of the prime modality on action priming, the first object (prime) was presented either as a picture or as a word. We found an ERP effect of action priming over the central scalp as early as 100 msec after target onset for pictorial, but not for verbal, primes. A later action priming effect on the N400 ERP component, known to index semantic integration processes, was obtained for both picture and word primes. The early effect was generated in a fronto-parietal motor network, whereas the late effect reflected activity in anterior temporal areas. The present results indicate that action priming influences object recognition through both fast and slow pathways: Action priming affects rapid visuomotor processes only when elicited by pictorial prime stimuli. However, it also modulates comparatively slow conceptual integration processes independently of the prime modality.
It is well accepted that recognizing a visual object heavily depends on analysis of the visual input and on activation of stored visual object representations (Grill-Spector & Malach, 2004; Humphreys, Riddoch, & Quinlan, 1988; Biederman, 1987). Recent studies, however, suggest that recognition of manipulable objects not only involves visual representations but also activates action representations within the motor system (Martin & Chao, 2001; Tucker & Ellis, 1998; Warrington & McCarthy, 1987). Behavioral studies using an action priming paradigm, which are described in more detail below, showed that action representations can facilitate object recognition and, thus, play a functional role in visual object recognition (Helbig, Steinwender, Graf, & Kiefer, 2010; Helbig, Graf, & Kiefer, 2006). However, the mechanisms underlying this action priming effect linking perception and action are largely unknown. The present study therefore aimed at specifying this influence of action representations on object perception by exploiting the high temporal resolution of ERP recordings. In particular, we were interested in determining whether action representations influence object recognition already during an early processing window, that is, within the first 150 msec after stimulus onset, a window that temporally overlaps with visual processing stages. Alternatively, action priming could facilitate object recognition in a later time window between 300 and 500 msec by modulating semantic integration processes subserving object recognition (Goodale & Haffenden, 2003).
Action priming effects on object recognition challenge classical cognitive and neuropsychological models of object recognition and action control that propose two functionally and neuroanatomically distinct pathways subserving object recognition and object-directed action, respectively (Milner & Goodale, 1995; Ungerleider & Mishkin, 1982): The ventral visual stream, which runs from primary visual cortex to the inferior portions of occipital and temporal cortex, subserves visual object recognition. In contrast, the dorsal visual stream, which also originates in primary visual cortex, but continues to superior parietal cortex and premotor cortex, is devoted to the computation of visuospatial information and to the preparation of object-directed action. In this classical model, object recognition and object-directed action have been conceived as fundamentally distinct processes following different computational principles (Desanghere & Marotta, 2008; Hu & Goodale, 2000; Bridgeman, Peery, & Anand, 1997; Milner & Goodale, 1995): Object recognition is relatively slow and based on the construction of a conscious visual percept of the object. In contrast, preparation of object-directed action is fast and unconscious.
However, in disagreement with this classical view, evidence is accumulating that the functional and neural processing pathways underlying object recognition and object-directed action influence each other (Amazeen & DaSilva, 2005; Goodale & Haffenden, 2003; Creem & Proffitt, 2001; Hommel, Müsseler, Aschersleben, & Prinz, 2001), and rely on similar computational principles (Graf, 2006; Graf, Kaping, & Bülthoff, 2005). Moreover, in line with an embodiment framework of conceptual representations, which assumes close links between conceptual object representations and the sensory–motor brain systems (Pulvermüller, 2005; Kiefer & Spitzer, 2001; Barsalou, 1999), several lines of evidence suggest that action representations within the motor system are crucial for recognizing visual objects. For example, some brain-damaged patients were impaired in retrieving knowledge about small manipulable artifact objects, but exhibited preserved knowledge about large artifact objects such as buildings and about natural objects such as animals and plants (Warrington & McCarthy, 1987). This selective impairment presumably reflects damage to the brain system representing action information, which impairs recognition of small manipulable artifacts more severely than that of other objects. Similarly, patients with Parkinson's disease who suffer from motor impairments exhibit disrupted access to action representations of visually presented manipulable objects such as door handles (Poliakoff, Galpin, Dick, Moore, & Tipper, 2007).
Neurophysiological studies in neurologically intact participants provide indirect evidence for the view that action-related information processed within the dorsal visual stream plays a functional role for the recognition of manipulable artifact objects (Hoenig, Sim, Bochev, Herrnberger, & Kiefer, 2008; Kiefer, Sim, Liebich, Hauk, & Tanaka, 2007; Noppeney, Price, Penny, & Friston, 2006; Kiefer, 2001, 2005; Chao & Martin, 2000; Martin, Wiggs, Ungerleider, & Haxby, 1996): Passively viewing, categorizing, and silently naming manipulable artifact objects activated areas involved in action representation (left premotor, left posterior parietal cortex). Action representations presumably constitute the conceptual core of manipulable objects because activity in frontal and parietal motor areas in response to artifact objects was also observed when they were not task-relevant (Hoenig et al., 2008). However, neurophysiological studies only provide correlational information concerning the link between action representations and visual object recognition.
The functional role of action representations during visual recognition of manipulable artifact objects has been directly tested in two behavioral studies. In the first study, Helbig et al. (2006) developed an action priming paradigm, in which participants had to name briefly presented and masked pairs of manipulable artifact objects. Object pairs either afforded similar (e.g., hammer–axe) or dissimilar actions (e.g., hammer–saw). Prime and target pictures differed only with regard to action congruency, but were carefully matched for possibly confounding factors. Participants named the second object more accurately when it was preceded by a prime object associated with a similar action, demonstrating an action priming effect. The involvement of action representations in generating this priming effect could be substantiated in a second study (Helbig et al., 2010), in which participants viewed short action movies as primes. When the prime was an object name (Helbig et al., 2006, Experiment 2), the action priming effect was reduced (and not statistically reliable), suggesting that action representations elicited by words do not facilitate recognition of target objects as efficiently as those elicited by pictorial or movie primes. Presumably, pictures and action movies activate more detailed action representations, and thus, lead to stronger action priming effects than words (for a more detailed explanation, see the Discussion section). It is very unlikely, however, that the reduced priming effect for words was due to a lack of semantic processing during word reading. Although word reading is a shallow semantic task, it induces semantic processing of word meaning as shown by priming effects (Kiefer, Weisbrod, Kern, Maier, & Spitzer, 1998; Lupker, 1985) and suffices to activate action representations (Hauk, Johnsrude, & Pulvermüller, 2004; Hauk & Pulvermüller, 2004).
Altogether, these previously observed behavioral action priming effects demonstrate that action representations play a functional role in object recognition. They suggest that action representations (predominantly dorsal processes) influence object recognition (predominantly ventral processes). Therefore, a recurrent exchange of information between the visual streams has to be assumed before the process of visual object recognition is completed.
The present study was set up in order to elucidate the temporal and spatial orchestration of brain processes subserving the influence of action representations on object recognition. In particular, we sought to determine whether action priming influences motor representations rapidly activated within the first 150 msec of visual processing during the ongoing object recognition process. Visual object recognition is estimated to be completed between 150 and 300 msec (Hauk et al., 2007; Johnson & Olshausen, 2003; Liu, Harris, & Kanwisher, 2002; Thorpe, Fize, & Marlot, 1996). Alternatively, the action priming effect could result only from later semantic integration processes of action-related features (van Elk, van Schie, Zwaan, & Bekkering, 2010; Goodale & Haffenden, 2003) in a time interval between 300 and 500 msec.
To distinguish between these alternatives, we recorded ERPs while participants had to recognize object pictures within an action priming paradigm (Helbig et al., 2006). Target objects (always presented as pictures) were preceded by prime objects that afforded either congruent or incongruent actions. Prime objects were presented either as pictures or as words in order to assess the modulatory influences of action representation as a function of input modality (pictures vs. words). In contrast to the behavioral study (Helbig et al., 2006), in which naming accuracy served as the only dependent measure, stimuli were presented unmasked and clearly visible to obtain a sufficient number of correct trials for ERP extraction. In addition, participants had to delay the naming response until after target presentation in order to avoid contamination of the EEG recordings by articulation artifacts.
If the action priming effect depends on a modulation of rapidly activated action representations within the dorsal stream, ERP effects of action priming should be obtained within the first 150 msec of target processing. Previous studies observed ERP effects over the frontal and central scalp, which were associated with access to action representations (Pulvermüller, Shtyrov, & Ilmoniemi, 2005; Hauk & Pulvermüller, 2004). ERP effects related to the activation of action representations were already observed at about 100 msec after stimulus onset and temporally overlapped with the visual P1 ERP component (Hoenig et al., 2008; Kiefer, Sim, et al., 2007). Rapid activation of action representations provides the functional basis for influences of action priming on object recognition during early visual processing stages. This proposed influence of action representations on the ongoing visual object recognition process is in line with the previous suggestion that perceptual and conceptual processes interact in a cascaded processing sequence during visual object recognition within the first 150 msec of stimulus processing (Hauk et al., 2007; Levelt, Praamstra, Meyer, Helenius, & Salmelin, 1998). Conversely, if the action priming effect originates from later semantic integration processes, the N400 ERP component should be modulated by action congruency. The N400 is a negative ERP deflection between 300 and 500 msec over the centro-parietal scalp, which specifically reflects semantic integration processes (Kutas & Hillyard, 1980). The N400 is sensitive to semantic deviations with larger N400 amplitudes for semantically incongruent words and pictures compared to congruent stimuli (e.g., Kiefer, 2001; Ganis, Kutas, & Sereno, 1996; Bentin, McCarthy, & Wood, 1985). 
Studies using intracranial electrodes (Nobre & McCarthy, 1995) and ERP source analysis (Kiefer, Schuch, Schenck, & Fiedler, 2007) have suggested generators for the N400 in the anterior temporal lobe, a region that anatomically belongs to the ventral pathway. It should be noted that due to volume conduction in the brain, ERP components (e.g., N400, which is maximal over the central and parietal scalp) do not necessarily exhibit their largest amplitudes in scalp regions that are neuroanatomically close to the generating brain regions (Nunez, 1981). The significance of anterior temporal areas for semantic integration processes elicited by pictorial and verbal stimuli has also been shown in neuroimaging studies (e.g., Moss, Rodd, Stamatakis, Bright, & Tyler, 2005; Vandenberghe, Price, Wise, Josephs, & Frackowiak, 1996). Anterior temporal regions have been associated with the integration of modality-specific conceptual information from the sensory and motor systems (Kiefer, Sim, Herrnberger, Grothe, & Hoenig, 2008; Kiefer, Sim, et al., 2007; Patterson, Nestor, & Rogers, 2007). It is possible that action representations that are congruent between prime and target help to integrate distributed conceptual features into a coherent concept, thereby facilitating object recognition (van Elk et al., 2010).
In summary, given the evidence of action priming effects on object recognition (Helbig et al., 2006, 2010), we predict a modulation of action representations by action priming that occurs fast enough to influence visual object recognition, that is, within the first 150 msec. Based on the results of our previous behavioral study (Helbig et al., 2006), in which the action priming effect was stronger for picture primes than for word primes, we predict an early frontal and central ERP action priming effect, reflecting rapid preactivation of the dorsal visuomotor system, only for the pictorial prime modality. In contrast, a later action priming effect on the N400 ERP component, which indexes conceptual integration processes, might be obtained for both picture and word primes.
Twenty healthy right-handed (Oldfield, 1971) native German-speaking volunteers (10 men/10 women) with normal or corrected-to-normal vision participated in the ERP experiment. The average age was 24 years (range 20–31 years). Subjects gave written informed consent after the nature and consequences of the experiment had been explained to them. The experiment was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and was approved by the local ethics committee. Subjects were paid for their participation.
Stimuli consisted of man-made manipulable familiar objects, which were presented visually on a computer screen. All stimuli were the same as in our earlier behavioral priming study (Helbig et al., 2006). Pairs of prime and target objects were selected such that prime and target objects were associated with a similar typical action (congruent condition) in half of the pairs, while in the other half of the pairs the typical actions were dissimilar (incongruent condition). As every prime object was paired with the same number of congruent and incongruent targets (up to three corresponding target pairs), possible repetition effects would influence congruent and incongruent conditions comparably. Overall, 20 prime objects were combined with the 70 target objects (35 congruent and 35 incongruent pairs). Examples are shown in Figure 1. Targets were always presented as pictures. The primes were presented as pictures in one block and as words in a different block. The same prime–target object pairings were presented in both the pictorial and verbal priming blocks. For the verbal prime condition, we used the object names most frequently produced in a pilot study (n = 6). Relative frequency of the selected names was 87.0% or greater. The pictorial stimuli (primes and targets) were grayscale photographs of man-made manipulable familiar objects (HEMERA Photo Objects). The objects were inscribed into a square of 280 × 280 pixels in order to equate their maximal extension, corresponding to a picture size on the screen of about 10.3 × 10.3 cm. At a viewing distance of about 90 cm, the pictures subtended a visual angle of about 6.5°.
Stimuli of the congruent and incongruent prime–target conditions were carefully matched for various important conceptual and linguistic variables obtained in several pilot studies (Helbig et al., 2006). They were selected such that two-tailed two-sample t tests did not reveal significant differences of baseline naming accuracy (incongruent mean accuracy = 81.0%, congruent mean accuracy = 82.1%, p > .82), visual similarity (ratings on a scale from 1 to 6, with 6 indicating high similarity of shape: incongruent 2.93, congruent 3.23, p > .29), and semantic similarity (ratings on a scale from 1 to 6, with 6 indicating high similarity of the conceptual meaning, i.e., same category or function: incongruent 2.28, congruent 2.71, p > .13) between prime–target stimulus pairs from incongruent and congruent conditions. Baseline naming accuracy (n = 13) and ratings (n = 16) were obtained in two separate participant samples different from the main experiment. Furthermore, target stimuli were matched for word length (incongruent 8.06 letters, congruent 8.33, p > .70) and word frequency (according to the CELEX database; Baayen, Piepenbrock, & Gulikers, 1995) (incongruent 33.19, congruent 23.00, p > .52). Conditions differed reliably only with regard to the similarity of the actions associated with the objects, with higher similarity for congruent as compared to incongruent prime–target stimulus pairs (ratings on a scale from 1 to 6, with 6 indicating high action similarity: incongruent 1.75, congruent 4.69, p < .0001). The results of the pilot studies were also used to determine the correct naming responses to the targets. Stimuli were presented using ERTS software on an IBM-PC compatible computer; the experimenter recorded all naming responses on a standardized answer checklist.
Subjects were seated in front of a computer screen in a dimly lit, electrically shielded room. All stimuli were displayed in the center of the computer screen synchronously with the screen refresh rate. The entire experiment consisted of two blocks: In one block the prime was presented as a picture and in the other block the prime was presented as an object name. Presentation order of the blocks was counterbalanced across subjects. Except for the modality of prime presentation, procedure and timing of events in each trial were identical for the two blocks. Each trial started with three hash marks, which prompted participants to initiate the trial with a button press. Thereafter, a fixation cross was shown for 500 msec, which was replaced by the prime object (picture or word depending on the block), shown for 300 msec. After a blank screen of 70 msec duration, the target object was presented for 300 msec, followed by another blank screen for 700 msec and a question mark. Participants had to name the objects after the question mark was presented, in order to avoid articulation artifacts in the EEG during target presentation. Participants were instructed to name both objects in the order of presentation with their basic level names. Instructions stressed accuracy and did not impose any time pressure. Participants' naming responses activated a voice key that triggered the disappearance of the question mark. After the response was manually recorded by the experimenter, the hash marks reappeared as a break signal, and the participant was able to initiate the next trial. The order of prime–target stimulus pairs was randomized differently for each subject. Each experimental block started with a short practice phase (10 trials) that included stimuli not used in the main experiment. The entire experiment lasted about 1 hour 30 min, including electrode placement, instructions, and training trials.
ERP Recording, Analysis, and Statistical Analysis
Scalp potentials were collected using an equidistant montage of 64 sintered Ag/AgCl electrodes mounted in an elastic cap (Easy Cap, Herrsching, Germany). An electrode between Fpz and Fz was connected to the ground, and an electrode between Cz and FCz was used as recording reference. Eye movements were monitored with supra- and infraorbital electrodes and with electrodes on the external canthi. Electrode impedance was kept below 5 kΩ. Electrical signals were amplified with Synamps amplifiers (low-pass filter: 70 Hz, 24 dB/octave attenuation; 50 Hz notch filter) and continuously recorded (digitization rate: 250 Hz), digitally band-pass filtered (high cutoff: 16 Hz, 24 dB/octave attenuation; low cutoff: 0.1 Hz, 12 dB/octave attenuation), and segmented (150 msec before prime onset to 600 msec after target onset). EEG data were corrected to a 150-msec baseline prior to the onset of the prime. EEG was corrected for ocular artifacts using independent components analysis (Makeig, Bell, Jung, Ghahremani, & Sejnowski, 1997). Separately for each experimental condition, artifact-free EEG segments from trials with correct responses were averaged time-locked to the onset of the stimulus. Thereafter, the average-reference transformation was applied to the ERP data (Kiefer et al., 1998; Bertrand, Perrin, & Pernier, 1985). EEG analysis was performed with BrainVision Analyzer (Brain Products, Gilching, Germany). Statistical analysis of the ERP data focused on two time windows of interest (Kiefer, Sim, et al., 2007; Kiefer et al., 1998): a P1 window (85–115 msec after stimulus onset) and an N400 time window (380–480 msec after stimulus onset). As in previous studies (Kiefer et al., 2008; Kiefer, Sim, et al., 2007), electrodes within two scalp regions of interest were selected for analysis: central (electrode sites: C3/C4, C1/C2, CP1/CP2, Cz) and parietal (electrode sites: PO1/PO2, P1/P2, CP3/CP4, Pz).
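The preprocessing chain described above (150-msec prestimulus baseline correction, per-condition averaging over artifact-free correct trials, average-reference transformation) was carried out in BrainVision Analyzer. Purely as an illustration of the underlying arithmetic, a minimal numpy sketch might look as follows; the array shapes, simulated data, and function names are our own and not part of the original pipeline.

```python
import numpy as np

SFREQ = 250                              # digitization rate in Hz, as in the text
N_CHANNELS = 64                          # equidistant electrode montage
BASELINE_SAMPLES = int(0.150 * SFREQ)    # 150-msec prestimulus baseline

def baseline_correct(epochs):
    """Subtract each trial's mean over the prestimulus baseline, per channel.
    epochs: (n_trials, n_channels, n_samples); the first BASELINE_SAMPLES
    samples are assumed to cover the 150 msec before prime onset."""
    baseline = epochs[:, :, :BASELINE_SAMPLES].mean(axis=2, keepdims=True)
    return epochs - baseline

def average_reference(erp):
    """Re-reference an averaged ERP (n_channels, n_samples) to the mean of
    all channels (average-reference transformation)."""
    return erp - erp.mean(axis=0, keepdims=True)

# Simulated stand-in for one condition: 35 artifact-free correct trials
# with an arbitrary DC offset that baseline correction should remove.
rng = np.random.default_rng(0)
epochs = rng.normal(size=(35, N_CHANNELS, 400)) + 5.0
erp = baseline_correct(epochs).mean(axis=0)    # per-condition average
erp_ar = average_reference(erp)                # average-referenced ERP
```

After these steps, the averaged waveform has a zero-mean baseline window on every channel, and at each time point the potentials sum to zero across channels, which is what the average-reference transformation guarantees.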
Repeated measures ANOVAs were performed on mean voltages with prime modality (picture vs. word), congruency (congruent vs. incongruent), and electrode site (seven positions within the region of interest) as within-subjects factors (p level of .05). In order to account for possible violations of the sphericity assumption of the repeated measures ANOVA model, degrees of freedom were adjusted according to the method of Huynh and Feldt (1970), and the Huynh–Feldt ɛ as well as the corrected significance levels are reported when appropriate.
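For readers unfamiliar with the sphericity correction, the following sketch shows one standard way to estimate the Greenhouse–Geisser epsilon from the covariance matrix of the repeated-measures levels and to derive the Huynh–Feldt epsilon from it. This is a generic textbook formulation for a one-way, single-group design, not necessarily the exact estimator used in the original analysis; data and variable names are invented for illustration.

```python
import numpy as np

def sphericity_epsilons(data):
    """Greenhouse-Geisser (GG) and Huynh-Feldt (HF) epsilon estimates for a
    one-way repeated measures design.
    data: (n_subjects, k_levels) array of per-subject condition means."""
    n, k = data.shape
    S = np.cov(data, rowvar=False)                   # k x k level covariance
    # Double-center the covariance matrix (project out the constant vector).
    Sc = S - S.mean(axis=0, keepdims=True) - S.mean(axis=1, keepdims=True) + S.mean()
    eps_gg = np.trace(Sc) ** 2 / ((k - 1) * np.trace(Sc @ Sc))
    # Huynh-Feldt correction of the (conservative) GG estimate, capped at 1.
    eps_hf = (n * (k - 1) * eps_gg - 2) / ((k - 1) * (n - 1 - (k - 1) * eps_gg))
    return eps_gg, min(eps_hf, 1.0)

# 20 subjects x 7 electrode sites, mirroring the design reported above.
rng = np.random.default_rng(1)
data = rng.normal(size=(20, 7))
eps_gg, eps_hf = sphericity_epsilons(data)
```

The ANOVA degrees of freedom are then multiplied by the chosen epsilon before the p value is looked up; eps_gg is bounded between 1/(k − 1) and 1, and eps_hf is always at least as large as eps_gg (i.e., less conservative).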
In order to determine the neural sources for significant action priming effects on scalp ERPs, source analyses were performed using distributed source modeling (minimum norm source estimates; Hauk, 2004) implemented in BESA 5.1 (MEGIS, Gräfelfing, Germany). This method yields the unique solution that explains the data and does not contain components that are “silent” (i.e., do not produce any measurable surface signal by themselves; Hauk, 2004; Hämäläinen & Ilmoniemi, 1994). Sources were computed for the grand-averaged ERP difference waves between incongruent and congruent conditions in order to focus on brain activity related to action priming and to eliminate unspecific brain activity related to visual object processing. Cortical currents were determined within the P1 and N400 time windows at the time point of maximal global field power in the ERP difference waves in order to ensure optimal signal-to-noise ratio. Minimum norm source estimates (minimum L2 norm) were calculated using a standardized realistic head model (finite element model). For estimating the noise regularization parameters, the prestimulus baseline was used. Minimum norm was computed with depth weighting, spatio-temporal weighting, and noise weighting for each individual channel. Talairach coordinates for the activation peaks were determined on the 2-D surface covering the cortex on which the source solution was computed. We report the nearest Brodmann's areas (BA) to the peak activations located by the Talairach Daemon (Lancaster et al., 1997, 2000).
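The minimum norm principle itself can be illustrated with a bare-bones, Tikhonov-regularized L2 estimate on a simulated lead field. This sketch deliberately omits the depth, spatio-temporal, and per-channel noise weighting used in the BESA analysis described above; the lead field, the SNR-based regularization heuristic, and all dimensions are invented for illustration.

```python
import numpy as np

def min_norm_estimate(L, d, snr=3.0):
    """Tikhonov-regularized minimum L2-norm estimate of source currents.
    L: (n_sensors, n_sources) lead field; d: (n_sensors,) scalp data.
    Trades fidelity to the data against overall source power; as the
    regularization vanishes, it approaches the minimal-power solution
    that exactly explains the data."""
    n_sensors = L.shape[0]
    gram = L @ L.T
    # Scale regularization to mean sensor power (a common heuristic; the
    # snr parameter is our assumption, not BESA's parameterization).
    lam = np.trace(gram) / (n_sensors * snr ** 2)
    return L.T @ np.linalg.solve(gram + lam * np.eye(n_sensors), d)

rng = np.random.default_rng(2)
L = rng.normal(size=(64, 500))      # 64 sensors, 500 cortical source locations
j_true = np.zeros(500)
j_true[100] = 1.0                   # a single focal source
d = L @ j_true                      # noiseless scalp topography
j_hat = min_norm_estimate(L, d, snr=1e6)   # (near-)unregularized fit
```

With negligible regularization the estimate reproduces the measured topography, and its overall power never exceeds that of the true source configuration, which is the defining property of the minimum norm solution.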
Each subject's naming response was classified by the experimenter as being either correct or erroneous on the basis of the object names obtained in the norming studies (Helbig et al., 2006). A repeated measures ANOVA on error rates with the within-participants factors prime modality (picture vs. word) and congruency (congruent vs. incongruent) yielded a main effect of prime modality [F(1, 19) = 37.33, p < .0001]: Targets were named less accurately following pictorial primes (7.79% errors) than following word primes (3.93%). This finding may reflect the fact that naming a word prime is cognitively less demanding than naming a picture prime, with this difference carrying over to target naming. Most importantly, a main effect of congruency [F(1, 19) = 6.60, p < .05] showed that incongruent targets (6.86%) were associated with a higher error rate than congruent targets (4.86%). The interaction between prime modality and congruency was far from significant (F < 1, p > .78), indicating that the behavioral action priming effect did not differ between modalities. Naming latencies were not analyzed because the naming response was delayed (700 msec lag between target offset and the beginning of the response interval) in order to avoid articulation artifacts in the ERPs to the target stimulus. Moreover, participants had to name both prime and target objects, so that the theoretically interesting naming latencies to the target object would have been additionally distorted by the preceding response to the prime.
Visual inspection of the ERP waveforms revealed ERP effects of action congruency in the P1 and N400 time windows (see Figure 2). In the P1 time window, congruency modulated ERPs at central electrodes with more positive potentials for incongruent than for congruent prime–target pairs. This effect was only obtained for pictorial primes. In the N400 time window, congruency influenced ERPs broadly over the central and parietal scalp for both picture and word primes. In accordance with the classical N400 congruency effect (Kutas & Hillyard, 1980), incongruent object pairs elicited more negative potentials than congruent pairs. These action priming effects were statistically assessed using separate repeated measures ANOVAs for each time window and scalp region.1 As ERP action priming effects were of theoretical interest, we only report main effects or interactions involving the factor congruency.
85–115 msec after Stimulus Onset (P1)
In this early time window, prime modality and congruency significantly interacted at central electrodes [F(1, 19) = 5.19, p < .05]. This interaction was further assessed in separate ANOVAs for the pictorial and word prime conditions. These subsequent analyses revealed a significant main effect of congruency for pictorial primes [F(1, 19) = 4.41, p < .05], showing that ERPs in the congruent condition were more negative than in the incongruent condition (congruent: −2.0 μV; incongruent: −1.49 μV). For verbal primes, in contrast, the congruency effect did not reach significance (p = .12). At parietal electrodes, no effect involving the factor congruency was reliable. The topography of the early congruency effect in the P1 time window is shown in Figure 2B. Mean voltages in the P1 time window in the different experimental conditions are displayed in Figure 3.
380–480 msec after Stimulus Onset (N400)
In this later time window, action congruency affected both pictorial and verbal priming conditions over widespread scalp regions. We obtained significant main effects of congruency at both central [F(1, 19) = 14.80, p < .01] and parietal electrodes [F(1, 19) = 8.33, p < .01]: Incongruent trials elicited a more negative potential than congruent trials. Figure 2B illustrates the scalp distribution of the late N400 congruency effects (for mean voltages in the N400 time window as a function of the experimental conditions, see Figure 3).
Source analyses of scalp ERPs were performed for the conditions with significant effects of action priming in the 85–115 msec (action priming by pictorial primes) and 380–480 msec time windows (action priming irrespective of prime modality). In the 85–115 msec time window (P1), source activity elicited by pictorial action priming was obtained in brain areas typically associated with action processing (see Figure 4): Activation foci were observed in right inferior parietal (BA 40/39), right postcentral (BA 1/2), and right precentral (BA 4) areas. A further activation focus was found in right posterior middle temporal cortex (BA 19/22), an area frequently associated with the processing of action-related motion. Source activity related to action priming was also obtained in right inferior temporal areas including inferior temporal and fusiform gyrus (BA 20/BA 37), that is, areas typically involved in visual object recognition. As illustrated in Figure 4, in the 380–480 msec time window (N400), the strongest source activity related to action priming in both prime modalities was found in bilateral anterior inferior temporal areas (BA 20) extending to the temporal pole (BA 38). Albeit numerically much weaker, source activity was found again in a cluster encompassing right postcentral (BA 1/2) and precentral areas (BA 4).
The present study was set up to elucidate the modulatory influence of action representations on object recognition. Using ERP measurements within an action priming paradigm, we assessed whether action representations contribute to object recognition already in the early perceptual stages through rapid activation of action representations. Alternatively, action representations could affect only later postperceptual stages such as semantic integration, which are indexed by the N400 ERP component.
The results of our study show that action representations influence object recognition at both early and late processing stages. We found an ERP effect of action congruency (action priming effect) over the central scalp that started as early as about 100 msec after the onset of the target picture within the P1 time window. This early action priming effect was only observed for pictorial, but not for verbal, prime presentation. A later action priming effect at about 400 msec after target onset on the N400 ERP component was obtained for both picture and word primes. Likewise, at a behavioral level, we found an action priming effect (i.e., a higher error rate for incongruent than for congruent targets) in both prime modalities.
The topography of the early central action priming effect is in line with previous ERP studies on the electrophysiological correlates of processing action representations: Pictures or words with a high relevance of action representations typically elicited ERP differences at central electrodes, which were related to activity of motor areas (Hoenig et al., 2008; Kiefer, 2001, 2005; Hauk & Pulvermüller, 2004; Pulvermüller, Lutzenberger, & Preissl, 1999).
Although the localizing value of ERPs has to be considered with caution, source analysis of the present ERPs suggested generators in a cortical network previously implicated in the representation of actions associated with manipulable objects (Hoenig et al., 2008; Kiefer, Sim, et al., 2007; Noppeney et al., 2006; Chao & Martin, 2000) and action verbs (Hauk et al., 2004; Hauk & Pulvermüller, 2004): We found source activity related to action priming in inferior parietal cortex as well as in precentral and postcentral cortex overlapping with or in close vicinity to the frontal motor areas and somatosensory cortex. Parietal motor areas within the dorsal visual stream that are engaged in rapid unconscious action preparation processes are functionally linked to frontal motor areas and form a cortical motor network (Milner & Dijkerman, 2001; Milner & Goodale, 1995). Posterior and anterior motor areas have been frequently found to be coactivated during the processing of manipulable objects (Hoenig et al., 2008; Chao & Martin, 2000). Source activity in response to action priming was also observed in a posterior middle temporal area, which has been frequently found to be activated by manipulable objects (Gerlach, 2007; Martin & Chao, 2001). It has been suggested that posterior middle temporal cortex is involved in the representation of action-related motion (Martin & Chao, 2001). In contrast to earlier studies on action processing, we found activity related to action priming exclusively in right hemisphere areas. We assume that due to the dominance of the right hemisphere in visual processing (Kosslyn, 1994) and conscious stimulus recognition (Verleger et al., 2009), the motor system in the right hemisphere was more susceptible to action priming during object recognition than its left hemisphere counterpart.
In addition to these sources in motor and motion-related areas of the dorsal stream, a further activity focus was found in areas of the ventral stream: Source activity related to action priming was also observed in the right inferior temporal and fusiform gyri, which play an important role in shape processing and visual object recognition (Logothetis, Pauls, & Poggio, 1995; Desimone, Albright, Gross, & Bruce, 1984), particularly in the right hemisphere (Gerlach, Law, Gade, & Paulson, 1999). This simultaneous activation of motor and visual areas in response to action priming at about 100 msec suggests an interaction between motor processes in the dorsal stream and visual processes in the ventral stream during the course of visual object recognition.
The present study shows for the first time that the human visuomotor system is sensitive to action congruency between prime and target object already within the first 150 msec of object processing. We assume that the early ERP action priming effect indexes a match or a mismatch of action representations elicited by the prime and target object. This early ERP effect temporally coincides with the process of visual feature extraction, which is indexed by the P1 ERP component (Compton, Grossenbacher, Posner, & Tucker, 1991; Mangun & Hillyard, 1991). Hence, action priming affects the ongoing visual object recognition process within the ventral visual system, which is estimated to be completed between 150 and 300 msec (Hauk et al., 2007; Johnson & Olshausen, 2003; Liu et al., 2002; Thorpe et al., 1996). Our findings suggest that action representations are rapidly activated presumably contingent upon an initial coarse visual analysis of the object, and are available to the visual recognition system before recognition is completed. This early effect is in line with previous demonstrations of ultra-rapid semantic categorization processes that allow coarse categorical decisions within less than 120 msec (Kirchner & Thorpe, 2006). The presently demonstrated influence of action representations on the ongoing visual object recognition process also supports earlier suggestions of an interaction between perceptual and conceptual (here action-related) processes during visual object recognition within the first 150 msec of stimulus processing (Hauk et al., 2007; Levelt et al., 1998).
The early action priming ERP effect was observed only for pictorial primes, but not for verbal primes. This influence of prime modality could be due to several reasons. First, it is possible that access to action representations is much faster for object pictures than for object names because only for pictures is access directly triggered by the visual appearance of the stimulus without further semantic analysis (Noppeney et al., 2006). Second, an object name may activate several actions associated with the object, whereas the visual action affordance conveyed by the picture might make one action more salient than the others: For instance, one can grasp a telephone at the receiver or can type numbers using its keyboard. A picture in which the receiver is clearly visible makes “grasping” more salient compared with “typing.” The picture also specifies more precisely how an action toward the object is performed (e.g., grasping a pan with a handle pointing to the left). Thus, pictures may activate more detailed action representations than words because only pictures provide rich information on the visual appearance of an object, and thus lead to stronger priming effects. Possibly, there are multiple levels of action representations which differ in their level of abstraction and which are differentially accessed by pictures and words (for a discussion, see Helbig et al., 2006; Simmons & Barsalou, 2003).
Previous work indicated that words can elicit action representations within the first 150 msec of stimulus processing (e.g., Hoenig et al., 2008). We therefore assume that the absence of the early ERP action priming effect for verbal primes was due to the less specific action representations elicited by words compared with pictures rather than to differences in the time course of their activation. To further test this specificity hypothesis, the prime pictures could be presented in a rotated or mirrored fashion in future ERP studies. This manipulation alters the precise motor program of the potential motor interaction with the object (e.g., grasping an object at a handle pointing to the left or to the right), but leaves a more abstract action scheme intact (e.g., the action involves a power grip at a handle). If the early ERP effect depends on action similarity at the level of the precise motor program, it should be abolished with transformed prime pictures. In support of the assumption that a highly specific action similarity is required for eliciting the early ERP effect, a previous behavioral study from our group indicated reduced action priming for rotated prime objects (Helbig, Graf, and Kiefer, unpublished data).
Unlike some earlier ERP studies (Hoenig et al., 2008; Pulvermüller et al., 2005; Hauk & Pulvermüller, 2004), we did not observe direct evidence of access to action representations for verbal material. It should be noted, however, that our action priming paradigm focuses on action congruency effects, that is, on an implicit computation of the similarity of action affordances between two different objects. Thus, our paradigm is not suited to reveal access to action representations directly. However, we found indirect evidence for the activation of action representations with verbal primes: It is plausible to assume that action representations were activated even for word primes during the course of prime processing because action priming influenced later conceptual integration processes, as indexed by the N400 action priming effects and naming performance. The present results are therefore compatible with embodiment theories of conceptual representations, according to which conceptual word meaning is essentially grounded in the sensory and motor brain systems (Pulvermüller, 2005; Kiefer & Spitzer, 2001; Barsalou, 1999).
The later action priming effect on the N400 ERP component, which was obtained for both picture and word primes, is presumably generated in anterior temporal areas as suggested by the present source analysis. In line with our findings, intracranial ERP recordings (Nobre & McCarthy, 1995) as well as source analyses of scalp potentials (Kiefer, Schuch, et al., 2007) have shown that the N400 ERP component is generated in the anterior ventral temporal lobe (anterior fusiform gyrus, parahippocampal gyrus, perirhinal cortex). The neural processes within the anterior temporal lobe and the N400 as their electrophysiological correlate are assumed to subserve the integration of modality-specific conceptual features, which are represented in a distributed fashion within the sensory and motor systems, into a coherent concept (Kiefer, Sim, et al., 2007; Patterson et al., 2007). The action priming effect on the N400 ERP component therefore suggests that conceptual feature integration was easier for congruent than for incongruent prime-target pairs, regardless of the prime modality. In that respect, the present action priming effect resembles the classical semantic priming effect on the N400, which has been shown to occur independently of input modality (Kiefer, 2001; Ganis et al., 1996). It should be noted, however, that the congruent and incongruent conditions only differed with regard to action congruency, but were carefully matched for overall semantic similarity of the conceptual meaning of object pairs. Hence, the present results indicate that activation of congruent action representations by the prime object, whether it is presented pictorially or verbally, suffices to facilitate the establishment of a coherent conceptual representation which facilitates the recognition of a subsequent object (see also van Elk et al., 2010).
These relatively slow influences of action priming on semantic integration processes can presumably facilitate object recognition only if there is sufficient time for object processing as in the present study, in which the response to the target was delayed. This may explain why in an earlier behavioral study that afforded an immediate naming response to a briefly presented target object, action priming effects were only significant for pictorial primes, but not for verbal primes (Helbig et al., 2006).
In conclusion, we obtained action priming effects on ERPs at two distinct stages: At about 100 msec, ERPs over the central scalp were modulated only for pictorial primes, suggesting a rapid activation of action representations within the dorsal stream during the course of object recognition. This early effect depends on visuomotor processing of pictorially presented prime objects. The later N400 action priming effect, in contrast, was obtained for both picture and word primes. Presumably, this N400 effect originates from areas within anterior temporal cortex and reflects integration of semantic features into a coherent concept irrespective of prime modality. The present results indicate that action priming can influence object recognition through both fast and slow pathways (for dual-route models of action planning, see Milner & Dijkerman, 2001; Rumiati & Humphreys, 1998): It affects rapid visuomotor processes elicited by pictorial prime stimuli, but can also modulate comparably slow conceptual integration processes evoked by both picture and word primes.
This research was supported by grants from the German Research Foundation (DFG Ki 804/5-1) and from the European Social Foundation to M. K. and a grant from the Max Planck Society to M. G. We thank Hyeon-Woo Yi, Cornelia Müller, and Florian Diehl for their help with data acquisition.
Reprint requests should be sent to Markus Kiefer, Section for Cognitive Electrophysiology, Department of Psychiatry, University of Ulm, Leimgrubenweg 12, 89075 Ulm, Germany, or via e-mail: Markus.Kiefer@uni-ulm.de, Web: www.uni-ulm.de/~mkiefer/.
ERPs were also analyzed statistically in two time windows in a latency range between the P1 and the N400 that covered the N1 (116–278 msec) and the onset of the N400 (279–379 msec), respectively. In the N1 time window (116–278 msec), action congruency did not significantly modulate ERPs (all F < 1, all p > .52). In the time window covering the onset of the N400 (279–379 msec), the analysis yielded only nonsignificant tendencies of a main effect of action congruency at parietal [F(1, 19) = 2.72, p = .12] and central electrodes [F(1, 19) = 2.73, p = .11]. Interactions between congruency and prime modality were far from significant (all F < 1, all p > .75).
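The time-window analysis described above (mean amplitudes per latency window, compared across conditions within subjects) can be sketched as follows. This is a minimal illustration with simulated data, not the study's actual analysis pipeline; all variable names and the simulated voltages are assumptions for demonstration. For a two-level within-subject factor such as action congruency, the paired t test shown is equivalent to the reported F(1, 19) contrast, since F = t².

```python
import numpy as np
from scipy import stats

def mean_amplitude(epochs, times, t_start, t_end):
    """Mean voltage within a latency window (msec), one value per subject.

    epochs: (n_subjects, n_samples) array of per-subject ERP averages
            at one electrode site; times: (n_samples,) latencies in msec.
    """
    mask = (times >= t_start) & (times <= t_end)
    return epochs[:, mask].mean(axis=1)

# Simulated per-subject ERPs for 20 subjects (illustrative data only)
rng = np.random.default_rng(0)
times = np.linspace(0, 500, 501)               # 0-500 msec, 1-msec steps
congruent = rng.normal(0.0, 1.0, (20, 501))    # congruent condition
incongruent = rng.normal(0.0, 1.0, (20, 501))  # incongruent condition

# N1 window (116-278 msec): within-subject congruent vs. incongruent contrast
c = mean_amplitude(congruent, times, 116, 278)
i = mean_amplitude(incongruent, times, 116, 278)
t, p = stats.ttest_rel(c, i)
F = t ** 2  # for a two-level within-subject factor, F(1, n-1) equals t^2
print(f"F(1,19) = {F:.2f}, p = {p:.2f}")
```

In the full repeated-measures design, the same window means would additionally enter an ANOVA with prime modality and electrode site as further within-subject factors.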