Several theories of the mechanisms linking perception and action require that the links are bidirectional, but there is a lack of consensus on the effects that action has on perception. We investigated this by measuring visual event-related brain potentials to observed hand actions while participants prepared responses that were spatially compatible (e.g., both were on the left side of the body) or incompatible and action type compatible (e.g., both were finger taps) or incompatible, with observed actions. An early enhanced processing of spatially compatible stimuli was observed, which is likely due to spatial attention. This was followed by an attenuation of processing for both spatially and action type compatible stimuli, likely to be driven by efference copy signals that attenuate processing of predicted sensory consequences of actions. Attenuation was not response-modality specific; it was found for manual stimuli when participants prepared manual and vocal responses, in line with the hypothesis that action control is hierarchically organized. These results indicate that spatial attention and forward model prediction mechanisms have opposite, but temporally distinct, effects on perception. This hypothesis can explain the inconsistency of recent findings on action–perception links and thereby supports the view that sensorimotor links are bidirectional. Such effects of action on perception are likely to be crucial, not only for the control of our own actions but also in sociocultural interaction, allowing us to predict the reactions of others to our own actions.
Perception and action are often regarded as separate and distinct processes that are located at the input and output ends of cognitive systems: Perceptual mechanisms provide information about the external world, while action-related mechanisms are involved in the selection and execution of goal-directed behavior, and these two functions are performed independently. This traditional view of perception and action is no longer tenable. Recent results from cognitive neuroscience and experimental psychology have demonstrated close links between perception and action. Observing an action has a strong impact on the brain processes involved in action execution: The “mirror system,” in human ventral premotor and inferior parietal cortices, is responsive to both the perception and the execution of actions. It has been suggested that this system enables us to predict the outcomes of observed actions and thereby to understand the intentions of others (for a review, see Rizzolatti & Craighero, 2004).
Although most research on the mirror system has emphasized links from perception to action, there is reason to assume that these sensorimotor links are in fact bidirectional, that is, that they also mediate effects of action on perception. Bidirectional sensorimotor links are implied by the hypothesis that links between perception and action arise through associative learning (e.g., Brass & Heyes, 2005; Keysers & Perrett, 2004; Heyes, 2001) and by Bayesian theories, which assume that the same models are used to generate predicted sensory representations of executed actions and to derive motor commands from observed actions (e.g., Kilner, Friston, & Frith, 2007; Wolpert, Doya, & Kawato, 2003).
These theories postulate links from action to perception, but mixed results have been obtained in empirical studies examining the effects of action and action preparation on perception. Whereas numerous behavioral, electrophysiological, and fMRI studies have found that the processing of visual stimuli that are compatible with a currently prepared or executed action is facilitated relative to incompatible stimuli (e.g., Eimer, van Velzen, Gherri, & Press, 2006, 2007; Gherri, van Velzen, & Eimer, 2007; Eimer & van Velzen, 2006; Williams et al., 2006; Wohlschläger, 2000; Deubel & Schneider, 1996), other studies have found instead that the processing of compatible stimuli is attenuated when compared with incompatible stimuli (e.g., Stanley & Miall, 2007; Blakemore, Frith, & Wolpert, 1999; Blakemore, Wolpert, & Frith, 1998; Williams, Shenasa, & Chapman, 1998). For example, a recent fMRI study by Stanley and Miall (2007) required participants to observe a hand opening and closing while concurrently either opening and closing their hand in time with the observed stimulus display (compatible action) or rotating their wrist (incompatible action). Participants exhibited less activation in primary visual cortex in the compatible relative to the incompatible condition, suggesting an attenuation of processing of compatible stimuli. In contrast, Williams et al. (2006) found greater inferior and middle occipital gyrus activation when participants were imitating finger actions than when they were simply observing these actions, indicating that the visual processing of compatible stimuli is facilitated.
Although such fMRI activation patterns need to be interpreted with caution, unless they can be shown to be firmly linked to changes in behavioral performance, they highlight the fact that the mechanisms involved in action-induced perceptual modulations and the direction of such effects are still poorly understood (for a critical review, see Schütz-Bosbach & Prinz, 2007). In this context, it is useful to distinguish two different types of compatibility between stimuli and actions. Sensory events and actions can occur in the same or in different spatial locations and can accordingly be described as spatially compatible or incompatible. In addition, stimuli and actions can have the same spatio-temporal configuration and therefore represent the same action type (action compatible), such as opening a hand or curling an index finger, or they can represent different action types (action incompatible). Because spatial and action compatibility can vary independently, it is important to find out whether any action effect on perception is primarily determined by spatial or action compatibility or a combination of both.
There are two theories that make explicit, and contrasting, predictions about whether action preparation and execution will facilitate or attenuate the processing of compatible sensory events. First, the premotor theory of attention (Rizzolatti, Riggio, & Sheliga, 1994) states that preparing an action in a certain spatial location is linked to shifts of spatial attention to that location. This will result in facilitated sensory processing of stimuli at locations that are spatially compatible rather than incompatible with the current action. As the premotor theory is exclusively concerned with the locus of stimuli and actions in external space, it makes no predictions about any differences in processing according to action compatibility when spatial compatibility is held constant. Support for the premotor theory comes from behavioral experiments demonstrating superior performance for visual and auditory events located at saccade target locations (e.g., Rorden & Driver, 1999; Deubel & Schneider, 1996) and for visual events at the target location of a goal-directed manual movement (Schiegg, Deubel, & Schneider, 2003) as well as from ERP studies where irrelevant visual probe stimuli were presented on the left or right side while participants prepared left or right manual or saccadic responses (Eimer et al., 2006, 2007; Gherri et al., 2007; Eimer & van Velzen, 2006). In these ERP studies, the visual N1 component evoked by spatially compatible stimuli (e.g., a left visual stimulus when a left-hand action or a leftward saccade was prepared) was consistently enhanced relative to incompatible stimuli, indicative of facilitated perceptual processing when visual stimuli are spatially compatible with a prepared response.
In contrast with the premotor theory, forward models of action control (e.g., Wolpert, Ghahramani, & Jordan, 1995) predict that the sensory processing of compatible stimuli should be generally attenuated relative to incompatible stimuli. According to forward models, the generation of a motor command produces an efference copy of the predicted sensory consequences of that action. This efference copy is compared against incoming sensory information, such that sensory consequences of actions can be distinguished from sensory information from other sources, and these signals attenuate processing of predicted action consequences (e.g., Blakemore et al., 1998, 1999). Given that the predicted consequences of action should be both on the same side of space and constitute the same configural action type, forward models predict attenuated processing of spatially compatible relative to incompatible stimuli as well as of action compatible relative to incompatible stimuli. Support for these predictions comes from demonstrations that perception of tactile stimuli is attenuated when these are presented to effectors currently involved in action execution (Williams et al., 1998), that somatosensory cortical activation is reduced in response to self-generated rather than externally generated touch (Blakemore et al., 1998), and that primary visual cortical activation is attenuated during action execution when observing action compatible relative to incompatible stimuli (Stanley & Miall, 2007). Further support comes from a number of behavioral experiments that have demonstrated impairments in the detection and recognition of action-related stimuli that are compatible with a currently prepared or executed response (for a review, see Schütz-Bosbach & Prinz, 2007).
In summary, whereas forward models claim that action preparation will impair the sensory processing of spatially compatible and action compatible stimuli, the premotor theory predicts facilitated processing of spatially compatible stimuli. The aim of the current study was to test these conflicting hypotheses by measuring effects of action preparation on visual perception with visual ERPs. Because of their excellent temporal resolution, ERPs provide an ideal on-line measure to investigate whether and how action planning affects early stages of visual processing and to distinguish and dissociate the relative roles of spatial and action compatibility.
On each trial, participants prepared a specific response (a left-hand or right-hand lift or tap of the index finger), as indicated by centrally presented letter response cues that were flanked by a left and right hand (see Figure 1, left panel). The prepared response had to be executed after one of these hands moved (imperative stimulus), except in occasional catch trials where no such hand movement occurred. Regardless of which response was signaled by the cue, the visual imperative stimulus was equally likely to be a lift or tap of the index finger of the left or right hand. Thus, and critically, this visual stimulus could be spatially compatible (SC+) or incompatible (SC−) and action compatible (AC+) or incompatible (AC−) with the prepared response. For example, following a cue that instructed participants to prepare a tap with their left index finger, the imperative stimulus could be a left tap (SC+AC+), a left lift (SC+AC−), a right tap (SC−AC+), or a right lift (SC−AC−). Visual ERPs were measured in response to these different stimulus types and were compared as a function of both spatial (SC) and action (AC) compatibility. If the compatibility between prepared actions and visual events facilitates sensory processing, the amplitudes of early visual ERP components (P1, N1, or N2) obtained at posterior scalp sites over visual areas should be enhanced for compatible compared with incompatible stimuli. According to the premotor theory, such an effect should be found in particular for SC+ relative to SC− stimuli. In contrast, if the processing of visual events that are compatible with a prepared response is attenuated, as predicted by forward models, the opposite pattern of results should be found: Early visual ERP components at posterior electrode sites should be reduced in amplitude for compatible stimuli (SC+ and AC+) compared with incompatible stimuli (SC− and AC−).
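The 2 × 2 compatibility coding described above can be sketched as follows. This is an illustrative sketch only, not the authors' analysis code: the function name `code_compatibility` and the tuple representation of responses and stimuli are assumptions made for this example, and the labels use an ASCII hyphen in place of the minus sign.

```python
from itertools import product


def code_compatibility(prepared, stimulus):
    """Label a stimulus relative to the prepared response.

    Both arguments are hypothetical (side, action) tuples,
    e.g. ("left", "tap").
    """
    sc = "SC+" if prepared[0] == stimulus[0] else "SC-"  # same side of space?
    ac = "AC+" if prepared[1] == stimulus[1] else "AC-"  # same action type?
    return sc + ac


# Worked example from the text: a left index finger tap is prepared.
prepared = ("left", "tap")
for stimulus in product(("left", "right"), ("tap", "lift")):
    print(stimulus, code_compatibility(prepared, stimulus))
```

Because the two labels are computed independently, spatial and action compatibility vary orthogonally across the four stimulus types, as the design requires.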
In Experiment 2, we investigated whether any effects of AC on visual perception, as reflected by modulations of visual ERPs, are specific to a given response modality or are instead mediated by higher level representations. On half of all trials, cues instructed participants to prepare to lift or to tap the index finger of their right hand. A single hand was presented at the screen center, and movement of this hand (lift or tap) served as an imperative stimulus (see Figure 1, right panel). As in Experiment 1, the visual stimulus could be compatible (AC+) or incompatible (AC−) with the cued manual response. Critically, on the other half of trials, cues now instructed participants to prepare a vocal response (“up” or “down”) that was to be executed in response to the same visual imperative stimulus. On these trials, AC was no longer defined in terms of the spatio-temporal configuration of perceptual and motor codes but instead on a higher level (i.e., seen finger tap actions were defined as compatible with a “down” response, and a finger lift with an “up” response). If effects of AC on visual perception are specific to a given response modality, they should only be present on trials where manual responses were being prepared, but not on trials where a vocal response was prepared instead. In contrast, if such effects depend on a higher level of representation, they should be similar for both manual and vocal response trials.
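The higher level definition of AC used in Experiment 2 can be illustrated with a small lookup, in which both manual and vocal responses are mapped onto a shared direction code before being compared with the observed finger movement. The dictionary and function names are hypothetical and serve only to make the definition concrete.

```python
# Hypothetical mapping from (response modality, response) to a shared
# higher level direction code.
RESPONSE_DIRECTION = {
    ("manual", "lift"): "up",
    ("manual", "tap"): "down",
    ("vocal", "up"): "up",
    ("vocal", "down"): "down",
}

# Observed finger movements mapped onto the same direction code.
STIMULUS_DIRECTION = {"lift": "up", "tap": "down"}


def action_compatible(modality, response, stimulus):
    """Return True for AC+ trials and False for AC- trials."""
    return RESPONSE_DIRECTION[(modality, response)] == STIMULUS_DIRECTION[stimulus]


print(action_compatible("vocal", "down", "tap"))   # vocal "down" with a seen tap
print(action_compatible("manual", "lift", "tap"))  # prepared lift with a seen tap
```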
At first glance, forward models seem to predict that effects of action preparation on perception will be response-modality specific. If these effects are due to the selective attenuation of anticipated sensory consequences of actions, mediated by efference copy mechanisms, they may only be found when perception and action both involve the same effector system. However, according to recent versions of forward models (e.g., Kilner et al., 2007; Wolpert et al., 2003), motor control is arranged hierarchically, with higher level representations of an intended action determining the weight of lower level representations according to prior learning and the current behavioral context. In a hierarchy of this kind, one would expect preparation of an action to activate both lower level effector-specific representations and higher level action representations. For example, preparation of a vocal response “up,” specifying the intended vocal output, might activate a higher level representation (“upward”), encompassing the range of upward movements primed by the task context. If this is the case, Experiment 2 might find similar effects of action preparation on visual perception for both manual and vocal responses.
In summary, the aim of these two experiments was to use electrophysiological markers of visual processing efficiency to identify and dissociate action-induced perceptual facilitation and attenuation effects. We found consistent perceptual attenuation of action-type compatible stimuli, in line with the forward model account. However, visual processing of spatially compatible stimuli was first facilitated and then attenuated, suggesting that in this case perception was affected by both premotor attention and forward model mechanisms.
Twelve paid healthy participants took part in this study (6 men, mean age = 23 years, range = 19–30 years). All were right-handed, had normal or corrected-to-normal vision, were naive with respect to the purpose of the experiment, and gave informed consent. The experiment was performed with the approval of the ethics committee of the School of Psychology, Birkbeck College, and performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki.
All stimuli were presented on a computer screen (100 Hz, 500 mm, 96 DPI), in color on a gray background (70% black). Viewing was unrestrained at a distance of 50 cm. At the onset of a trial, participants observed an approximately life-sized left and right hand from a first person perspective, with all fingers grouped together and a fixation cross between the hands (see Figure 1, left panel). The response cue consisted of replacing the fixation cross with two letters, which indicated the response to be prepared: An “LU” cue indicated that participants should prepare to move their left index finger upward with respect to the back of their hand; an “LD” cue indicated that they should prepare to move their left index finger downward; an “RU” or “RD” cue indicated that they should prepare these actions with their right hand. The visual imperative stimulus consisted of the index finger on either the left or the right hand on the screen moving upward or downward with respect to the back of the hand. Therefore, a spatially compatible trial was also effector compatible (left hand stimulus movement when preparing a left hand response), and a spatially incompatible trial was also effector incompatible (right hand stimulus movement when preparing a left hand response). One pair of male and one pair of female hands were used as visual stimuli, which were presented with equal probability and in random order across trials. The maximal width of all stimuli was 37.4° of visual angle, and the maximal height varied between 15.6° and 20.2° of visual angle.
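For readers who wish to relate the reported visual angles to physical size on the screen, the standard conversion can be sketched as below. The function name is an assumption, the display is assumed to be centred on the line of sight, and the result is only approximate because viewing was unrestrained.

```python
import math


def extent_from_visual_angle(angle_deg, distance_cm):
    """Physical extent (cm) subtending angle_deg at distance_cm.

    Assumes the extent is centred on, and perpendicular to, the line of sight.
    """
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# Maximal stimulus width reported in the text, at the nominal 50 cm distance.
width_cm = extent_from_visual_angle(37.4, 50.0)
print(round(width_cm, 1))
```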
Participants were tested individually in a dimly lit room. Their left and right arms were positioned on a tabletop 8 cm to the left and right of the body midline, respectively, supported by foam armrests, with stimulus hands presented 3 cm to the left and right of the body midline. The horizontal distance between each of the participants' hands and the corresponding hand on the computer screen was therefore 5 cm. Participants' hands were rotated such that upward finger actions with respect to the back of the hand moved away from the body midline and downward finger actions moved toward the body midline. Because stimulus actions were presented in the vertical plane (up–down), whereas response actions moved away from or toward the body midline, response actions were orthogonal to stimulus actions. This ensured that any effects of the manipulation of AC could not be attributed to SC. Once hands were positioned, a black cloth was attached to the tabletop and tied around the participant's neck to prevent vision of the hands.
The visual stimulus sequence is shown in Figure 1 (left panel). All trials began with presentation of the left and right stimulus hands in neutral positions, with a fixation cross in the middle. The fixation cross was replaced 1000 msec later by a cue of 200-msec duration. The cue was followed after a variable interval by the imperative stimulus (1000-msec duration), except on catch trials, where no imperative stimulus was presented. The SOA between cues and imperative stimuli varied randomly between 700 and 1200 msec in 100-msec steps. After the imperative stimulus, the screen went blank for 1000 msec before the next trial began. In catch trials, the stimulus hands remained in a neutral position for 2400 msec before the screen went blank. Participants were instructed to prepare the response indicated by the cue but to wait until presentation of the imperative stimulus before executing this response. They were instructed to refrain from moving their hand in catch trials, where both stimulus hands stayed in neutral positions. Responses were recorded using infrared detectors positioned above the participants' hands, 1.25 cm to the left and right of each of their index fingers.
Each block contained 216 trials (192 response trials and 24 catch trials, in random order). For response trials, each combination of response hand (left or right), response action type (up or down), stimulus hand (left or right), stimulus action type (up or down), stimulus hand gender (male or female), and SOA (700, 800, 900, 1000, 1100, or 1200 msec) was presented once per block. For catch trials, each combination of response hand, response action type, and stimulus hand gender was presented three times per block. Participants completed three blocks. They were permitted to rest between blocks and also after every 54 trials within a block. Before testing commenced, participants completed 54 practice trials to learn the cue–response relationships.
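The block composition described above can be checked by enumerating the factorial design. The following is a minimal sketch under the stated design, not the original presentation script; the field names and the `build_block` helper are hypothetical.

```python
import random
from itertools import product

SOAS_MS = (700, 800, 900, 1000, 1100, 1200)
HANDS = ("left", "right")
ACTIONS = ("up", "down")
GENDERS = ("male", "female")


def build_block(seed=None):
    """Return one randomized block: 192 response trials plus 24 catch trials."""
    response_trials = [
        {"type": "response", "resp_hand": rh, "resp_action": ra,
         "stim_hand": sh, "stim_action": sa, "stim_gender": g, "soa_ms": soa}
        for rh, ra, sh, sa, g, soa
        in product(HANDS, ACTIONS, HANDS, ACTIONS, GENDERS, SOAS_MS)
    ]
    # Each cue x stimulus-gender combination appears three times as a catch trial.
    catch_trials = [
        {"type": "catch", "resp_hand": rh, "resp_action": ra, "stim_gender": g}
        for rh, ra, g in product(HANDS, ACTIONS, GENDERS)
        for _ in range(3)
    ]
    block = response_trials + catch_trials
    random.Random(seed).shuffle(block)
    return block


block = build_block(seed=0)
print(len(block))
```

The counts follow directly from the factors: 2 × 2 × 2 × 2 × 2 × 6 = 192 response trials and 2 × 2 × 2 × 3 = 24 catch trials.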
EEG Recording and Data Analysis
EEG was recorded with a band-pass filter of 0 to 40 Hz and a sampling rate of 500 Hz from Ag-AgCl electrodes mounted in an elastic cap according to the extended 10–20 system, at scalp sites Fpz, F7, F3, Fz, F4, F8, FC5, FC6, T7, C3, Cz, C4, T8, CP5, CP6, P7, P3, Pz, P4, P8, PO7, PO8, and Oz. Horizontal EOG (HEOG) was recorded bipolarly from the outer canthi of both eyes. All electrodes were referenced to the left earlobe and rereferenced off-line to averaged left and right earlobes. Electrode impedance was kept below 8 kΩ, and the impedances of the earlobe electrodes were kept as equal as possible.
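Off-line rereferencing to averaged earlobes amounts to subtracting half of the right earlobe signal from every channel, given that all channels were recorded against the left earlobe. The sketch below assumes that the right earlobe was stored as a separate channel referenced to the left earlobe; the function name is hypothetical.

```python
def rereference_to_averaged_earlobes(channel_uv, right_earlobe_uv):
    """Rereference one channel from the left earlobe to the earlobe average.

    Both inputs are sample lists (in microvolts) recorded against the left
    earlobe. Since (ch - A1) - (A2 - A1) / 2 == ch - (A1 + A2) / 2, subtracting
    half of the right earlobe channel yields the averaged-earlobe reference.
    """
    return [v - 0.5 * r for v, r in zip(channel_uv, right_earlobe_uv)]


print(rereference_to_averaged_earlobes([10.0, 4.0], [2.0, -2.0]))
```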
EEG and EOG were epoched off-line into 600-msec periods, starting 100 msec before visual imperative stimulus onset and ending 500 msec after onset. Trials with eyeblinks (Fpz exceeding ±60 μV), small horizontal eye movements (HEOG exceeding ±30 μV), or other artifacts (a voltage exceeding ±80 μV at any electrode) in the 500-msec interval following visual stimulus onset were excluded from EEG data analysis. Averaged HEOG waveforms obtained for each participant and task condition in this interval in response to left versus right stimuli were scored for systematic deviations of eye position, indicating residual tendencies to move the eyes toward the visual movement stimulus. Residual HEOG deflections did not exceed ±5 μV at any point during this interval, thus confirming that participants were not moving their eyes toward the visual stimuli.
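The rejection criteria above can be expressed as simple amplitude thresholds applied to the poststimulus part of each epoch. This is a sketch under stated assumptions rather than the pipeline actually used: epochs are assumed to be dictionaries mapping channel names to sample lists in microvolts, sampled at 500 Hz and starting 100 msec before stimulus onset, and the function names are hypothetical.

```python
SAMPLING_RATE_HZ = 500
PRESTIM_MS = 100


def poststimulus(samples):
    """Drop the prestimulus baseline samples (first 100 msec of the epoch)."""
    start = PRESTIM_MS * SAMPLING_RATE_HZ // 1000
    return samples[start:]


def exceeds(samples, limit_uv):
    """True if any poststimulus sample exceeds +/- limit_uv."""
    return any(abs(v) > limit_uv for v in poststimulus(samples))


def reject_epoch(epoch):
    """Return the reason for rejecting an epoch, or None if it is kept."""
    if exceeds(epoch["Fpz"], 60.0):
        return "eyeblink"
    if exceeds(epoch["HEOG"], 30.0):
        return "horizontal eye movement"
    if any(exceeds(samples, 80.0) for samples in epoch.values()):
        return "other artifact"
    return None


# Synthetic flat epoch (300 samples = 600 msec) with one large Fpz deflection.
epoch = {name: [0.0] * 300 for name in ("Fpz", "HEOG", "PO7", "Oz", "PO8")}
epoch["Fpz"][120] = 75.0
print(reject_epoch(epoch))
```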
The EEG obtained in the 500-msec interval following the onset of the imperative visual stimulus for each participant was averaged relative to a 100-msec prestimulus baseline for combinations of SC (SC+ vs. SC−) and AC (AC+ vs. AC−). ERP mean amplitudes were computed within measurement windows centered on the latency of early visual P1 (80–110 msec), N1 (160–200 msec), and N2 (210–290 msec) components as well as within the P3 time range (330–430 msec). Statistical analyses were conducted over posterior electrodes PO7, Oz, and PO8, where early visual ERP components are maximal.
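Mean amplitudes within the measurement windows listed above can be computed as in the following sketch. It assumes an averaged waveform sampled at 500 Hz from −100 to 500 msec, treats the window end points as inclusive, and uses hypothetical function names; the waveform in the example is synthetic.

```python
SAMPLING_RATE_HZ = 500
EPOCH_START_MS = -100

# Measurement windows for Experiment 1, in msec after stimulus onset.
WINDOWS_MS = {"P1": (80, 110), "N1": (160, 200), "N2": (210, 290), "P3": (330, 430)}


def to_index(t_ms):
    """Convert a latency relative to stimulus onset into a sample index."""
    return (t_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ // 1000


def mean_amplitude(erp_uv, window_ms):
    """Mean amplitude in window_ms relative to the 100-msec prestimulus baseline."""
    baseline = erp_uv[:to_index(0)]
    offset = sum(baseline) / len(baseline)
    start_ms, end_ms = window_ms
    segment = erp_uv[to_index(start_ms):to_index(end_ms) + 1]
    return sum(segment) / len(segment) - offset


# Synthetic waveform: 1 microvolt before onset, 3 microvolts afterwards.
erp = [1.0] * 50 + [3.0] * 251
print({name: mean_amplitude(erp, window) for name, window in WINDOWS_MS.items()})
```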
Fourteen new paid healthy participants took part in this study. All were right-handed, had normal or corrected-to-normal vision, were naive with respect to the purpose of the experiment, and gave informed consent. Two participants were excluded because of excessive alpha activity in synchrony with the VEPs. Thus, 12 participants remained in the sample (5 men, mean age = 24 years, range = 18–32 years). The experiment was performed with the approval of the ethics committee of the School of Psychology, Birkbeck College, and performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki.
Stimuli and Procedure
The right hand stimuli used in Experiment 1 were again used but were now presented in the center of the computer screen with a fixation cross halfway along the index finger (see Figure 1, right panel). Cues also differed from Experiment 1: An “FU” cue indicated that participants should prepare to move their right index finger upward with respect to the back of their hand; an “FD” cue indicated that they should prepare to move their right index finger downward; a “VU” cue indicated that they should prepare to say “up” and a “VD” cue indicated that they should prepare to say “down.” Only the participant's right arm was now supported by a foam armrest; their left arm simply rested on the tabletop next to their right arm. Each block presented, in random order, two trials of each combination of response modality (manual or vocal response), response type (up and down), stimulus type (up and down), stimulus hand gender (male and female), SOA (700, 800, 900, 1000, 1100, and 1200 msec), and 24 catch trials (3 of each cue type with a male hand, and 3 of each cue type with a female hand), totalling 216 trials in each block.
EEG Recording and Data Analysis
EEG and EOG were epoched off-line into 400-msec periods, starting 100 msec before visual imperative stimulus onset and ending 300 msec after onset. Epochs needed to be shorter than in Experiment 1 because half of the response trials required participants to produce vocal responses, which generated EEG artifacts beyond 300 msec poststimulus because of muscle activity associated with head and mouth movements. Trials with eyeblinks (Fpz exceeding ±60 μV), small horizontal eye movements (HEOG exceeding ±30 μV), or other artifacts (a voltage exceeding ±80 μV at any electrode) in the 300-msec interval following visual stimulus onset were excluded from EEG data analysis. The EEG obtained in the 300-msec interval following the onset of the imperative visual stimulus for each participant was averaged relative to a 100-msec prestimulus baseline for combinations of AC (AC+ vs. AC−) and response modality (manual vs. vocal). Similarly to Experiment 1, ERP mean amplitudes were computed within measurement windows centered on the latency of early visual P1 (80–120 msec), N1 (140–180 msec), and N2 (210–290 msec) components and analyzed at PO7, Oz, and PO8. As epochs could only be computed up to 300 msec poststimulus, no analysis of P3 amplitudes could be conducted.
Participants initiated movement in 3.9% of catch trials. These data were not analyzed further. Participants failed to initiate movement in 1.8% of imperative trials. Incorrect responses (2.0%) and RTs shorter than 100 msec (0.4%) were excluded from the RT analysis. Figure 2 (top panels) shows RTs (left) and error rates (right) for each combination of SC and AC. RTs were subjected to ANOVA with SC (SC+ and SC−) and AC (AC+ and AC−) as within-subject factors. A main effect of SC, F(1, 11) = 21.7, p = .001, reflected faster SC+ responses than SC− responses. A main effect of AC, F(1, 11) = 21.8, p = .001, demonstrated that AC+ responses were faster than AC− responses. The interaction between SC and AC was also significant, F(1, 11) = 9.5, p = .01, as the effect of AC was more pronounced on SC+ than on SC− trials (see Figure 2). The analysis of error rates (trials on which no response or the wrong response was executed) indicated a main effect of SC, F(1, 11) = 7.1, p < .03, as there were fewer errors on SC+ than on SC− trials. The main effect of AC, F(1, 11) = 1.9, p = .2, and the interaction between SC and AC, F(1, 11) = 2.9, p = .1, did not reach significance.
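Each effect in a 2 × 2 within-subject ANOVA has a single degree of freedom, so its F(1, n − 1) value equals the squared paired t statistic of the corresponding per-participant contrast. The sketch below illustrates this relationship. It is not the analysis software used in the study, the function names are hypothetical, and the cell means are made-up numbers rather than data from the experiment.

```python
from statistics import mean, variance


def one_df_within_f(contrast_scores):
    """F(1, n - 1) for a one-df within-subject contrast (squared paired t)."""
    n = len(contrast_scores)
    return n * mean(contrast_scores) ** 2 / variance(contrast_scores)


def contrasts(cells):
    """Per-participant contrast scores for SC, AC, and their interaction.

    cells: list of dicts with keys 'SC+AC+', 'SC+AC-', 'SC-AC+', 'SC-AC-'.
    """
    sc = [(c["SC+AC+"] + c["SC+AC-"]) / 2 - (c["SC-AC+"] + c["SC-AC-"]) / 2
          for c in cells]
    ac = [(c["SC+AC+"] + c["SC-AC+"]) / 2 - (c["SC+AC-"] + c["SC-AC-"]) / 2
          for c in cells]
    inter = [(c["SC+AC+"] - c["SC+AC-"]) - (c["SC-AC+"] - c["SC-AC-"])
             for c in cells]
    return {"SC": sc, "AC": ac, "SC x AC": inter}


# Made-up mean RTs (msec) for three hypothetical participants.
cells = [
    {"SC+AC+": 400, "SC+AC-": 430, "SC-AC+": 440, "SC-AC-": 450},
    {"SC+AC+": 380, "SC+AC-": 420, "SC-AC+": 430, "SC-AC-": 445},
    {"SC+AC+": 410, "SC+AC-": 435, "SC-AC+": 450, "SC-AC-": 470},
]
for effect, scores in contrasts(cells).items():
    print(effect, one_df_within_f(scores))
```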
Visual Event-related Brain Potentials
P1 and N1
Figure 3 (top two panels) shows ERPs elicited in response to visual imperative stimuli at posterior electrodes PO7, Oz, and PO8. ERPs are displayed separately for trials where these visual stimuli were SC+ or SC− (collapsed across AC+ and AC− trials) and AC+ or AC− (collapsed across SC+ and SC− trials) with the prepared manual response. Although SC affected early visual P1 and N1 components, with enhanced amplitudes for SC+ relative to SC− trials, no such early ERP modulation is visible for AC. These observations were confirmed by statistical analyses, which revealed main effects of SC on the P1 amplitude, F(1, 11) = 5.0, p < .05, as well as on the N1 amplitude, F(1, 11) = 5.6, p < .05, thus confirming that these early components were reliably enhanced when visual stimuli were SC+ with the prepared manual response. In contrast, there were no main effects of AC or interactions between AC and SC for either P1 or N1 amplitudes (all F < 1.7).
N2
As can be seen in Figure 3, ERP amplitudes in the N2 time range (210–290 msec poststimulus) were generally more negative for incompatible than compatible trials. In marked contrast to the effects observed for the P1 and N1, this effect appeared for both SC and AC. This was confirmed by statistical analyses, which demonstrated main effects of SC, F(1, 11) = 5.3, p < .05, and AC, F(1, 11) = 5.3, p < .05, on ERP mean amplitudes in the N2 time range, reflecting enhanced negativities for SC− relative to SC+ trials and for AC− relative to AC+ trials. There was no interaction between SC and AC, F(1, 11) < 1. The scalp topography of this effect is illustrated in the maps of Figure 3, which show the distribution of ERP differences in the N2 time range, between SC+ and SC− trials (left map), and between AC+ and AC− trials (right map). These maps demonstrate that compatibility effects on N2 amplitudes have a distinct posterior distribution.
P3
Figure 3 (bottom panels) shows ERPs for all four combinations of SC and AC up to 500 msec poststimulus to illustrate compatibility effects on the later P3 component. P3 amplitudes were smallest for SC+AC+ trials, intermediate for SC+AC− trials, and largest for SC− trials, regardless of AC. This was reflected by main effects of SC, F(1, 11) = 12.9, p < .005, and AC, F(1, 11) = 8.0, p < .02, and an interaction between SC and AC, F(1, 11) = 5.7, p < .04. Simple effects analyses indicated that on SC+ trials, the P3 was larger for AC− relative to AC+ trials, F(1, 11) = 24.7, p < .001, whereas on SC− trials, there was no effect of AC, F(1, 11) < 1.
Participants initiated movement in 3.0% of all catch trials. These data were not analyzed further. Participants failed to initiate movement in 2.6% of imperative trials, made errors on 0.9% of trials, and had RTs shorter than 100 msec on 0.1% of trials. The RTs and errors were each subjected to ANOVA in which AC (AC+ and AC−) and response modality (manual and vocal) were within-subject factors. Figure 2 (bottom panels) shows RTs (left) and error rates (right) for AC+ and AC− trials, separately for trials where manual or vocal responses were prepared. There was a main effect of response modality on RTs, F(1, 11) = 12.2, p = .005; manual responses were executed faster than vocal responses. There was also a main effect of AC, F(1, 11) = 36.3, p < .001, with faster responses in AC+ relative to AC− trials. Critically, there was no indication of any interaction between AC and response modality, F(1, 11) < 1, suggesting that compatibility effects were present regardless of whether manual or vocal responses were being executed. This was confirmed by analyses conducted separately for RTs on trials with manual and vocal responses, which revealed significant effects of AC for both response modalities, F(1, 11) = 23.2, p = .001, for manual response trials; F(1, 11) = 27.0, p < .001, for vocal response trials. There were no significant effects on error rates: AC, F(1, 11) = 2.7, p = .1; response modality, F(1, 11) = 1.9, p = .2; interaction, F(1, 11) < 1.
Visual Event-related Brain Potentials
P1 and N1
Figure 4 (top two panels) shows ERPs elicited in response to visual AC+ and AC− stimuli at posterior electrodes PO7, Oz, and PO8, shown separately for trials where manual or vocal responses were being prepared. As in Experiment 1, there was no effect of AC on the P1 component. However, N1 amplitudes appear to be enhanced for AC− relative to AC+ trials, both for manual and for vocal response trials. This was confirmed by statistical analyses with AC (AC+ and AC−) and response modality (manual and vocal) as within-subject factors. For P1 amplitudes, no main effects or interactions were found, all F(1, 11) < 1.8, all p > .2. In contrast, there was a significant main effect of AC on N1 amplitudes, F(1, 11) = 5.5, p < .04, confirming that this component was reliably enhanced on AC− relative to AC+ trials. Critically, there was no interaction between AC and response modality, F(1, 11) < 1, suggesting that this N1 modulation was present regardless of whether manual or vocal responses were being prepared. This was confirmed by one-tailed tests, which showed enhanced N1 amplitudes for AC− relative to AC+ trials both for manual response trials, F(1, 11) = 3.3, p < .05, and for vocal response trials, F(1, 11) = 3.4, p < .05.
N2
As can be seen from Figure 4, ERP amplitudes in the N2 time range were more negative for AC− than for AC+ trials, confirming the findings of Experiment 1. Critically, these N2 modulations were present not only for trials with manual responses but also for vocal response trials. Statistical analyses revealed a significant main effect of AC on N2 mean amplitude, F(1, 11) = 14.5, p < .005, but no indication of any interaction between response modality and AC, F(1, 11) < 1, thus confirming that these N2 modulations were present regardless of whether participants prepared manual or vocal responses. Follow-up one-tailed tests showed enhanced N2 amplitudes for AC− relative to AC+ trials both for manual response trials, F(1, 11) = 10.6, p < .005, and for vocal response trials, F(1, 11) = 4.7, p < .05. The scalp topographies of these AC effects are illustrated in the maps of Figure 4, which show the distribution of ERP differences between AC+ and AC− trials in the N2 time range, shown separately for manual and vocal response trials. These maps demonstrate that regardless of response modality, effects of AC on N2 amplitudes have a distinct posterior topography, analogous to the pattern of results observed in Experiment 1.
The present study investigated the influence of action preparation on visual perception by measuring ERPs in response to visual action stimuli under conditions where their spatial or action compatibility with a prepared response was manipulated. Our aim was to determine whether response preparation facilitates or attenuates the visual processing of compatible stimuli. Different theories have made contrasting predictions with respect to the direction of action preparation effects on perception. Although the premotor theory of attention (e.g., Rizzolatti et al., 1994) predicts enhanced processing of spatially compatible relative to incompatible stimuli, forward models (e.g., Wolpert et al., 2003) predict that compatible stimuli should be processed less than incompatible stimuli for both spatial and action compatibility.
Facilitatory or Attenuating Effects of Action Preparation on Visual Perception?
In Experiment 1, early visual ERP components (P1 and N1) were enhanced on trials where visual action stimuli were spatially compatible with a cued manual response (e.g., when participants prepared a response with their left hand and the visual imperative stimulus was a movement of the left hand) relative to spatially incompatible trials, suggesting a facilitation of early stages of visual processing for spatially compatible stimuli. These P1 and N1 modulations were independent of whether the visual stimulus was action compatible (e.g., a finger tap when participants prepared a tap) or incompatible. Such facilitatory effects of spatial compatibility are consistent with predictions of the premotor theory of attention, which postulates that preparing an action will draw attention to its location in external space, resulting in improved perceptual processing of stimuli at that location. This finding also confirms the results of previous ERP experiments, which have found enhanced N1 amplitudes for stimuli presented during response preparation at cued response locations (Eimer et al., 2006, 2007; Gherri et al., 2007; Eimer & van Velzen, 2006). The present results show for the first time that such facilitatory spatial compatibility effects can be observed in the P1 latency range (80–110 msec after stimulus onset). It should be noted that, in contrast with previous ERP experiments that investigated the effects of spatial compatibility between action and perception, where visual stimuli were always presented in close spatial proximity to the response hands, left and right hand stimuli in the present study were presented centrally on a computer screen, while response hands were located at a distance of about 5 cm to the left and right of these visual stimuli. The observation that spatial compatibility still had systematic effects on visual perception, as reflected by P1 and N1 amplitude modulations, suggests that the focus of attention when preparing an action is considerably wider than the immediate area surrounding the response effector.
Although effects of spatial compatibility on visual P1 and N1 components in Experiment 1 were maximal at posterior electrodes, it could be argued that these effects might not be entirely perceptual but are in part linked to lateralized readiness potentials (see Eimer & Coles, 2003) that are generated during manual response preparation and execution over motor areas and may affect more posterior ERPs via volume conduction. However, the fact that the ERP effects of spatial compatibility on visual P1 and N1 components were of opposite polarity (an enhanced positivity for SC+ trials followed by an enhanced negativity), and that they were not visible at central electrodes C3 and C4 where lateralized readiness potentials are strongest (see Note 1), comprehensively rules out a motor contribution to these effects.
Although these early P1 and N1 modulations observed in Experiment 1 suggest that action planning facilitates the processing of (spatially) compatible visual stimuli, longer latency ERPs revealed evidence for a subsequent attenuation of response-compatible stimuli. In the N2 time range (210–290 msec poststimulus), ERPs were more negative for spatially incompatible relative to compatible stimuli and also for action incompatible relative to compatible stimuli. This attenuation of N2 amplitudes for compatible stimuli had a distinct posterior scalp distribution, which is consistent with the assumption that it reflects a modulation of visual–perceptual processing. Importantly, this effect was confirmed in Experiment 2, which focussed on effects of action compatibility in the absence of a manipulation of spatial compatibility. Again, N2 amplitudes were attenuated for action compatible compared with incompatible stimuli, and this attenuation was also localized over posterior scalp sites. The reduction of posterior N2 amplitudes in response to spatially compatible and action compatible stimuli is a novel finding and strongly suggests that the processing of response-compatible visual stimuli is attenuated, as predicted by forward models. In fact, Experiment 2 provided additional evidence for such an attenuation. Here, N1 amplitudes were reliably smaller for action compatible relative to incompatible stimuli, indicating that relatively early stages of visual–perceptual processing were attenuated for visual stimuli that were action compatible with a prepared response. For Experiment 1, a similar tendency for N1 amplitudes to be smaller on action compatible than on incompatible trials can be seen (Figure 3), but this difference was not statistically significant. The fact that ERP evidence for the attenuation of visual processing of action compatible stimuli emerged earlier in Experiment 2 than in Experiment 1 is likely due to the increased salience of action compatibility. In Experiment 2, single hands were always presented at fixation and were not accompanied by another static hand stimulus on the other side (see Figure 1). The attenuation of N1 and N2 amplitudes in response to action compatible stimuli observed here is also consistent with a previous fMRI study that has found greater activation in primary visual cortex when observing action incompatible rather than compatible actions (Stanley & Miall, 2007).
We have defined AC in an effector-independent way. For example, lift responses with the right hand were classified as action compatible with stimuli showing lift responses with the right or left hand. Some features of the results might suggest that, in contrast with this classification, the action–perception matching system treats as action compatible only movements that involve the same configural body action (lift versus tap) and are made with the same effector. In particular, the RT and P3 data from Experiment 1 indicated stronger effects of action compatibility in spatially compatible than spatially incompatible trials. However, although there was a similar trend for the action compatibility effects on N2 amplitudes in Experiment 1, with a numerically larger effect in spatially compatible (mean = 0.8 μV, SEM = 0.4 μV) than incompatible (mean = 0.5 μV, SEM = 0.4 μV) trials, there was no statistical evidence for an interaction between spatial compatibility and action compatibility in the N2 time range (F < 1).
Mechanisms Underlying Facilitation and Attenuation of Visual Perception
With respect to the central question of whether compatibility facilitates or attenuates visual processing, our ERP results suggest that both are the case but that facilitation and attenuation have different time courses. Spatial (but not action) compatibility facilitates perceptual processing at short latencies, as predicted by the premotor theory of attention, but this effect is then followed by an attenuated processing of both spatially compatible and action compatible stimuli, as postulated by forward models. However, the temporal sequence of facilitation followed by attenuation observed for spatially compatible stimuli in Experiment 1 was not predicted by either of the two theories (see Note 2).
The current findings can therefore most comprehensively be explained by combining assumptions from the premotor theory and forward models. As hypothesized by the premotor theory of attention, preparing an action draws attention to that location in space, enhancing sensory processing of spatially compatible stimuli. However, preparing an action will also lead to formation of an efference copy of the predicted consequences of that action, and these predicted consequences will be processed less than other sensory information. In the case of action compatibility, the prediction that follows from this compound hypothesis is straightforward: There will be attenuated processing of action compatible relative to incompatible stimuli, and the current ERP results provide new evidence for this (note that the premotor principle relates to spatial compatibility only). However, in the case of spatial compatibility, attentional mechanisms and forward model mechanisms will produce opposite effects on perceptual processing. The results of the present study suggest that attentional effects dominate visual processing at short latencies, whereas efference copy mechanisms dominate at longer latencies. It is plausible to assume that attentional mechanisms are already fully operational during the response preparation interval and well before the presentation of the imperative visual movement stimulus (as suggested by previous ERP evidence, see Eimer et al., 2006, 2007; Gherri et al., 2007; Eimer & van Velzen, 2006), whereas an efference copy is generated later (see also Williams et al., 1998, who find evidence for the suppression of tactile processing 120 msec before response execution). If this was the case, facilitatory attentional effects should precede inhibitory effects of efference copy mechanisms on visual processing, as was observed in the current study. Therefore, facilitation should dominate for stimuli presented early during action preparation, whereas attenuation effects should be most pronounced when they are presented at later stages and during action execution.
Finally, it is important to underline the fact that the present study found facilitatory effects of action planning on visual perception only for spatial compatibility, but not for action compatibility. This suggests that previous studies which have confounded spatial and action compatibility and reported facilitatory effects of action on perception and visual processing are likely to have detected effects of spatial compatibility (e.g., Williams et al., 2006; Schübo, Prinz, & Aschersleben, 2004). To give an example of how spatial and action compatibility may be confounded, consider the study by Williams et al. (2006). Participants were required to imitate or to observe an index or middle finger lifting action. The enhancement of occipital activation when imitating relative to observing may have been generated through activation of the visual action-type representations associated with the executed actions (e.g., index finger lifting representations). Alternatively, and more plausibly, because index finger actions were located on the left and the middle finger actions were on the right, this effect may simply reflect differential activation of left and right visual–spatial codes.
Are Action Effects on Perception Response-modality Specific?
Experiment 2 also investigated whether any attenuation of visual processing for action compatible stimuli during manual response preparation, as reflected by reduced visual ERP component amplitudes, would also be present when participants were preparing vocal responses (“up” or “down”) instead. Interestingly, action compatibility had very similar effects on visual N1 and N2 components during manual and vocal response preparation. These components were attenuated in amplitude for action compatible stimuli (e.g., a finger lift on the screen) relative to action incompatible stimuli (e.g., a finger tap on the screen), regardless of whether participants were preparing a manual response (to lift their index finger) or a vocal response (to say “up”). This finding suggests that attenuation effects of action preparation on perception are not response-modality specific.
This finding that preparing to say “up” attenuates visual processing of fingers moving upward may initially seem inconsistent with a forward model view that attributes perceptual attenuation effects to a match between the current sensory input and the predicted sensory consequences of an action. However, recent accounts of forward models explicitly acknowledge the hierarchical organization of motor control (e.g., Kilner et al., 2007; Wolpert et al., 2003). If motor control is hierarchically organized, one would expect preparation of a vocal response “up” to activate a higher level representation (“upward”) encompassing the range of upward movements primed by the task context and activation of this higher level representation to be propagated to multiple, lower level representations of the specific upward responses. Given the perceptual context of Experiment 2, where visual finger movement stimuli were presented on every trial, regardless of whether participants were cued to respond manually or vocally, it is likely that this range of lower level motor representations would include these finger lifting actions, thus resulting in attenuation of visual processing for action compatible stimuli on manual as well as vocal response trials.
Are Action Compatibility Effects on Perception Action Specific?
At a more general level, the finding that very similar ERP effects of action compatibility were observed on trials where participants prepared manual or vocal responses might call into question our hypothesis that these effects reflect perceptual effects of action preparation that are specific to the perception of action-related stimuli. Their apparent lack of response-modality specificity might instead suggest that these effects are due to a more generic type of compatibility that arises whenever there is a mismatch between certain perceptual features of visual stimuli (such as movement direction) and response parameters. To check whether the ERP correlates of action compatibility found in Experiments 1 and 2 were indeed action specific, we conducted a control experiment with 12 new participants (6 male, mean age = 27.5 years, range 24–30 years). Procedures were identical to the manual response condition of Experiment 2, except that male and female hands were replaced by action-unrelated visual stimuli (blue or red squares with an angular size of 0.8° × 0.8°). At the start of each trial, one square appeared at the same elevation as the tip of the index finger in Experiment 2 and in the center of the screen, and movement direction was again indicated by a letter cue (“U” or “D,” indicating an upward or downward index finger movement with respect to the back of the hand). The imperative stimulus was a picture of this square that was displaced upward or downward by the same amount as the index finger in Experiment 2, with blue and red square displacements identical to the movements of male and female index fingers, respectively. AC+ and AC− trials were defined with respect to the direction of the square movement.
Participants were again faster on AC+ relative to AC− trials (419 vs. 451 msec), F(1, 11) = 26.7, p < .001, demonstrating that this behavioral action compatibility effect was not dependent on the presence of a moving hand but remained present for action-unrelated visual stimuli. In contrast, and most importantly, the ERP effects of action compatibility that were consistently observed in Experiments 1 and 2 at posterior electrodes for moving hands were absent when hands were replaced by moving squares. This is illustrated in Figure 5, which directly compares ERPs on AC+ and AC− trials at occipital electrodes (collapsed across PO7, PO8, and Oz) for the manual response condition of Experiment 2 (left) and for the control experiment (right). The onset of visual ERP components was delayed in this control experiment because the visual transient associated with square movements was much smaller than with finger movements. To account for this latency shift, time windows used for the analysis of ERP mean amplitudes at PO7, Oz, and PO8 were adjusted (N1, 190–240 msec; N2, 250–350 msec). There were no effects of action compatibility within either time interval, both F(1, 11) < 1.9, both p > .2.
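The window-based measure used in these analyses can be sketched as follows: the mean amplitude of a component is the average voltage across all samples whose latency falls inside the analysis window, and the compatibility effect is the AC− minus AC+ difference of these means. The N1 and N2 windows are those stated above for the control experiment; the sampling rate, epoch start, and function names are assumptions made for illustration only and are not taken from the study.

```python
# Analysis windows for the control experiment, in msec after stimulus onset.
CONTROL_WINDOWS_MS = {"N1": (190, 240), "N2": (250, 350)}


def mean_amplitude(waveform_uv, window_ms, epoch_start_ms=-100.0, sample_rate_hz=500.0):
    """Average voltage of all samples whose latency lies in window_ms (inclusive).

    waveform_uv    -- averaged ERP samples in microvolts for one condition
    epoch_start_ms -- latency of the first sample (assumed value)
    sample_rate_hz -- sampling rate of the waveform (assumed value)
    """
    step_ms = 1000.0 / sample_rate_hz
    start_ms, end_ms = window_ms
    selected = [
        value
        for index, value in enumerate(waveform_uv)
        if start_ms <= epoch_start_ms + index * step_ms <= end_ms
    ]
    if not selected:
        raise ValueError("analysis window lies outside the epoch")
    return sum(selected) / len(selected)


def compatibility_effect(ac_minus_uv, ac_plus_uv, window_ms, **epoch_kwargs):
    """AC- minus AC+ difference in mean amplitude within one time window."""
    return (
        mean_amplitude(ac_minus_uv, window_ms, **epoch_kwargs)
        - mean_amplitude(ac_plus_uv, window_ms, **epoch_kwargs)
    )
```

One value of `compatibility_effect` per participant and window is what would then enter the statistical test. Shifting the windows, as was done here to account for the delayed components, changes only the `window_ms` argument.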
These observations strongly suggest that the ERP effects of action compatibility observed in Experiments 1 and 2 do not merely reflect a generic conflict between mismatching low-level perceptual and response features but are more specifically linked to the effects of action preparation on the perception of action-related stimuli. The results of the control experiment are therefore consistent with our suggestion that Experiment 2 found compatibility effects with vocal as well as manual responses because preparation of vocal and manual responses activates perceptual representations that are primed by the experimental context rather than a more global action-unspecific concept. For example, if preparing to say “up” or to make an up manual response activated a global “up” concept and thereby all low-level perceptual representations of “up” stimuli, one would have expected the same ERP effects with inanimate stimuli in the control experiment as were observed in Experiments 1 and 2 with hand action stimuli. However, if preparing an “up” response, vocal or manual, activated only action-specific representations that are primed by the context (e.g., by the presence of visually presented finger movements in Experiments 1 and 2), one would not expect similar effects with inanimate stimuli.
Behavioral Consequences of Links between Perception and Action
In both experiments, RTs were faster on trials where visual imperative stimuli were compatible with a prepared response than on incompatible trials, both for spatial and action compatibility. This observation is consistent with many previous experiments investigating the effects of irrelevant response-compatible and incompatible visual stimuli on action (e.g., Brass, Bekkering, & Prinz, 2001; Simon, 1990). Response compatibility effects are thought to reflect automatic activation of compatible or incompatible motor representations upon presentation of visual stimuli. On compatible trials, motor representations activated by visual imperative stimuli are consistent with the cued response, and RTs are fast. On incompatible trials, such motor representations do not match the cued response and interfere with response execution, resulting in slower RTs. The observation in Experiment 2 that action compatibility effects were present regardless of whether manual or vocal responses were required demonstrates that automatic response activation by visual stimuli can transfer across response modalities, suggesting that higher levels of motor representation are involved.
Experiment 1 showed that response compatibility effects on RT arise for both spatial and action compatibility. It should be noted that these behavioral effects were also mirrored by P3 amplitudes in Experiment 1, which were larger for incompatible relative to compatible trials. Enhanced P3 amplitudes have been associated with an updating of representations of the environment (e.g., Donchin, 1981) that is required whenever a new stimulus does not fit into an expected situational context. Thus, the P3 enhancement for incompatible trials is likely to reflect the fact that motor representations associated with the cued response were incongruent with motor representations activated by the visual imperative stimulus. There was, however, also an interaction between spatial and action compatibility for RTs and P3 amplitudes: The RT and P3 effects of action compatibility were larger for spatially compatible relative to incompatible stimuli. This observation that action preparation had stronger effects on performance and P3 amplitudes when there was an exact match between a prepared and a perceived action may be indicative of functional links between spatial and action compatibility. Alternatively, the spatial conflict between perception and action on spatially incompatible trials may have had a stronger impact on response selection than the incompatibility of action types on action incompatible trials. This apparent dominance of spatial over action compatibility in Experiment 1 may be due to the fact that the disparity between the visual cues responsible for spatial compatibility effects (i.e., movement of the left versus right hand) was perceptually more salient than the disparity associated with action compatibility (i.e., finger lift versus tap; see Note 3).
Finally, the question needs to be addressed how behavioral compatibility effects (i.e., faster responses on compatible than on incompatible trials) can be reconciled with the ERP evidence that suggests an attenuation of perceptual processing on compatible trials. This is especially relevant for Experiment 2, where attenuated processing of action compatible stimuli was evident at an early poststimulus latency (N1 component). In fact, there is no contradiction between these behavioral and ERP findings because they reflect different types of perception–action interactions. The visual ERP modulations found here reflect effects of action preparation on perception. They represent the facilitation or the attenuation of visual processing as a function of whether a current action plan is compatible or incompatible with the features of a visual stimulus. In contrast, behavioral compatibility effects are generated when perception affects response selection and execution, that is, when a visual imperative stimulus automatically activates a corresponding motor representation. This distinction is further supported by the results of our control experiment where moving hands were replaced by moving squares (see above), which found reliable behavioral effects of action compatibility (i.e., perception–action effects), but no effects on visual ERPs at posterior electrodes.
Action planning has both facilitatory and attenuating effects on visual perception. When preparing an action, the processing of visual stimuli on the same side of space as prepared actions is initially enhanced, which is likely due to attention being directed to that side of space. This initial facilitation of processing of spatially compatible stimuli is then followed by an attenuation in processing of spatially compatible stimuli as well as of stimuli that represent the same configural action type. These later effects are likely driven by efference copy signals of the predicted consequences of actions, which result in reduced sensory processing of stimuli matching these predictions. The attenuation of action compatible stimuli is not response-modality specific and thus appears to be mediated by higher level representations of action. Our suggestion that spatial attention and forward model mechanisms have opposite but temporally distinct effects on the perception of action stimuli can explain the inconsistency of the findings reported in the recent literature on action–perception links and thereby support the view that the mirror system consists of bidirectional sensorimotor links. Bidirectional links of this kind are likely to be crucial, not only for the control of our own actions but also in sociocultural interaction, allowing us to predict the actions and reactions of others.
This research was supported by a Postdoctoral Fellowship awarded to C. P. by the Economic and Social Research Council (ESRC), award number PTA-026-27-1602. M. E. holds a Royal Society-Wolfson Research Merit award. The authors are grateful to Heijo van de Werf for making the response keys, to Sue Nicholas for help with programming and data collection, and to James Kilner for comments on an earlier version of the manuscript.
Reprint requests should be sent to Clare Press, Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London, WC1N 3BG, UK.
1. ANOVAs conducted on ERP mean amplitudes obtained in Experiment 1 for the P1, N1, and N2 time windows at lateral central electrodes C3/C4 found no evidence of any spatial compatibility effects (all F < 1.5, all p > .2).
2. It should be noted that the “code occupation hypothesis” (e.g., Stoet & Hommel, 1999) predicts that response preparation involves integrating sensory representations of response-compatible features into an action plan. Integration results in these features being less available to perceptual processing, leading to less efficient perceptual processing of action-compatible stimuli. This theory in fact predicts a temporal sequence of facilitation followed by attenuation, predicting facilitatory effects after imperative stimulus presentation, but before inhibitory effects associated with response preparation. However, assuming that the interval between cue onset and target onset in the present study (900–1400 msec) was long enough to allow responses to be prepared by the time the imperative stimulus was presented, the code occupation hypothesis would suggest that no facilitatory compatibility effects should have been observed.
3. Further evidence for the dominance of spatial over action compatibility in Experiment 1 comes from an additional analysis of anterior ERPs at F3, Fz, and F4 during the 190- to 220-msec poststimulus time window (not shown in figures). An enhanced negativity was observed for spatially incompatible relative to compatible trials, F(1, 11) = 7.3, p < .03, but there was no such difference between action compatible and incompatible trials, F(1, 11) < 1. Enhancements of the anterior N2 component are usually interpreted as reflecting top–down cognitive control processes involved in conflict monitoring (for a recent review, see Folstein & van Petten, 2008). The fact that such an effect was present for spatial but not action compatibility further supports the hypothesis that the conflict between perceptual and motor representations was more pronounced on spatially incompatible trials relative to action incompatible trials.