There is evidence that action observation (AO) and the processing of action-related words are associated with increased activity in cortical motor regions. Research has examined the effects of AO and action verb processing on activity in the motor system independently. The aim of this experiment was to investigate, for the first time, the modulation of corticospinal excitability and visual attention during the concurrent processing of action verbs and AO stimuli. Twenty participants took part in an integrated transcranial magnetic stimulation and eye-tracking protocol. Single-pulse transcranial magnetic stimulation was delivered to the hand representation of the left motor cortex during (i) observation of a static hand, (ii) AO of a hand squeezing a sponge, (iii) AO of the same action with an audio recording of the word “squeeze,” and (iv) AO of the same action with an audio recording of the word “green”. Motor evoked potentials were recorded from the abductor pollicis brevis and abductor digiti minimi muscles of the right hand. Eye gaze was recorded throughout the four conditions as a proxy for visual attention. Interviews were conducted to discuss participants' preferences and imagery use for each condition. The AO and action verb condition resulted in significantly increased motor evoked potential amplitudes in the abductor pollicis brevis muscle; participants also made significantly more fixations on the sponge and reported wanting to move their hand more in the action verb condition. The inclusion of auditory action verbs, alongside AO stimuli, in movement simulation interventions could have implications for the delivery of AO interventions for motor (re)learning.
The development of effective interventions is essential to improve the quality of life for individuals affected by movement disorders. One such intervention, action observation (AO), which can include live demonstrations or video models (Holmes, 2007), involves the deliberate and structured observation of human movement (Neuman & Gray, 2013) and is typically employed as an adjunct to physical therapy. The implementation of AO-based interventions has been shown to contribute to improvements in motor function following various motor impairments, including stroke (Chatterton et al., 2008; Ertelt et al., 2007), Parkinson's disease (Pelosin et al., 2010), and cerebral palsy (Buccino et al., 2012; see Buccino, 2014, for a review).
The cognitive neuroscience literature offers a potential mechanism to explain the efficacy of AO interventions for motor (re)learning. According to Simulation Theory (Jeannerod, 2001), AO is associated with activity in some of the brain regions shared with action execution, particularly those involved with motor planning and preparation. The theory has been supported by neuroimaging research, which shows that similar, but not identical, brain regions are active during action execution and observation of the same action (Hardwick, Caspers, Eickhoff, & Swinnen, 2018; Filimon, Rieth, Sereno, & Cottrell, 2014; Caspers, Zilles, Laird, & Eickhoff, 2010; Grèzes & Decety, 2001). For example, in a recent meta-analysis of neuroimaging data, Hardwick et al. (2018) identified a bilateral premotor, parietal, and sensorimotor network that was active during both action execution and AO. This network included regions of the SMA, ventral and dorsal premotor cortex, and inferior parietal lobule. In addition, premotor–parietal and occipital regions were primarily active for AO, and the primary motor cortex was active primarily for action execution.
Transcranial magnetic stimulation (TMS) research provides further support for some shared activity in brain regions associated with action execution and AO. The application of single-pulse TMS to the primary motor cortex elicits a motor evoked potential (MEP) in the corresponding muscle, the amplitude of which provides a marker of corticospinal excitability (Naish, Houston-Price, Bremner, & Holmes, 2014). Using this method, Fadiga, Fogassi, Pavesi, and Rizzolatti (1995) demonstrated that corticospinal excitability was facilitated during the observation of various human movements, compared with non-movement-related control conditions. This effect has since been replicated for a variety of populations and tasks (see Naish et al., 2014; Loporto, McAllister, Williams, Hardwick, & Holmes, 2011, for reviews). In relation to motor (re)learning, it is possible that the increased activity in cortical motor regions following repeated engagement in AO contributes to a Hebbian modulation of intracortical excitatory mechanisms and promotes synaptic plasticity in a similar manner to physical practice (Holmes & Calmels, 2008). Consequently, considerable research attention has been devoted to establishing the effect of manipulating different AO variables on activity in the motor system, with a view to informing the design and delivery of AO interventions for motor (re)learning (Holmes & Wright, 2017).
One variable that may have the potential to further facilitate activity in the motor system during AO is the inclusion of congruent auditory action-related verbs alongside observation of the stimuli, because there is also evidence that the processing of action-related words is associated with activity in the motor regions of the brain. For example, using fMRI, Hauk, Johnsrude, and Pulvermüller (2004) showed that silent reading of action-related words elicited activity in the areas of the motor cortex responsible for muscles associated with execution of that action. Listening to action-related sentences has been shown to have a similar effect. Tettamanti et al. (2005) reported that listening to action-related sentences was associated with increased activity in motor-related brain regions, including premotor cortex and inferior parietal lobule, compared with listening to non-action-related abstract sentences. Raposo, Moss, Stamatakis, and Tyler (2009) extended these findings by demonstrating that the extent to which action verb processing evokes activity in motor regions is modulated by the context in which the verbs are presented. Specifically, activity in motor regions was reported when action verbs were presented in isolation (e.g., kick) or in literal goal-directed sentences (e.g., kick the ball), but not when presented in idiomatic sentences (e.g., kick the bucket).
Similar experiments have been conducted using TMS, although the findings are less consistent. For example, Oliveri et al. (2004) presented hand-action-related and non-action-related verbs and nouns on screen before asking participants to verbalize the words shown. TMS was delivered to the hand representation of the motor cortex 500 msec after presentation of the stimuli while participants were processing the word and retrieving an appropriate response. Corticospinal excitability was facilitated to a greater extent during the processing of the action-related words, compared with non-action-related words. The authors interpreted this finding as an indication that motor programs associated with a specific action may be activated by the processing of words related to that action. In contrast, in a related experiment, Buccino et al. (2005) used TMS to explore how listening to action-related sentences modulated activity in the motor system. Participants listened to action-related sentences and abstract content sentences, with the stimulation delivery time-locked to when the second syllable of the verb in each sentence was audible. They reported a decrease in MEP amplitude when listening to action-related sentences, compared with the abstract content sentences. The authors argued that this conflicting finding might be due to the timing of the TMS delivery, because the stimulation was delivered during the second syllable of the verb, before the relevant noun had been spoken. As the noun may have provided context and meaning to the sentence, participants may have been unable to simulate a motor representation for the action, resulting in lower amplitude MEPs. To resolve this potential confound, Papeo, Vallesi, Isaja, and Rumiati (2009) used a similar paradigm to Oliveri et al. (2004) but manipulated the timing of the TMS delivery. 
Verbs were presented on screen, and participants were asked to decide whether the verbs were action related (semantic task) or how many syllables the word contained (syllabic task). When TMS was delivered at either 170 or 350 msec after the presentation of the verb, when lexical-semantic processing occurs, there was no increase in MEP amplitude. There was an increase in MEP amplitude, however, when the stimulation was delivered 500 msec after the presentation of the verb on screen, but only for the semantic task. The authors concluded that language-related enhancement of motor system activity is dependent on the explicit retrieval of the action content of the word, as opposed to simply the recognition or structure of the word.
Taken together, there is evidence that both AO and the processing of action-related words are associated with increased activity in cortical motor regions of the brain. To date, however, research has examined the effects of both AO and action verb processing on activity in the motor system independently. It is possible that the inclusion of auditory action verbs presented at an appropriate time with congruent AO stimuli will elicit increased activity in the motor system, and this could have implications for the delivery of AO interventions for motor (re)learning.
Visual attentional processes have the potential to modulate activity within the motor system during AO. As such, the inclusion of eye-tracking technology is becoming increasingly common in AO research, particularly alongside TMS techniques (e.g., Riach, Holmes, Franklin, & Wright, 2018; Wright et al., 2018; D'Innocenzo, Gonzalez, Nowicky, Williams, & Bishop, 2017; Donaldson, Gurvich, Fielding, & Enticott, 2015). There is evidence that corticospinal excitability is facilitated to a greater extent when visual attention is directed explicitly toward task-relevant aspects of the display (Wright et al., 2018; D'Innocenzo et al., 2017) and when there are more fixations on hand–object interactions (Wright et al., 2018; Donaldson et al., 2015). When engaging in AO of hand–object interaction stimuli, it is conceivable that the inclusion of auditory action verbs, which refer to the hand–object interaction (e.g., the word “squeeze” when a hand is observed squeezing a sponge), may serve to direct visual attention toward the object. This would be expected to be reflected in an increased number and duration of fixations on the object, which, based on the findings of Donaldson et al. (2015) and Wright et al. (2018), may facilitate corticospinal excitability.
The aim of this experiment was to investigate, for the first time, the modulation of corticospinal excitability and visual attention during the concurrent processing of action verbs and AO stimuli. It was hypothesized that (i) corticospinal excitability would be facilitated during AO compared with observation of a static hand; (ii) this facilitation in corticospinal excitability would be greater when auditory action verbs congruent to the observed action were provided in conjunction with the AO stimuli, but not when non-action-related words were provided; and (iii) auditory action verbs congruent to the observed action would be associated with greater visual attention to objects involved in the observed action.
Twenty right-handed individuals (11 men, 9 women), with a mean age of 22.95 ± 3.1 years, participated in the experiment. The TMS Adult Safety Screen (Keel, Smith, & Wassermann, 2001) was used to ensure that no participants were predisposed to possible adverse effects of the stimulation. No participants were excluded based on these criteria, and none reported discomfort or negative reactions during the experiment. All participants provided full written informed consent prior to participation. The protocol was granted ethical approval by the Manchester Metropolitan University local ethics committee. See Table 1 for participant demographic characteristics and TMS characteristics.
|Demographic Characteristics|||TMS Method|||
|---|---|---|---|---|---|
|Sample Size|Sex|Age, years|OSP|RMT Intensity|Stimulation Intensity|
|20|11 male, 9 female|22.95 (±3.1)|4.0 cm (±0.5) lateral, 1.5 cm (±0.5) anterior|48.9% (±6.5)|55.1% (±6.5)|
OSP = optimal scalp position (calculated as modal distance from Cz); RMT = resting motor threshold.
EMG recordings were collected from the midpoint of the muscle belly of the abductor pollicis brevis (APB) and the abductor digiti minimi (ADM) of the right hand using DE-2.1 bipolar, single differential surface EMG electrodes connected to a Bagnoli-8 EMG system (Delsys, Inc.). A reference electrode was attached over the right ulnar process. Electrode sites were cleaned using alcohol wipes prior to attachment. The EMG signal was recorded using Spike2 Version 6.18 software (Cambridge Electronic Design) via a Micro 1401-3 analogue-to-digital converter (Cambridge Electronic Design), with a sampling rate of 2 kHz, bandwidth of 20 Hz to 450 Hz, 92 dB common mode rejection ratio, and >10¹⁵ Ω input impedance.
A figure-of-eight coil (two 70-mm diameter loops) was used to deliver single-pulse stimulations from a Magstim 200² magnetic stimulator (Magstim Co.), which delivers monophasic pulses with a maximum field strength of 2.2 T. The TMS procedure followed the published guidelines of Loporto et al. (2011) for AO research. The coil was held accurately over the hand representation of the left motor cortex with a mechanical arm (Manfrotto UK Limited) and was orientated for the induced current to flow in a posterior–anterior direction by positioning the coil at a 45° angle to the midline between nasion and inion landmarks of the skull (Roth & Hallett, 1992). This coil orientation was used to achieve indirect transsynaptic activation and optimal MEP amplitudes (Opitz et al., 2013; Sakai et al., 1997). The optimal scalp position was found by stimulating the approximate area of the motor cortex for the APB and ADM muscles of the right hand at 60% stimulation intensity (Wright, Williams, & Holmes, 2014). The coil was then moved in 1-cm steps around this area until the site that produced MEPs of largest amplitude in both muscles was found. This area was then marked on a tightly fitting cap worn by the participants to ensure consistent coil placement throughout the experiment. After identifying the optimal scalp position, the resting motor threshold was determined by gradually adjusting the stimulation intensity until peak-to-peak MEP amplitudes of 50 μV or less were found in 5 of 10 trials (Rossini et al., 2015). Following the protocol defined by Loporto, Holmes, Wright, and McAllister (2013), the experimental stimulation intensity was then set at 110% of resting motor threshold to reduce direct wave stimulation.
Participants' eye movements were recorded throughout the experiment using iView ETG 2.7 software (SensoMotoric Instruments) at a sampling rate of 60 Hz. The eye-tracking glasses (ETG 2w, SensoMotoric Instruments) contained two cameras and projected six infrared lights onto both of the participants' eyes to record eye movements. A circular cursor indicated the location of gaze in the visual scene recorded from a forward-facing camera to an accuracy of 0.5°.
Participants were asked to complete a short interview with four questions: (1) Which condition did you prefer? (2) Why did you prefer that condition? (3) Did you use imagery during any of the conditions? (4) If you answered yes to Question 3, when and how did you use the imagery? An explanation of what imagery involves was provided to ensure that there were no differences in how participants interpreted the term. Participants were told that, in this context, imagery referred to “either deliberately or unintentionally imagining the feelings and sensations associated with performing the movement whilst they observed it.”
Participants were seated in a dimly lit room with their elbows flexed at 90° and both hands resting on a table directly in front of them and positioned under a black-painted wooden box. The left hand was in a relaxed, pronated position, and the right hand was supinated in the same position as the hand on the screen (see Figure 1). A chinrest and a headrest were used to limit head movements. Participants were asked to refrain from any voluntary movement during each condition and to attend fully to the actions presented on the screen. Blackout curtains were drawn alongside the screen and table setup to reduce any extraneous distracting visual stimuli. Participants were asked to wear the eye-tracking glasses for the duration of the experiment. To calibrate the glasses, a 3-point calibration system was used on a 6-point grid prior to testing. Calibration accuracy was checked prior to commencing each experimental condition, and a recalibration was conducted if necessary.
Each participant took part in four conditions, each of which consisted of observing 25 video trials on a horizontally orientated 32-in. LCD screen (DGM Model LTV-3203H). Condition 1 (static) involved the observation of a supinated static right hand holding a sponge between the thumb and four fingers in front of a plain background. This stimulus was chosen as Loporto et al. (2011) argued that the use of a static image of a body or body part as the control condition is the most appropriate as it ensures that any facilitation of corticospinal excitability during AO conditions is related to the observation of biological movement. In contrast, when using rest, a fixation cross, or blank screen as a control, it is not possible to determine whether a facilitation effect is due to the observation of biological movement or rather just the presence of some form of visual stimuli on screen (Loporto et al., 2011). The static image of the hand and arm therefore provides a more stringent control condition against which the other conditions can be compared. Condition 2 (AO) involved the observation of the same hand in the same position squeezing a sponge three times. Condition 3 (AO and action verb [AOAV]) involved the observation of the same video as the AO condition, with the addition of an audio recording of the word “squeeze” each time the hand squeezed the sponge. The word “squeeze” was chosen as the action verb for use in this condition, as it accurately described the action presented on screen. Condition 4 (AO and non-verb [AONV]) involved the observation of the same video as the AO condition, with the addition of an audio recording of the word “green” each time the hand squeezed the sponge. The word “green” was chosen as the non-action-related word for use in this condition, as it was of similar length and sound to the word “squeeze” (e.g., both contain the long “e” sound) and was not movement-related so would not imply any form of action. 
All videos showed a White athletic female's right hand and forearm, filmed from a first-person visual perspective, and positioned to the right of the screen to give the visual appearance of the observed arm and hand in a similar position to the participant's own limb. The hand and arm were free from any distinguishing features, the skin tone of the model was similar to that of all participants, and no jewelry was worn by the model. A horizontal screen position and a hand position to the right of the screen were chosen to increase the participant's perception of the hand being their own, in line with the protocol used by Wright, McCormick, Williams, and Holmes (2016) and Riach, Wright, Franklin, and Holmes (2018). All videos were 12 sec in duration, with the AO videos containing three squeezes of the sponge. One stimulation was delivered per trial at the point of maximal flexion of the hand during the second sponge squeeze (6252 msec after video onset). Participants were given a break of approximately 2 min between each block. At the end of the TMS procedure (see Figure 2), participants were asked to complete a short interview about their experiences of each condition. The testing session lasted for approximately 90 min per person.
Increased EMG activity prior to stimulation can modulate the amplitude of the subsequent MEP (Devanne, Lavoie, & Capaday, 1997; Hess, Mills, & Murray, 1987). Consequently, EMG activity 200 msec prior to delivery of the TMS pulse was checked to identify trials with increased muscle activity immediately prior to the stimulation. Trials in which the baseline peak-to-peak amplitude was 2.5 SDs greater than the mean baseline were discarded from further analysis (Loporto, McAllister, Edwards, Wright, & Holmes, 2012). To demonstrate that there were no differences in the muscle activity prior to the delivery of TMS in the remaining trials, a 2 (Muscle: APB, ADM) × 4 (Condition: static, AO, AOAV, AONV) repeated-measures ANOVA was run on the EMG amplitude data 200 msec prestimulation. Results demonstrated that there was no significant Muscle × Condition interaction, F(3, 57) = 2.72, p > .05, and no significant main effect for Condition, F(3, 57) = 2.76, p > .05, or Muscle, F(1, 19) = 1.60, p > .05. To account for interindividual variability in TMS-induced activity, raw MEP data were transformed into z scores prior to analysis with a 2 (Muscle: APB, ADM) × 4 (Condition: static, AO, AOAV, AONV) repeated-measures ANOVA.
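The trial-exclusion and normalization steps described above can be illustrated with a minimal Python sketch. It assumes that both the 2.5 SD cutoff and the z transformation are computed within a single participant and muscle, across that participant's trials; the text does not specify this. The function names and amplitude values are hypothetical and are not data from the experiment.

```python
from statistics import mean, stdev

def exclude_noisy_trials(baseline_amps, mep_amps, sd_limit=2.5):
    """Keep MEPs from trials whose pre-stimulus EMG amplitude does not
    exceed the mean baseline by more than sd_limit standard deviations."""
    cutoff = mean(baseline_amps) + sd_limit * stdev(baseline_amps)
    return [mep for base, mep in zip(baseline_amps, mep_amps) if base <= cutoff]

def z_transform(values):
    """Express each raw MEP amplitude as a z score relative to the
    mean and sample SD of the supplied values."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Illustrative values only (microvolts): peak-to-peak EMG in the 200 msec
# before the pulse, and the MEP amplitude on the same trial.
baseline_uv = [10, 12, 11, 9, 10, 11, 10, 12, 9, 60]
mep_uv = [800, 950, 870, 760, 900, 820, 880, 910, 790, 1500]

kept = exclude_noisy_trials(baseline_uv, mep_uv)  # assumed: cutoff per participant and muscle
z_scores = z_transform(kept)                      # assumed: z scores over retained trials
print(len(kept), [round(z, 2) for z in z_scores])
```

In this sketch the final trial is discarded because its baseline amplitude lies above the cutoff, and the remaining amplitudes are rescaled to a mean of 0 and an SD of 1 before entering the ANOVA.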
Individual trials were analyzed by drawing four separate areas of interest (AOIs) around the sponge, thumb, other hand areas, and background (see Figure 3). The number of fixations and duration of fixations within each AOI was calculated for each condition using BeGaze 3.7 software (SensoMotoric Instruments). A fixation was defined as any gaze that remained stable within 1° of visual angle for a minimum duration of 100 msec (Salvucci & Goldberg, 2000). The fixation count and fixation duration data were analyzed using separate 4 (AOI: sponge, thumb, other hand areas, background) × 4 (Condition: static, AO, AOAV, AONV) repeated-measures ANOVAs.
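The fixation criterion above (gaze stable within 1° for at least 100 msec) can be sketched with a dispersion-threshold procedure of the kind described by Salvucci and Goldberg (2000). This is an illustrative sketch only: the event detection implemented in BeGaze may differ, the rectangular AOI coordinates are hypothetical, and gaze samples are assumed to be already expressed in degrees of visual angle at 60 Hz.

```python
import math
from collections import Counter

def dispersion(window):
    """Dispersion of a gaze window: horizontal range plus vertical range."""
    xs = [p[0] for p in window]
    ys = [p[1] for p in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def detect_fixations(samples, rate_hz=60, max_dispersion_deg=1.0, min_duration_ms=100):
    """Return fixations as (start_index, end_index_exclusive, centroid_x, centroid_y)."""
    min_len = math.ceil(min_duration_ms * rate_hz / 1000)  # 6 samples at 60 Hz
    fixations, start = [], 0
    while start + min_len <= len(samples):
        end = start + min_len
        if dispersion(samples[start:end]) <= max_dispersion_deg:
            # Grow the window until adding the next sample breaks the threshold.
            while end < len(samples) and dispersion(samples[start:end + 1]) <= max_dispersion_deg:
                end += 1
            window = samples[start:end]
            cx = sum(p[0] for p in window) / len(window)
            cy = sum(p[1] for p in window) / len(window)
            fixations.append((start, end, cx, cy))
            start = end
        else:
            start += 1  # slide the window forward by one sample
    return fixations

def aoi_for(x, y, aois):
    """Name of the first rectangular AOI containing the point, else 'background'."""
    for name, (x0, y0, x1, y1) in aois.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return "background"

# Hypothetical AOI rectangles (degrees) and a synthetic gaze trace.
aois = {"sponge": (-1.0, -1.0, 1.0, 1.0), "thumb": (4.0, 4.0, 6.0, 6.0)}
cluster = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)] * 2
gaze = cluster + [(x + 5.0, y + 5.0) for x, y in cluster]

fixations = detect_fixations(gaze)
counts = Counter(aoi_for(cx, cy, aois) for _, _, cx, cy in fixations)
durations_ms = [(end - start) * 1000 / 60 for start, end, _, _ in fixations]
print(dict(counts), [round(d, 1) for d in durations_ms])
```

Fixation count and fixation duration per AOI, as entered into the ANOVAs, correspond to `counts` and to the summed `durations_ms` per AOI in this sketch.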
For both the TMS and eye-tracking analysis, where Mauchly's test indicated that the assumption of sphericity was violated, the Greenhouse–Geisser method was used to correct the degrees of freedom. The alpha level for statistical significance was set at p < .05, and post hoc pairwise comparisons with Bonferroni corrections were performed on significant results. Effect sizes were calculated using Cohen's d.
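As a rough illustration of the post hoc procedure, the Python sketch below applies a Bonferroni correction to a set of pairwise p values and computes Cohen's d for one comparison. The text does not state which variant of Cohen's d was used for the repeated-measures data; the sketch assumes the mean difference divided by the root mean of the two condition variances. All values are hypothetical and are not the reported results.

```python
from math import sqrt
from statistics import mean, stdev

def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def cohens_d(a, b):
    """Cohen's d using the root mean of the two sample variances
    (an assumed variant; others exist for repeated measures)."""
    pooled_sd = sqrt((stdev(a) ** 2 + stdev(b) ** 2) / 2)
    return (mean(a) - mean(b)) / pooled_sd

# Hypothetical uncorrected p values for three pairwise comparisons.
raw_p = [0.004, 0.02, 0.03]
adjusted_p = bonferroni(raw_p)
significant = [p < 0.05 for p in adjusted_p]

# Hypothetical z-score MEP amplitudes for two conditions.
cond_a = [1.0, 2.0, 3.0, 4.0, 5.0]
cond_b = [2.0, 3.0, 4.0, 5.0, 6.0]
d = cohens_d(cond_a, cond_b)
print([round(p, 3) for p in adjusted_p], significant, round(d, 2))
```

With three comparisons, only the first hypothetical p value remains below the .05 alpha level after correction.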
Descriptive statistics were analyzed for the preference (Question 1) and use of imagery (Question 3) questions. Questions 2 and 4 were analyzed thematically to identify themes associated with preference and imagery (Braun & Clarke, 2006). Coding and data management were facilitated using NVivo qualitative data analysis software (Version 11). Strategies to enhance analytic rigor included comparison of categories and themes between the interview responses. The themes were verified further following discussion with the wider research team to ensure they were comprehensive.
A 2 (Muscle) × 4 (Condition) repeated-measures ANOVA on the z score MEP amplitude data showed a significant Muscle × Condition interaction, F(5, 95) = 4.97, p = .01, and a significant main effect for Condition, F(5, 95) = 11.72, p < .001. There were no significant effects of Muscle, F(1, 19) = 1.05, p = .32 (see Figure 4). Bonferroni pairwise comparisons from the interaction effect indicated that, in the APB muscle, MEP amplitudes were significantly higher in the AOAV condition compared with the static (p < .001, d = 0.6), AO (p < .001, d = 1.7), and AONV (p < .001, d = 1.03) conditions. MEP amplitudes were also larger in the AO condition compared with static (p = .01, d = 0.9). In the ADM muscle, pairwise comparisons indicated that MEP amplitudes were significantly lower in the static condition compared with the AO (p = .007, d = 0.5) and AOAV (p = .048, d = 0.2) conditions.
The 4 (AOI) × 4 (Condition) repeated-measures ANOVA on the fixation count data showed a significant AOI × Condition interaction, F(9, 171) = 7.938, p < .001, and a significant main effect for AOI, F(3, 57) = 9.981, p < .001. There were no significant effects of Condition, F(3, 57) = 0.688, p = .56 (Figure 5). Bonferroni pairwise comparisons indicated that there were significantly more fixations on the sponge compared with the thumb in the static (p = .01, d = 0.9), AO (p = .02, d = 1.10) and AONV (p = .02, d = 0.8) conditions. In the AOAV condition, there were significantly more fixations on the sponge compared with all other AOI (thumb, p = .001, d = 1.8; other hand, p < .001, d = 3.10; and background p < .001, d = 3.01), and participants fixated on the thumb significantly more than the other hand (p = .04, d = 1.01). Participants made significantly fewer fixations on the other hand in the AOAV condition compared with AO (p = .005, d = 1.26) and AONV (p = .001, d = 1.27) conditions. There were significantly more fixations on the sponge in the AOAV condition compared with all other conditions (static, p = .009, d = 1.08; AO, p = .001, d = 1.16; AONV, p < .001, d = 1.21). There were no other significant differences between any other conditions or AOI.
The 4 (AOI) × 4 (Condition) repeated-measures ANOVA on the fixation duration data showed a significant AOI × Condition interaction, F(9, 171) = 6.153, p < .001, and a significant main effect for AOI, F(3, 57) = 8.790, p < .001. There were no significant effects of Condition, F(3, 57) = 0.195, p = .90 (Figure 6). Bonferroni pairwise comparisons indicated that participants fixated for significantly longer on the sponge compared with the thumb in the AO (p = .03, d = 0.9) and AONV (p = .04, d = 0.9) conditions. In the AOAV condition, fixation duration was significantly longer on the sponge compared with all other AOI (thumb, p = .001, d = 1.68; other hand, p < .001, d = 2.84; and background p < .001, d = 2.71). Participants fixated for a significantly shorter duration on the other hand in the AOAV condition compared with the static (p = .01, d = 1.08), AO (p = .006, d = 1.22), and AONV (p = .001, d = 0.9) conditions. There was a significantly longer fixation duration on the sponge in the AOAV condition compared with static (p = .005, d = 0.9) and AONV conditions (p = .008, d = 0.7). There were no other significant differences between any other conditions or AOI.
The four conditions and participants' viewing experiences of these conditions (e.g., movement agency, movement kinaesthesis) provided the structure for the thematic analysis. Analysis of the interviews suggested primary themes of preference and imagery were associated with the four conditions. Data from the interview are presented under the deductive themes of “preference” and “imagery.”
All participants reported that they preferred conditions involving movement, rather than the static condition. Most participants (85%) reported that they preferred the AOAV condition, while the remaining 15% reported that they preferred the AO condition. Three main themes contributed to participants' preference for a particular condition: (1) meaning, (2) realism, and (3) increased need to move. Participants reported that the word squeeze in the AOAV condition made the hand feel more like their own and made them want to move their hand more (“it [AOAV condition] was more realistic and made me think I needed to squeeze”—P5). Participants who preferred the AOAV condition reported that the addition of task-relevant sound made them focus more on the action (“I wasn't trying to, but when you hear the word (squeeze), it made me focus on the hand”—P7; “Makes it feel as though the hand is yours, particularly when you look at the thumb moving, it was more realistic”—P11). In contrast, participants who preferred the AO condition reported that they found it harder to concentrate when the word was spoken because of their need to move (“it was harder to keep my hand relaxed with the word, when the sound was there I wanted to move it more”—P20). Some participants said that during the static condition they were searching for movement to occur (“Even though after a while I knew the hand wasn't going to move, I kept looking at it expecting something to happen”—P1).
Despite no imagery instructions being provided in this experiment, 75% of participants reported using some form of imagery during one or more of the conditions. Participants reported that during the AO and AOAV conditions they found it easier to imagine that they were doing the action (kinesthetic imagery). Participants who preferred the AOAV condition reported that when the word squeeze was spoken over the video they imagined that their hand was moving, thereby increasing the kinesthetic imagery (“Each time I heard the word squeeze I imagined my hand moving”—P8). Furthermore, kinesthetic imagery was increased as they imagined how they would squeeze the sponge and the feelings they would experience (“I wasn't trying to but when the hand was there I was thinking of the feeling”—P5).
The aim of this experiment was to investigate whether the inclusion of congruent auditory action verbs alongside AO stimuli would modulate corticospinal excitability and visual attention. In support of the first hypothesis, corticospinal excitability was facilitated in both the APB and ADM muscles during AO of a hand squeezing a sponge compared with the observation of a static hand holding a sponge. This finding is consistent with previous TMS research indicating that AO is associated with increased activity in the extended motor system (e.g., Fadiga et al., 1995; see Naish et al., 2014; Loporto et al., 2011, for reviews). Recent meta-analyses of neuroimaging data have shown that AO does not consistently elicit activity in the motor cortex (e.g., Hardwick et al., 2018; Caspers et al., 2010), the region that was stimulated with TMS in the current experiment. This research does, however, indicate that the premotor cortex is activated reliably by AO. Consequently, the facilitation in corticospinal excitability during AO is generally accepted to reflect increased activity in premotor regions that link to the motor cortex via strong cortico-cortical connections (Fadiga, Craighero, & Olivier, 2005).
The data provide support for the second hypothesis in that the inclusion of auditory action verbs alongside the AO stimuli (AOAV condition) facilitated corticospinal excitability in the APB muscle to a greater extent than AO alone. The inclusion of non-action-related words alongside the AO stimuli (AONV condition) had no significant effect. Previous research has shown that the processing of action-related words is associated with activity in premotor regions of the brain (Raposo et al., 2009; Tettamanti et al., 2005; Hauk et al., 2004) and increased corticospinal excitability (Papeo et al., 2009; Oliveri et al., 2004). In the AOAV condition, the processing of the action verb “squeeze” may therefore have elicited increased activity in the premotor cortex, in addition to the activity that would be induced by the AO stimuli (Hardwick et al., 2018). Consequently, it is plausible that corticospinal excitability was facilitated to a greater extent in the AOAV condition, compared with the AO condition, as a result of increased activity in the premotor cortex due to the simultaneous processing of the AO stimuli and the action verb, albeit with some associated kinesthetic imagery. In contrast, the processing of a non-action-related word in the AONV condition did not result in increased activity in the motor system because the word “green” lacks action content. This could explain why MEP amplitude in the AONV condition was not facilitated in comparison to the AO condition. Indeed, the incongruence between the AO and auditory stimuli may have also suppressed the MEP amplitude to some extent in this condition, and this requires further consideration.
The eye-tracking data recorded in this experiment provide some insight into the mechanisms by which corticospinal excitability was facilitated in the AOAV condition, compared with the AO condition. The inclusion of an auditory action verb alongside the AO stimuli appears to have modulated participants' visual attention. Traditionally, experiments and interventions using AO are auditory free. In this study, the AO video was manipulated to include sound, which was either action related (squeeze) or task irrelevant (green). The significant difference in the MEP amplitude between the AO and AOAV conditions can be explained through different attentional mechanisms. In support of the final hypotheses, the inclusion of action-related words (AOAV) resulted in significantly more fixations on the sponge compared with other AOIs and in comparison to all other conditions. Furthermore, there were significantly fewer fixations on the other hand areas in the AOAV condition compared with the other two conditions containing movement (AO and AONV). In all four conditions, participants fixated on the sponge significantly more than the thumb. Visual attention is controlled by both cognitive factors (top–down control, related to current goals) and sensory stimulation (bottom–up factors). The dynamic interaction between these two factors controls the cues that we attend to in the environment. The addition of an action-related verb facilitated a goal-directed attentional system and suppressed the stimulus-driven attentional system (Corbetta & Shulman, 2002). In contrast, in the AO or static condition, participants are responding to sensory information in the environment and attempting to attribute the meaning, the goal, and the intention of the action to the image or video. 
The qualitative data highlighted that in the static condition participants reported that they were “searching for movement.” This comment is supported by the absence of significant differences in fixation duration between the four AOIs in the static condition. Visual processing is commonly thought to proceed along two distinct pathways, a dorsal pathway and a ventral pathway (Milner & Goodale, 2008; Goodale & Milner, 1992). The pathways have been considered responsible for processing spatial vision (dorsal “where” pathway) and object vision (ventral “what” pathway) within the observed scene. The dorsal visual pathway forms part of the dorsal attentional network, is concerned with the automatic guidance of action (e.g., reaching or squeezing; Kravitz, Saleem, Baker, & Mishkin, 2011), and is activated during top–down attentional control (Corbetta, Patel, & Shulman, 2008; Kincade, Abrams, Astafiev, Shulman, & Corbetta, 2005). The dorsal attentional network has been shown to be active during both visual and auditory task conditions (Braga, Fu, Seemungal, Wise, & Leech, 2016; Corbetta et al., 2008). It is possible, therefore, that there is greater activation of the dorsal attentional network during the AOAV condition due to the enhanced goal-relevant cues and the congruency of the visual action with the auditory cue. The increased number of fixations on the sponge during the AOAV condition may have resulted from an inhibition of eye movements when attentional resources are directed to the auditory modality (squeeze). Such inhibition can reduce the amount of novel incoming visual information through an enhanced goal-directed attentional mechanism and may, therefore, have contributed to the increased MEP amplitude. In contrast to the dorsal visual pathway, the ventral visual pathway is concerned with object recognition and is influenced by a variety of factors in the environment (Kravitz, Saleem, Baker, Ungerleider, & Mishkin, 2013).
The lack of congruency between the word “green” and the visual action during the AONV condition may have led to greater dominance of the ventral visual pathway (van Polanen & Davare, 2015; Corbetta & Shulman, 2002). Participants may have been searching for the meaning and goal of the action and attending to several cues in the environment, which would be reflected in slower processing of information. This proposal is supported by previous research suggesting that the timing of stimulation is essential (Papeo et al., 2009). Language-related enhancement of motor system activity depends on the explicit retrieval of the action content of the word in order to retrieve the appropriate response, rather than simply on recognition of the word or its structure (Papeo et al., 2009; Oliveri et al., 2004). Within this study, participants were stimulated during the second squeeze of the sponge; therefore, they had already seen and heard the relevant stimuli once. The delayed stimulation time allowed for the retrieval of motor representations associated with the task. This suggests that the inclusion of an action-congruent word during AO allows for more meaning to be inferred, resulting in facilitation of the motor system. Within this study, the verb “squeeze” was chosen for its congruency with the action, and the adjective “green” was chosen as the non-action-related word because it was of similar length and sound to the word “squeeze” (i.e., both contain the long “e” sound) and was not movement related, so it would not imply any form of action. A word that was not movement related was included to ensure that any MEP differences were due to the word being action related. One potential area for future research is to use action-related verbs that differ in their congruency with the depicted action (e.g., “squeeze” vs. “kick” or “squeeze” vs. “shake”).
Investigating whether different verbs lead to an altered response may allow for additional auditory content in interventions to be more closely matched to the action on the screen.
More recently, researchers have highlighted the importance of directing participants' attention to the goal of the task rather than allowing them to view the visual scene passively (Wright et al., 2018; D'Innocenzo et al., 2017). Wright et al. (2018) identified that directing participants' attention toward the goal of the task (e.g., squeezing a ball), rather than the index finger, led to greater MEPs in the muscle involved with the action. In the current study, there was a significantly greater MEP amplitude in the APB muscle for the AOAV condition compared with the AONV condition. This finding was not replicated in the ADM muscle. This difference could be explained through two mechanisms: the allocation of attention and the primary muscle involvement. In all conditions involving the hand squeezing the sponge actively, the APB is the primary muscle that can be seen in the visual scene (see Figures 2 and 3). In contrast, the ADM muscle is smaller within the visual scene. To assess where participants' attention was allocated, we used eye tracking. Because of the proximity of the sponge to the thumb, the differences between the APB and ADM muscles may be due to the participants making microsaccades to the primary (APB) muscle involved in the action or seeing it in their peripheral vision, despite fixations being located primarily on the sponge. Although eye tracking is regarded as a more direct assessment of visual attention, it does not reflect covert attentional engagement. Attention can be allocated in the absence of eye movements (Zhao, Gersch, Schnitzer, Dosher, & Kowler, 2012), and the eye-tracking technology used here does not measure peripheral vision. The qualitative data highlighted that the word “squeeze” gave more meaning to the video in comparison to the AO condition. Raposo et al. (2009) demonstrated that the extent to which action verb processing evokes activity in motor regions is modulated by the context in which the verbs are presented.
Specifically, activity in motor regions was reported when action verbs were presented in isolation (e.g., kick) or in literal sentences (e.g., kick the ball). Within this study, the AO element of the video provided the context, and the word “squeeze” provided the meaning and the goal of the task, which produced a greater MEP response compared with the AO condition. There has been considerable discussion within the literature relating to the influence of AO providing individuals with the goal or intention of the task (e.g., Riach, Holmes, et al., 2018). The addition of the word “squeeze” gave the participants the goal of the task (i.e., to squeeze the sponge), which contributed to greater top–down control. In contrast, the intention of the task was not specifically shown (e.g., getting water out of the sponge). Some participants may have perceived the intention behind the task to be something specific to them. The qualitative findings, however, did not highlight this. Future research should consider incorporating all three elements (context, intention, and goal) through a more multimodal experiment to identify whether corticospinal excitability is modulated further when these three elements are more explicit in the AO/auditory information.
The facilitation of corticospinal excitability supports Jeannerod's simulation theory (Jeannerod, 2001). Jeannerod proposed that activation of the motor cortex and motor pathways during AO generates signals that allow the participant to perceive that they are the agent of the covert activity without any physical behavior. One of the central tenets of this activation is the importance of the task being goal directed. The inclusion of a word that is task relevant gives the participant meaning for the activity and, therefore, congruency, agency, and embodiment of the observed action. The qualitative data highlighted that participants not only preferred the AOAV condition but also reported that it made the action feel more real, made them want to move their arm, and enhanced their use of kinesthetic imagery. There is a significant body of research within the field of TMS investigating the modulation of corticospinal excitability during concurrent AO and motor imagery (see Eaves, Riach, Holmes, & Wright, 2016; Vogt, Di Rienzo, Collet, Collins, & Guillot, 2013, for reviews). Motor imagery is defined as the combination of both visual and kinesthetic modalities to generate a mental image of a movement (Jeannerod, 1995). Despite no specific imagery instructions being provided, participants in this study reported that the word “squeeze” led to spontaneous kinesthetic imagery of squeezing the sponge and imagining the feelings they would experience when performing this action. The increase in realism, embodiment, and kinesthetic imagery may have also been facilitated by the horizontal position of the screen. Riach, Wright, et al. (2018) identified that when stimuli were presented in a first-person visual perspective on a horizontal screen, there was a greater sense of ownership associated with the arm on the screen.
Jeannerod (1995) proposed that an imagined first-person visual perspective resulted in a greater kinesthetic experience of the action, compared with a third-person perspective in which the self and the imagined experience are separate. When an action is observed on a horizontal screen, which facilitates the first-person perspective more accurately, there is presumably no need for participants to rotate the AO perception to their perspective (Riach, Wright, et al., 2018). The requirement to generate a visual image of the action is also removed, which may enable participants to experience enhanced kinesthetic imagery and may be more beneficial for motor (re)learning (Eaves et al., 2016). Taken together, the combination of the horizontal screen position and the word “squeeze” may have helped participants to perceive that they were the agent of the activity, thereby creating a more goal-directed state in which to view the action and facilitating kinesthetic imagery. An important area of future research is to consider participants' imagery perspective, modality, and agency during AO in a similar way to imagery studies (Holmes & Calmels, 2008; Holmes & Collins, 2001), as this may have differed between the four conditions. It may also be worthwhile to investigate the effects of manipulating different AO and motor imagery perspective combinations in interventions that include action verb processing on various neurophysiological and behavioral measures.
This study contributes to the body of work that has focused on manipulating different AO variables to inform the design and delivery of AO interventions for motor (re)learning. The addition of an action-related verb not only facilitated corticospinal activity but also enhanced the goal-directed attentional system, which led to spontaneous kinesthetic imagery. Future studies should aim to investigate whether the facilitated MEP is due primarily to action verb processing, imagery, or attentional factors individually, or to a combination of the three. Future studies should also consider including task-relevant sounds in AO interventions within a clinical population and testing whether this leads to clinically meaningful improvements in conditions involving motor dysfunction.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Reprint requests should be sent to Zoë C. Franklin, Department of Sport and Exercise Sciences, Musculoskeletal Science and Sports Medicine Research Centre, Manchester Metropolitan University, Manchester, M15 6BH, United Kingdom, or via e-mail: email@example.com.