Abstract

Comprehending action words often engages similar brain regions to those involved in perceiving and executing actions. This finding has been interpreted as evidence that conceptual processing is grounded in motor representations or involves motor simulation. However, such demonstrations cannot confirm the nature of the mechanism(s) responsible, as word comprehension involves multiple processes (e.g., lexical, semantic, morphological, phonological). In this study, we tested whether this motor cortex engagement instead reflects processing of statistical regularities in sublexical phonological features. Specifically, we measured brain activity in healthy participants using functional magnetic resonance imaging while they performed an auditory lexical decision paradigm involving monosyllabic action words associated with specific effectors (face, arm, and leg). We show that nonwords matched to the action words in terms of their phonotactic probability elicit common patterns of activation. In addition, we show that a measure of the action words' phonological typicality, the extent to which a word's phonology is typical of other words in the grammatical category to which it belongs (i.e., more or less verb-like), is responsible for their activating a significant portion of primary and premotor cortices. These results indicate that motor cortex engagement during action word comprehension is more likely to reflect processing of statistical regularities in sublexical phonological features than conceptual processing. We discuss the implications for current neurobiological models of language, all of which implicitly or explicitly assume that the relationship between the sound of a word and its meaning is arbitrary.

INTRODUCTION

A fundamental question for the neurobiology of language is how the brain attains word meaning. An enduring perspective beginning with aphasiology in the nineteenth century is that conceptual representations are distributed amodally throughout the cortex, abstracted away from the modality-specific representations underlying perception and action (e.g., Lichtheim, 1885). An alternative perspective with a shorter history is that word meaning is grounded in the modality-specific brain regions responsible for accomplishing perception and action, or that access to word meaning involves simulation or reconstruction of sensorimotor experiences via those brain regions (also called “neural re-use”; e.g., Pulvermüller, 2005, 2018; Barsalou, 2008, 2016; Glenberg & Gallese, 2012; Gallese & Lakoff, 2005; Zwaan, 2004). This latter perspective is often referred to as embodied cognition (e.g., Glenberg, 2015). Hub-and-spoke or convergence zone models have sought to reconcile the older and newer perspectives by proposing conceptual representation is largely accomplished by one brain region, the anterior temporal lobes, in which cross-modal processing arises from connections with modality-specific regions (e.g., Lambon Ralph, Jefferies, Patterson, & Rogers, 2017; Meteyard, Cuadrado, Bahrami, & Vigliocco, 2012).

All of the above neurobiological perspectives assume, either implicitly or explicitly (e.g., Glenberg & Kaschak, 2002), that there is no relationship between the sound of a word and its meaning. Yet, considerable evidence already indicates this strong assumption about the arbitrariness of language is not valid. For example, an early finding by Köhler (1947), replicated many times since, is that there is a statistical relationship between a word's phonology and the shape of the object to which it refers, with rounded versus unrounded vowels associated with round versus angular objects, respectively. This relationship is recognized and used by children as young as 2.5 years (e.g., Maurer, Pathman, & Mondloch, 2006). Similar evidence exists for sound–object size correspondences where high and low vowels tend to denote small and large, respectively (Nuckolls, 1999; Thompson & Estes, 2011).

The evidence for nonarbitrary sound-to-meaning correspondences also extends to statistical regularities for abstract versus concrete words (Reilly, Westbury, Kean, & Peele, 2012) and words with positive or negative valence across languages (Adelman, Estes, & Cossu, 2018; Tillman & Louwerse, 2018; Louwerse & Qu, 2017). Although researchers have often described action words as being grammatically ambiguous when presented in isolation, that is, they may be understood as either verbs or nouns (e.g., Pulvermüller, Hauk, Nikulin, & Ilmoniemi, 2005), there is a substantial body of psycholinguistic research demonstrating the degree to which a word's orthography and/or phonology is typical of other words in its grammatical category influences on-line processing, particularly for verbs and nouns. In English, verbs and nouns differ in lexical stress (Kelly, 1992), length (Cassidy & Kelly, 1991), and vowel type (Sereno & Jongman, 1990). Large-scale corpus analyses show that sublexical phonological cues within monosyllabic words can be used to distinguish the lexical categories of nouns and verbs, both within and across languages, and measures of phonological typicality (PT) for noun or verb status predict lexical decision performance and naming latencies (Monaghan, Christiansen, & Chater, 2007; Monaghan, Chater, & Christiansen, 2005; Kelly, 1992; Cassidy & Kelly, 1991). These systematic correspondences occur more frequently among words that are acquired early during language development, indicating they are likely essential for bootstrapping word learning (e.g., Monaghan, Christiansen, Farmer, & Fitneva, 2010). Thus, people use statistical regularities in sound-to-meaning correspondences as probabilistic cues to constrain access to word meaning.

Elsewhere, we have argued the reason why statistical regularities in language are an important factor to include in neurobiological models is that they can provide an alternate explanation for sensorimotor activity observed during language comprehension that might otherwise be inferred as reflecting grounded conceptual processing or simulation (de Zubicaray, Arciuli, & McMahon, 2013). A recent meta-analysis of 126 experiments in 51 embodied cognition studies supported this perspective by showing that language statistics elicited comparable or larger effect sizes than those of simulation (Hutchinson, Tillman, & Recchia, 2015; see also Tillman & Louwerse, 2018). Here, we focus on research attempting to demonstrate selective motor cortex engagement during action word (verb) processing, or semantic somatotopy, driven largely by theories of embodied cognition (e.g., Hauk, Johnsrude, & Pulvermüller, 2004; see reviews by Kemmerer, 2015; Carota, Moseley, & Pulvermüller, 2012). Semantic somatotopy assumes a direct coupling between action word meaning representations and cortical motor areas such that comprehension (e.g., punch, lick, and kick) necessarily engages the corresponding somatotopic ventral-to-dorsal organization of mouth/face, hand/arm, and foot/leg effectors within primary motor cortex (M1) and premotor cortex (PMC; e.g., Pulvermüller, 2005, 2018; Kemmerer, 2015). However, the reliance on null hypothesis testing to demonstrate presence versus absence of activity means such studies are unable to selectively attribute motor area engagement during word comprehension to a specific mechanism(s). This is because language comprehension involves multiple stages of processing (e.g., phonological, lexical, morphological, semantic). The overwhelming majority of studies employed contrasts of words versus low-level resting or visual character baselines (e.g., hashes).

Criticisms of the neurobiological evidence cited in support of semantic somatotopy have focused mainly on inconsistencies in the brain stimulation, lesion, and neuroimaging evidence for the proposed overlap between action word comprehension and somatotopic organization of the motor cortices (e.g., Argiris et al., 2020; Pritchett, Hoeflin, Koldewyn, Dechter, & Fedorenko, 2018; Reilly & Desai, 2017; Watson, Cardillo, Ianni, & Chatterjee, 2013; Papeo, Vallesi, Isaja, & Rumiati, 2009; Postle, McMahon, Ashton, Meredith, & de Zubicaray, 2008). For example, Postle et al. (2008) reviewed the evidence from early neuroimaging studies, noting there was little overlap between reported peak activity during effector-specific action word comprehension (e.g., punch, lick, kick) and cytoarchitectonic probability maps of M1 and PMC, and that other classes of words (e.g., concrete nouns) reliably engage motor areas, indicating motor cortex activation during word comprehension is not selective for verbs. They further tested this hypothesis by conducting an fMRI study using ROIs from action execution and observation localizer tasks within PMC and M1, failing to find evidence for somatotopically organized action word activity. Postle et al.'s (2008) results were subsequently replicated in Dutch by Schuil, Smits, and Zwaan (2013) and by Zhang, Sun, and Wang (2018) in Chinese.¹

Whether motor areas should be defined via cytoarchitectonics and/or neuroimaging functional localizers remains an open question. Postle et al. (2008) originally recommended cytoarchitectonic probability maps be employed in conjunction with functional localizers for action observation and execution to test claims of direct action–semantic links as there is no macro-anatomical landmark to delineate the boundary between anterior PMC and pFC. Kemmerer (2015) noted more of the reported action word peaks could be included in the Human Motor Area Template (HMAT; Mayka, Corcos, Leurgans, & Vaillancourt, 2006), a probability map based on neuroimaging studies of action execution functional localizers that encompassed a much larger proportion of (especially prefrontal) cortex than the cytoarchitectonic maps. A more recent meta-analysis by Courson and Tremblay (2020) showed little overlap between manual action language and functional localizer maps for manual action execution. Some advocates of motor simulation/re-enactment/neural re-use have also proposed that retrieval of action word meanings might involve slightly different regions immediately adjacent to those involved in actual execution/observation, and reuse may not be complete and may vary considerably across tasks and contexts, somewhat weakening the original claims for direct action–semantic links (see Barsalou, 2016).

By contrast, neuroimaging studies of phonological processing have shown much more consistent motor area activity. Of 49 peaks reported in motor areas across nonword fMRI studies, 39 (80%) were located reliably in cytoarchitectonically defined PMC or M1 (de Zubicaray et al., 2013). This is strong evidence that motor areas instead respond selectively to sublexical phonological features, as nonwords, by definition, do not have a specific meaning—or any meaning—in the lexicon, despite conforming to English phonotactic rules. In the same study, the present authors sought to test the hypothesis that PMC and M1 engagement during action word comprehension was likely to reflect statistical regularities in sound-to-meaning correspondences. In particular, Arciuli and Cupples' (2006, 2007; see also Arciuli & Monaghan, 2009) corpus analyses showed nonmorphological orthographic cues in words' beginnings (the letters corresponding to the onset and first vowel) and also in words' endings (the letters corresponding to the rime of final syllable) were highly predictive of grammatical category (verb vs. noun). Using fMRI and a grammatical judgment task, we showed that disyllabic words denoting manual actions evoked increased motor cortex activity compared with non-body-part-related nouns, overlapping the activity evoked by observing and executing hand movements in conjunction analyses (de Zubicaray et al., 2013). Critically, a conjunction analysis showed a contrast of disyllabic nonwords containing endings with probabilistic cues predictive of verb versus noun status also overlapped with this activity. Our interpretation of this overlap in motor cortex activity was that it reflected common processing of probabilistic ortho-phonological regularities across both words and nonwords, rather than selective processing of action word meanings.

A post hoc explanation of our findings was offered in an attempt to maintain compatibility with the embodied account. According to Pulvermüller (2018; Footnote 14), the disyllabic nonwords (e.g., olverve, lurdasm) were able to partially activate “motor systems by way of the action semantic links of their phonologically and orthographically related lexical items”. The grammatical judgment task employed was also proposed to have encouraged participants to strategically align the nonwords with their ortho-phonological neighboring action words (Lin et al., 2015). In our view, this proposal is not a viable explanation for the findings reported in de Zubicaray et al. (2013) because (1) only the endings of the nonwords were manipulated to include probabilistic ortho-phonological cues, (2) the cues in the nonword endings were to grammatical class more generally and not specifically to manual verbs, and (3) the word and nonword lists were deliberately constructed such that the items had minimal overlap in either their beginnings or endings (< 10%), that is, the stimuli did not comprise an orthographic or phonological similarity neighborhood (cf. Pulvermüller, 2018). We will return to this issue in the Discussion section.

This Study

The purpose of this study was to provide additional neurobiological evidence for an alternate role for motor cortical areas during action word comprehension. Our hypothesis—broadly stated—is that motor area activation during action word comprehension is more likely to reflect processing of sublexical phonological statistical regularities embedded in action words, rather than selective processing of their meanings. A corollary to this is that motor cortex activity will likely reflect implicit processing of probabilistic sublexical phonological cues to grammatical category to some extent, particularly as verbs are the stimuli of interest in semantic somatotopy studies (e.g., de Zubicaray et al., 2013).

We tested this hypothesis in an fMRI experiment presenting participants with monosyllabic verbs denoting effector-specific actions performed with mouth/face, hand/arm and foot/leg, and monosyllabic nonwords. We matched the nonwords to the effector-specific verbs according to the sequential arrangements of their constituent phonological units. We accomplished this via two statistical measures used to estimate phonotactic probability (PP): (1) positional segment frequency, for example, /s/, and (2) segmental sequence (i.e., biphone) frequency, for example, /s^/, within words (Vitevitch & Luce, 2004).
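The two PP measures can be illustrated with a minimal Python sketch. It is not the procedure used to construct the stimuli: the toy lexicon, the placeholder phoneme symbols, and the function names are all hypothetical, and the counts are unweighted, which is a simplification relative to published phonotactic probability calculators. The sketch only shows how position-specific segment and biphone probabilities are summed for a candidate item.

```python
from collections import Counter

# Hypothetical toy lexicon: each word is a tuple of placeholder phoneme symbols.
TOY_LEXICON = [
    ("k", "I", "k"),
    ("k", "I", "s"),
    ("s", "I", "p"),
    ("p", "I", "k"),
]


def positional_tables(lexicon):
    """Count segments and biphones by position across the lexicon."""
    seg, seg_tot = Counter(), Counter()
    bi, bi_tot = Counter(), Counter()
    for word in lexicon:
        for i, ph in enumerate(word):
            seg[(i, ph)] += 1
            seg_tot[i] += 1
        for i in range(len(word) - 1):
            bi[(i, word[i], word[i + 1])] += 1
            bi_tot[i] += 1
    return seg, seg_tot, bi, bi_tot


def phonotactic_probability(item, lexicon):
    """Return (sum of positional segment probabilities,
    sum of positional biphone probabilities) for one item."""
    seg, seg_tot, bi, bi_tot = positional_tables(lexicon)
    seg_sum = sum(
        seg[(i, ph)] / seg_tot[i] for i, ph in enumerate(item) if seg_tot[i]
    )
    bi_sum = sum(
        bi[(i, item[i], item[i + 1])] / bi_tot[i]
        for i in range(len(item) - 1)
        if bi_tot[i]
    )
    return seg_sum, bi_sum
```

Under these assumptions, a word and a candidate nonword would count as matched when both sums are approximately equal and the items share the same number of phonemes.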

We employed an auditory lexical decision task as it necessitates full lexical processing and provides complementary nonword rejection data (Vitevitch & Luce, 2016; Goldinger, 1996). Unlike grammatical judgment involving both nouns and verbs, it does not direct attention to lexical categories. Rather, it involves implicit processing of phonotactic probabilities. We also defined motor cortical areas using the HMAT (Mayka et al., 2006) rather than cytoarchitectonic probability maps (see Kemmerer, 2015). In addition, we included the left inferior frontal gyrus (IFG) as an ROI given its proposed role in action understanding in mirror neuron embodied accounts (see Caramazza, Anzellotti, Strnad, & Lingnau, 2014) and as a number of meta-analyses of action word comprehension reported mouth/face verb-related activity in this region (Kemmerer, 2015; de Zubicaray et al., 2013; Watson et al., 2013; Carota et al., 2012). We tested the extent of spatial overlap in BOLD activation between the effector-specific action words and their matched nonwords via conjunction analyses (Friston, Penny, & Glaser, 2005) to determine whether the effects observed during comprehension are selectively semantic or reflect a common processing component of phonotactic probabilities. Finally, we also tested whether an orthogonal regressor comprising the PT values of the action words from Monaghan et al.'s (2010) corpus analysis of nouns and verbs could elicit BOLD signal responses in the motor areas and IFG during word comprehension, to determine the impact of processing probabilistic phonological cues to grammatical category when varying effector word type.

METHODS

Participants

Eighteen participants (8 women) were recruited from among University of Queensland students and staff. One participant withdrew voluntarily before completing the imaging experiment. The remaining 17 participants (8 women) had a mean age of 23.9 years (range: 18–37 years). All were right-handed, native English speakers with self-reported normal hearing and no history of neurological or psychiatric disorder or substance dependence. All had normal or corrected-to-normal vision and provided informed consent according to the protocol approved by the medical research ethics committee of the University of Queensland. Participants were reimbursed AUD 30 for participating.

Materials

All stimuli and materials are publicly available via the Open Science Framework at https://osf.io/acpn9/. The critical stimuli comprised 90 monosyllabic verbs divided equally among effectors associated with the actions being referenced, that is, 30 mouth/face, 30 hand/arm and 30 leg/foot (e.g., laugh, stir, march), and 90 monosyllabic nonwords matched for PP. All the words were selected from a subset of stimuli employed previously by neuroimaging studies for the purpose of investigating embodiment/grounding of semantic representations in motor cortex areas, and referenced both uni- or bimanual/bipedal movements (e.g., Moseley, Pulvermüller, & Shtyrov, 2013; Hauk & Pulvermüller, 2011; Willems, Toni, Hagoort, & Casasanto, 2010; Hauk, Johnsrude, & Pulvermüller, 2004; Pulvermüller, Lutzenberger, & Preissl, 1999).

Verbs were selected without reference to probabilistic phonological cues to grammatical class (i.e., PT; Monaghan et al., 2010) and matched across effector types on a range of psycholinguistic variables including letter length, lexical frequency (log SUBTLWF; Brysbaert & New, 2009), orthographic and phonological Levenshtein distances (OLD, PLD), dominant part of speech relative to total frequency (Brysbaert, New, & Keuleers, 2012), mean bigram frequency and phonemes according to the on-line English Lexicon Project (Balota et al., 2007), phonological neighborhood density (unstressed; Vaden, Halpin, & Hickok, 2009), imageability (Cortese & Fugett, 2004), and age of acquisition (Kuperman, Stadthagen-Gonzalez, & Brysbaert, 2012). The words were also matched on mean semantic neighborhood density (Reilly & Desai, 2017) and associative strength (De Deyne, Navarro, Perfors, Brysbaert, & Storms, 2019; see Table 1). For each word, a PT value was next obtained from Monaghan et al.'s (2010) CELEX English corpus (Baayen, Piepenbrock, & Gulikers, 1995) analysis that reflected its average Euclidean distance to all verbs versus all nouns based upon phonological feature overlap. Typical verbs have overall shorter distances to other verbs, and longer distances to nouns, and vice versa. An ANOVA on the PT values for the different effector types revealed a significant difference, F(2, 87) = 6.44, MSE = .003, p = .002, ηp² = .13. Post hoc comparisons revealed this was because of foot words having higher PT values than both mouth (t = 3.53, p = .001) and hand words (t = 2, p = .043). Note that as the PT values are calculated as noun–verb distance, negative values mean the words are more verb-like. Hence, the positive PT values for foot action words indicate they are significantly less verb-like than their counterparts.
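The logic of a distance-based PT score can be sketched in a few lines of Python. This is an illustration only, not Monaghan et al.'s (2010) implementation: the feature vectors and function names are hypothetical, and the sign convention (mean distance to verbs minus mean distance to nouns) is an assumption chosen so that negative values indicate more verb-like phonology, in line with the description above.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_distance(target, others):
    """Average distance from the target vector to a set of vectors."""
    return sum(euclidean(target, o) for o in others) / len(others)


def phonological_typicality(target, verbs, nouns):
    """Assumed convention: mean distance to verbs minus mean distance
    to nouns, so negative scores mean the item sits closer to verbs."""
    return mean_distance(target, verbs) - mean_distance(target, nouns)


# Hypothetical two-dimensional "phonological feature" vectors.
toy_verbs = [(0.0, 0.0), (0.0, 1.0)]
toy_nouns = [(3.0, 0.0), (3.0, 1.0)]

verb_like = phonological_typicality((0.0, 0.5), toy_verbs, toy_nouns)
noun_like = phonological_typicality((3.0, 0.5), toy_verbs, toy_nouns)
```

In this toy example, `verb_like` is negative and `noun_like` is positive. A fuller implementation would presumably exclude the target word itself from the comparison set.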

Table 1.
Psycholinguistic Properties of the Word Stimuli

Linguistic Variable                 Effector Words
                                    Mouth             Hand               Leg
Length                              4.37 (.62)        4.4 (.68)          4.6 (.68)
Frequency                           2.59 (.78)        2.64 (.62)         2.5 (.59)
Phonemes                            3.37 (.77)        3.53 (.62)         3.87 (.68)
OLD                                 1.58 (.29)        1.48 (.3)          1.55 (.27)
PLD                                 1.3 (.29)         1.23 (.25)         1.36 (.29)
Phonological neighborhood density   23.53 (12.32)     23.7 (11.0)        19.5 (10.64)
Age of Acquisition                  6.01 (1.99)       6.36 (2.08)        7.22 (2.04)
Imageability                        4.5 (.8)          4.56 (.68)         4.47 (.81)
Dominant part of speech             0.74 (.18)        0.76 (.15)         0.77 (.14)
Semantic neighborhood density       89.8 (67.52)      1114.89 (100.03)   117.62 (77.84)
Associative strength                105.47 (90.59)    97.83 (73.01)      80.69 (83.53)
Phonological typicality             −0.049 (.05)      −0.027 (.055)      0.004 (.065)

Refer to text for normative sources. Values are means with SDs in parentheses.

We conducted a separate rating study with an independent group of 48 right-handed, native English-speaking participants (mean age 19.75 years, range: 18–33). Participants were undergraduate speech pathology students from the University of Sydney who completed the ratings study for course credit. The written action words were randomized within a single list and rated according to the body part associated with performing the action. The mean agreement according to each effector type is shown in Figure 1.²

Figure 1. 

Mean percent agreement for the word stimuli according to action meaning. An independent group of participants (n = 48) assigned each word to the body part they associated with performing the designated action. The action word meanings are clearly dissociated with respect to their different effectors. Error bars are SEMs.

The 90 monosyllabic nonwords (e.g., peff, spee, nitch) were constructed to match the 90 action words according to their PP (Vitevitch & Luce, 2004). This involved first converting the Australian English pronunciation of each word to Klattese, a machine-readable version of the International Phonetic Alphabet. Specifically, we matched nonwords to words in terms of both number of phonemes and consonant–vowel structure (importantly, where there was a consonant cluster at the beginning of the word, we ensured there was a consonant cluster at the beginning of the nonword), and matched their sum of all phoneme probabilities and sum of all biphone probabilities (Vitevitch & Luce, 2004; see Table 2). We employed this approach because English orthography and phonology follow the alphabetic principle with semiregular mapping between graphemes (letters) and phonemes (sounds). By matching nonwords to the sequential arrangements of the action words' constituent phonological and orthographic features, we ensured their constituent features occurred with approximately the same frequency and position. The auditory distractors were then recorded by a male native Australian English speaker in an anechoic chamber, and were edited and normalized, with DC offset removed, using Audacity software (audacity.sourceforge.net). We confirmed the successful matching via ANOVAs on phonotactic probabilities with factors Lexical Condition and Effector Type (all Fs < 2.6, ps > .5) and on articulatory durations using the same factors (all Fs < 0.5, ps > .5).

Table 2.
Phonotactic and Articulatory Duration Properties of the Nonwords

Condition   Effector      Phoneme Probabilities   Biphone Probabilities   Durations (msec)
Words       Mouth/face    .171 (.059)             .01 (.007)              516 (68)
            Hand/arm      .164 (.057)             .011 (.008)             508 (82)
            Foot/leg      .195 (.054)             .026 (.049)             516 (85)
            Total         .177 (.057)             .016 (.03)              514 (78)

Nonwords    Mouth/face    .167 (.059)             .01 (.008)              518 (91)
            Hand/arm      .164 (.053)             .011 (.007)             522 (73)
            Foot/leg      .198 (.062)             .014 (.01)              525 (92)
            Total         .177 (.06)              .012 (.009)             522 (85)

Verbs and verb-like nonwords were randomly distributed into 18 different lists of 180 items using the Mix program (van Casteren & Davis, 2006). Within each list, the randomization was constrained such that stimuli from either the same lexical condition (word, nonword) or effector type (mouth/face, hand/arm, leg/foot) would not appear on more than three consecutive trials and stimuli with the same initial phoneme would not appear consecutively. A 30-db attenuating electrodynamic headset was used to reduce gradient noise and present auditory stimuli (MR Confon GmbH). Stimulus presentation and response recording were accomplished via the Cogent 2000 toolbox extension (www.vislab.ucl.ac.uk/cogent_2000.php) for MATLAB (2014, The MathWorks, Inc.).
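The ordering constraints described above can be sketched as a greedy procedure with restarts. This is not the Mix program's algorithm; it is a hypothetical Python illustration in which each item is a dictionary with assumed keys `lexical`, `effector`, and `onset` (initial phoneme), and the function names are invented for the example.

```python
import random


def violates(seq, item, max_run=3):
    """True if appending item would repeat the previous initial phoneme or
    extend a run of the same lexical condition or effector beyond max_run."""
    if seq and seq[-1]["onset"] == item["onset"]:
        return True
    recent = seq[-max_run:]
    for key in ("lexical", "effector"):
        if len(recent) == max_run and all(r[key] == item[key] for r in recent):
            return True
    return False


def constrained_order(items, rng, max_restarts=1000):
    """Build one list order by repeatedly drawing a random item that keeps
    the constraints satisfied; restart from scratch on a dead end."""
    for _ in range(max_restarts):
        pool = list(items)
        rng.shuffle(pool)
        seq = []
        while pool:
            candidates = [x for x in pool if not violates(seq, x)]
            if not candidates:
                break  # dead end: no remaining item fits, so restart
            choice = rng.choice(candidates)
            pool.remove(choice)
            seq.append(choice)
        if not pool:
            return seq
    raise RuntimeError("no valid order found within max_restarts")
```

Calling `constrained_order(items, random.Random(seed))` with a different seed for each list would yield a set of distinct pseudorandom orders that all respect the same constraints.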

Procedure

The lexical decision task involved participants each being presented with 180 auditory stimuli comprising the verbs and verb-like nonwords, split into two consecutive runs/sessions of 90 trials. For each trial, a fixation point (crosshair) appeared at the center of the screen for 150 msec, followed by the auditory presentation of the word/nonword up to 750 msec, and then a blank screen for 800 msec; all of which occurred during a 1700-msec silent period in which no image volumes were acquired (see Figure 2 and Image Acquisition section below). Participants were instructed to withhold their response until the response options appeared. Next, the response options “word” and “nonword” were presented on either side of the center of the screen, remaining for up to 2000 msec depending on the speed of the participant's response. This served both as a prompt to respond and to indicate which button should be pressed for a given response, as the left/right positions of the response options were randomized and counterbalanced across trials to prevent consistent response mappings (de Zubicaray et al., 2013; Pulvermüller et al., 1999). Participants responded using their right hand by pressing one of two buttons corresponding to their decision on a similarly arranged response pad. The selected option changed color to red for 200 msec to provide response feedback, and a blank screen was presented for the remainder of the 2000-msec period. The trial timed out if a response was not made within the 2000-msec period. Thus, each trial lasted for 3700 msec. A blank intertrial interval was jittered pseudorandomly using three different delays of 0, 3700, and 7400 msec to optimize the estimation of the BOLD response to each trial and provide a baseline measure. While in the bore of the MRI system and before the experimental run, participants were provided with a brief practice version of the task involving a pseudorandomized series of 10 trials split equally among words and nonwords. The practice stimuli were not included in the experimental task.
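The trial timing described above can be checked with a short Python sketch. The durations are taken from the text; the constant and function names are hypothetical, and the example intertrial intervals are arbitrary rather than the sequence actually presented.

```python
# Durations in msec, as described in the Procedure.
FIXATION_MS = 150
AUDIO_WINDOW_MS = 750
BLANK_MS = 800
RESPONSE_WINDOW_MS = 2000
ITI_OPTIONS_MS = (0, 3700, 7400)


def silent_period_ms():
    """Fixation, auditory stimulus window, and blank screen."""
    return FIXATION_MS + AUDIO_WINDOW_MS + BLANK_MS


def trial_duration_ms():
    """Silent period plus the response window."""
    return silent_period_ms() + RESPONSE_WINDOW_MS


def trial_onsets(itis_ms):
    """Onset of each trial, given the intertrial interval following each trial."""
    onsets, t = [], 0
    for iti in itis_ms:
        if iti not in ITI_OPTIONS_MS:
            raise ValueError(f"unexpected intertrial interval: {iti}")
        onsets.append(t)
        t += trial_duration_ms() + iti
    return onsets
```

Here `silent_period_ms()` returns 1700 and `trial_duration_ms()` returns 3700, matching the values given in the text.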

Figure 2. 

Experimental design. (A) Behavioral trial presentation and (B) fMRI acquisition.

Image Acquisition

Imaging data were acquired using a 3 T Siemens MAGNETOM Trio TIM System (Siemens Medical Solutions) equipped with a standard 12-channel Matrix head coil. A point-spread function mapping sequence was acquired to correct geometric distortions in the EPI data (Zaitsev, Hennig, & Speck, 2004). High-resolution T1-weighted structural images were acquired using a magnetization-prepared rapid acquisition gradient-echo sequence (512 × 512 matrix, in plane resolution .45 × .45 mm, 192 slices, slice thickness .9 mm, 9° flip angle, inversion time = 900 msec, repetition time [TR] = 1900 msec, echo time = 2.32 msec). A fast sparse image acquisition protocol was next employed after previous fMRI studies in which auditory speech sound stimuli are presented during silent gaps between scans (e.g., Zhuang, Tyler, Randall, Stamatakis, & Marslen-Wilson, 2014; Minicucci, Guediche, & Blumstein, 2013). Functional T2*-weighted images depicting BOLD contrast were acquired using a gradient-echo EPI sequence (36 slices, TR = 3840 msec, echo time = 36 msec, 64 × 64 matrix, 3.3 × 3.3 mm in plane resolution, 3-mm slice thickness with .3-mm gap, and 80° flip angle) in two consecutive functional imaging runs comprising 182 volumes apiece. Each EPI volume acquisition was obtained in 2140 msec, followed by 1700 msec silence during which the auditory stimulus was presented, yielding an effective volume TR of 3840 msec.
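The sparse-sampling timing follows directly from the acquisition and silent-gap durations. The run length below is a derived figure that assumes no volumes beyond the 182 reported per run:

```latex
\begin{align*}
TR_{\text{effective}} &= 2140\ \text{msec (acquisition)} + 1700\ \text{msec (silence)} = 3840\ \text{msec} \\
T_{\text{run}} &= 182 \times 3.84\ \text{s} = 698.88\ \text{s} \approx 11.6\ \text{min}
\end{align*}
```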

Image Analysis

The functional and structural image volumes from each participant were preprocessed and analyzed with statistical parametric mapping software (SPM12; Wellcome Department of Imaging Neuroscience). Each functional time series was first slice-timing corrected for the sequential acquisition and then realigned to the first image of the initial series using the INRIAlign toolbox (Freire, Roche, & Mangin, 2002). A mean image was generated and used to coregister the realigned series to the T1-weighted image. The Segment procedure was next applied to the T1-weighted image, and the DARTEL toolbox (Ashburner, 2007) was employed to create a custom group template from the gray and white matter images. The resulting individual flow fields were used to normalize the realigned fMRI volumes to the Montreal Neurological Institute (MNI) atlas T1 template. Normalized images were resampled to 2-mm³ voxels and smoothed with an 8-mm FWHM isotropic Gaussian kernel.

Statistical analyses were conducted according to two-level, mixed effects models. At the first level, we conducted two separate fixed effects analyses on each participant's data. In both, trial types corresponding to correctly classified stimuli (word/nonword) were defined and modeled as effects of interest, with delta functions representing each auditory stimulus onset, in addition to nuisance regressors consisting of onsets for the delayed responses based on the participants' RTs (to permit condition-specific effects to be estimated) and trials involving misclassifications/omissions. In the second set of fixed effect analyses, we additionally included the PT value for each word as a first order polynomial parametric modulator that was orthogonal to the main trial regressor within each effector type. This allowed us to separate the main effect of word comprehension from the impact of PT processing within each effector word type (Mumford, Poline, & Poldrack, 2015).
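The idea behind the orthogonal parametric modulator can be illustrated with a simplified Python sketch. It is not SPM12 code: it operates on trial amplitudes before convolution, the function name and the example PT values are hypothetical, and it assumes that mean-centering the PT values within each effector type is what renders a first-order modulator orthogonal to the unmodulated trial regressor.

```python
from collections import defaultdict


def build_modulators(trials):
    """trials: iterable of (effector, pt_value) pairs for correct word trials.

    Returns {effector: (main, modulator)}, where main holds unit amplitudes
    and modulator holds PT values mean-centred within that effector type.
    """
    by_effector = defaultdict(list)
    for effector, pt in trials:
        by_effector[effector].append(pt)

    regressors = {}
    for effector, values in by_effector.items():
        mean = sum(values) / len(values)
        main = [1.0] * len(values)
        modulator = [v - mean for v in values]  # sums to zero within type
        regressors[effector] = (main, modulator)
    return regressors


# Hypothetical PT values for a handful of trials.
example = build_modulators(
    [("mouth", -0.08), ("mouth", -0.02), ("leg", 0.01), ("leg", 0.05), ("leg", -0.03)]
)
```

Because each modulator sums to zero within its effector type, its dot product with the unit-amplitude main regressor is zero, so the main regressor captures the average response to the words and the modulator captures only PT-related variation around that average.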

Trial onsets were convolved with a canonical hemodynamic response function. Low-frequency noise and signal drift were removed from the time series in each voxel with high pass filtering (1/128 Hz). Temporal autocorrelations were estimated and removed with an autoregressive (AR1) model. Linear contrasts were applied to each participant's parameter estimates for each experimental condition of interest relative to baseline and then entered in second-level group repeated-measures ANOVAs in which covariance components were estimated using a restricted maximum likelihood procedure to correct for nonsphericity (Friston et al., 2002).

As we had a priori hypotheses concerning BOLD signal responses in left hemisphere motor regions, we restricted the second-level analyses to explicit masks of M1 and PMC from the HMAT (Mayka et al., 2006) and the IFG (Hammers et al., 2003). Results of whole brain analyses are reported in the Supplementary Material (available at https://osf.io/acpn9/). Unless otherwise indicated, across all analyses, a height threshold of p < .001 was adopted with a spatial cluster extent threshold of p < .05 (FWE corrected via the Bonferroni procedure implemented in SPM12; see Eklund, Nichols, & Knutsson, 2016; Woo, Krishnan, & Wager, 2014).

We first performed planned t contrasts examining activity within each word effector type condition versus the low-level implicit baseline to identify peak maxima in a manner consistent with the reporting in prior studies of embodied language (e.g., Hauk & Pulvermüller, 2011; Willems et al., 2010; Boulenger, Hauk, & Pulvermüller, 2009; Hauk et al., 2004). We then repeated these contrasts for each nonword effector type condition. To identify voxels that were responsive to both words and nonwords (i.e., those not involved selectively in processing word meaning) within each effector type, we employed a conjunction analysis (i.e., word ∩ nonword). We tested the conjunction null as defined by Nichols, Brett, Andersson, Wager, and Poline (2005), allowing us to infer that there is a conjunction of all word and nonword effects (i.e., k = n; see Friston et al., 2005). We next directly compared word versus nonword contrasts according to effector type to identify any regions maximally or selectively responsive to either type of stimuli.
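The logic of testing the conjunction null can be sketched with a minimum-statistic example in Python. The function name, the t values, and the threshold of 3.5 are hypothetical and chosen only for illustration; they are not the values or thresholds used in the analyses reported here.

```python
import numpy as np

def conjunction_min_statistic(t_maps, threshold):
    """Boolean mask of voxels whose smallest t value exceeds the threshold.

    Under the conjunction null, a voxel counts only if every contrast is
    individually significant, which is equivalent to thresholding the
    voxelwise minimum across contrasts.
    """
    stacked = np.stack([np.asarray(m, dtype=float) for m in t_maps])
    return stacked.min(axis=0) > threshold

# Hypothetical t values for four voxels (not data from the study).
t_words = np.array([5.2, 4.1, 1.0, 3.9])
t_nonwords = np.array([4.8, 2.0, 6.3, 3.7])
mask = conjunction_min_statistic([t_words, t_nonwords], threshold=3.5)
```

Only the first and fourth voxels survive, because they are the only ones exceeding the threshold for both words and nonwords. A voxel with a very large response to one stimulus type alone (the third voxel) is excluded.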

A number of studies reported results of contrasts designed to identify activity maximally responsive to a specific effector type (e.g., hand/arm vs. mouth/face + foot/leg) within motor area ROIs, arbitrarily adopting more liberal primary alpha and cluster thresholds of p < .005 (uncorrected) and k voxels, respectively (e.g., Willems et al., 2010; Raposo, Moss, Stamatakis, & Tyler, 2009; Rüschemeyer, Brass, & Friederici, 2007; Tettamanti et al., 2005; Hauk et al., 2004). We therefore adopted a similar arbitrary approach with thresholds of p < .005 (uncorrected) and clusters of > 20 voxels for these contrasts to ensure comparability of our results (see Note 3). Specifically, we conducted conjunction analyses to identify voxels showing maximal responses for both words and nonwords to a specific effector type (e.g., Mouth/Face > [Hand/Arm + Foot/Leg] words ∩ Mouth/Face > [Hand/Arm + Foot/Leg] nonwords). We tested the global null as recommended by Friston et al. (2005) to allow us to infer whether word and nonword effector-specific contrasts were consistently high and jointly significant.

In the final analysis, we performed an ANOVA on the orthogonalized PT regressor contrasts according to action word effector type. As the PT values differed significantly according to effector type, we employed a contrast to reflect these differences accordingly (i.e., mouth < hand < leg; see Table 1). We extracted mean percent BOLD signal changes from significant clusters using the MarsBar toolbox in SPM12 (v0.44; Brett, Romain Valabregue, & Poline, 2002). Results were rendered on cortical surfaces using SPM12 and Connectome Workbench (v1.4.2; S1200 Group Average data set).

RESULTS

Behavioral Data

For the accuracy data, we conducted repeated-measures ANOVAs with Lexicality (word, nonword) and Effector Type (mouth/face, hand/arm, foot/leg) as within-participants factors. Where Mauchly's test indicated that the assumption of sphericity had been violated, we corrected degrees of freedom using Greenhouse–Geisser estimates of sphericity. Significant main effects of Lexicality, F(1, 16) = 23.54, p < .001, ηp² = .60, and Effector Type, F(2, 32) = 5.62, p = .008, ηp² = .26, were observed. Mauchly's test was significant for the interaction, χ²(2) = 8.4, p = .015, so the Greenhouse–Geisser correction was applied (ε = .70); the interaction was also significant, F(1.4, 32) = 6.02, p = .014, ηp² = .27. Words were responded to more accurately than nonwords, as is usually the case with lexical decision (see Table 3). Follow-up ANOVAs within each lexical condition failed to reveal a significant main effect of Effector Type for words, F(2, 32) = 2.75, p = .08, ηp² = .15, but did reveal a significant main effect for nonwords, F(2, 32) = 9.41, p = .001, ηp² = .37. Paired t tests indicated mouth/face nonwords were responded to more accurately than hand/arm nonwords, t(16) = 5.05, p < .001, but not foot/leg nonwords, t(16) = 1.89, p = .076, whereas foot/leg nonwords were responded to more accurately than hand/arm nonwords, t(16) = 2.19, p = .04.
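The partial eta squared values reported above can be recovered from each F ratio and its degrees of freedom through the standard identity: partial eta squared = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies it to two of the reported effects; the function name is ours and is used only for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Values reported in the text for the accuracy ANOVA.
eta_lexicality = partial_eta_squared(23.54, 1, 16)   # main effect of Lexicality
eta_effector = partial_eta_squared(5.62, 2, 32)      # main effect of Effector Type
```

Rounded to two decimals, these reproduce the reported values of .60 and .26. Because a Greenhouse–Geisser correction scales both degrees of freedom by the same factor, it leaves this quantity unchanged.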

Table 3. Mean RTs (msec) and Accuracy Rates from the Lexical Decision Task

                           Effector Type
Condition      Mouth/Face    Hand/Arm     Foot/Leg     Total
Words
  RT           594 (97)      600 (96)     597 (89)     597 (91)
  % Correct    95.7 (.04)    94.3 (.07)   91.4 (.07)   93.8 (.04)
Nonwords
  RT           639 (145)     677 (134)    647 (116)    654 (126)
  % Correct    91.0 (.06)    83.5 (.09)   87.5 (.09)   87.3 (.07)

SDs in parentheses.

For the RT data, errors were first excluded, and outlier responses below 200 msec and above 2000 msec were removed. This resulted in 9% of trials being removed overall (i.e., the analysis was conducted on 91% of trials). We performed the same analyses as for the accuracy data. No violations of the assumption of sphericity were detected for the main effects or interaction via Mauchly's test. Although a significant main effect of Lexicality was observed, F(1, 16) = 18.45, p = .001, ηp² = .54, neither the main effect of Effector Type, F(2, 32) = 2.35, p = .11, ηp² = .13, nor the interaction, F(2, 32) = 2.04, p = .15, ηp² = .11, was significant. Words were responded to more quickly than nonwords (mean difference = 57 msec), as is usually the case with lexical decision (Table 3).
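The RT screening described above can be expressed as a simple filter. The Python sketch below is illustrative only: the function name, the trial structure, and the example trials are hypothetical, and only the 200-msec and 2000-msec bounds come from the text.

```python
def trim_rts(trials, lower=200, upper=2000):
    """Keep correct trials whose RT (msec) lies within [lower, upper]."""
    return [t for t in trials if t["correct"] and lower <= t["rt"] <= upper]

# Hypothetical trials (not data from the study).
trials = [
    {"rt": 150, "correct": True},    # too fast: removed
    {"rt": 620, "correct": True},    # kept
    {"rt": 700, "correct": False},   # error: removed
    {"rt": 2300, "correct": True},   # too slow: removed
    {"rt": 580, "correct": True},    # kept
]
kept = trim_rts(trials)
retained = len(kept) / len(trials)   # proportion of trials entering the analysis
```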

Imaging Data

Contrasts of Word or Nonword Effector Type versus Baseline

Within the HMAT delineated regions, we observed significant BOLD signal responses in separate peaks in lateral (precentral gyrus) and medial (SMA) cortex for all comparisons of the word and nonword stimuli versus baseline (see Table 4; Figure 3). In addition, all the BOLD signal responses represented increases relative to baseline, with none of the reverse contrasts (i.e., stimulus < baseline) revealing significant activity. The spatial extent of this activity (i.e., k voxels) was consistently larger for nonwords than words, although, in both cases, the majority of the voxels in the masks showed significant responses. The patterns of signal increase and spatial extent were identical for the left IFG ROI.

Table 4. MNI Coordinates for Peak BOLD Signal Responses in Motor Areas and IFG

t Contrast / Region                  Peak MNI (x y z)    Z Score    Cluster (Voxels)

Mouth/Face words > baseline
  Left precentral gyrus              −58 26              7.82       3733
  Left supplementary motor area      −6 50               Inf        568
  Left IFG                           −32 26              Inf        1146

Mouth/Face nonwords > baseline
  Left central operculum             −52                 Inf        3918
  Left supplementary motor area      −4 50               Inf        639
  Left IFG                           −36 22              Inf        1309

Mouth/Face words > baseline ∩ Mouth/Face nonwords > baseline
  Left precentral gyrus              −58 26              7.82       3675
  Left supplementary motor area      −6 50               Inf        563
  Left IFG                           −36 26              Inf        1138

Hand/Arm words > baseline
  Left precentral gyrus              −58 28              Inf        3749
  Left supplementary motor area      −4 50               Inf        585
  Left IFG                           −34 14              Inf        1058

Hand/Arm nonwords > baseline
  Left precentral gyrus              −56                 Inf        3645
  Left supplementary motor area      −4 50               Inf        603
  Left IFG                           −32 26              Inf        1326

Hand/Arm words > baseline ∩ Hand/Arm nonwords > baseline
  Left precentral gyrus              −58 28              Inf        3544
  Left supplementary motor area      −4 50               Inf        539
  Left IFG                           −36 26              Inf        1049

Foot/Leg words > baseline
  Left precentral gyrus              −58 28              Inf        3779
  Left supplementary motor area      −4 50               Inf        575
  Left IFG                           −36 14              Inf        1089

Foot/Leg nonwords > baseline
  Left precentral gyrus              −58 −16 24          Inf        3825
  Left supplementary motor area      −4 50               Inf        641
  Left IFG                           −34 14              Inf        1317

Foot/Leg words > baseline ∩ Foot/Leg nonwords > baseline
  Left precentral gyrus              −58 28              Inf        3670
  Left supplementary motor area      −4 50               Inf        567
  Left IFG                           −36 14              Inf        1088

p < .001 and p < .05 (FWE cluster-corrected) within the HMAT or left IFG ROIs.

Figure 3. Renderings of the left hemisphere cortical surface showing significant BOLD signal responses for the word and nonword stimuli relative to baseline according to effector type and conjunctions of same within the HMAT and IFG ROIs (p < .001 and p < .05, FWE cluster-corrected).

Conjunction Analyses of Words and Nonwords within Effector Type

We observed significant BOLD signal responses for all conjunction analyses within the HMAT delineated regions. Furthermore, the peak maxima for all the conjunction analyses were identical to those detected for the comparisons of the word stimuli versus baseline (see Table 4 and Figure 3). The spatial extents of the activations identified by the conjunctions were also similar, although with slightly fewer voxels in each case. Note this result is likely to reflect the conservative nature of the conjunction analysis employed with respect to corrections for multiple comparisons using the conjunction null (Friston et al., 2005). Following the logic of conjunctions (Friston et al., 2005; Nichols et al., 2005), this indicates the HMAT regions were responding to a processing component common to both word and nonword stimuli. This pattern of response was virtually the same for the IFG for all conjunctions, with the exception of the hand/arm-related stimuli response, the location of which more closely approximated the peak for the nonwords.

Contrasts of Word versus Nonword Activity for Each Effector Type

We failed to observe any significant BOLD signal responses for contrasts of either word > nonword or nonword > word within each effector type, indicating neither type of stimuli predominated in activating the regions encompassed by the HMAT or IFG. For the broader contrasts combining all words versus all nonwords, only the left IFG showed significant signal increases for nonwords with a peak at −56, 22, −4 (Z = 5.29; k = 463 voxels).

Conjunction Analyses of “Effector-specific” Contrasts

The conjunction of Mouth/Face > (Hand/Arm + Foot/Leg) words ∩ Mouth/Face > (Hand/Arm + Foot/Leg) nonwords revealed significant activity in the HMAT and IFG. The conjunction of Hand/Arm < (Mouth/Face + Foot/Leg) words ∩ Hand/Arm < (Mouth/Face + Foot/Leg) nonwords revealed significant activity in the HMAT. Finally, the conjunction of Foot/Leg > (Mouth/Face + Hand/Arm) words ∩ Foot/Leg > (Mouth/Face + Hand/Arm) nonwords also revealed a significant peak in the HMAT. The significant results are summarized in Table 5 and displayed in Figures 4 and 5. We did not observe significant responses for any of the opposite contrasts.

Table 5. MNI Coordinates for Conjunctions of Effector-Specific Contrasts Showing Significant Motor Area or IFG Activity

t Contrast / Region                  Peak MNI (x y z)    Z Score    Cluster (Voxels)

Mouth/Face > (Hand/Arm + Foot/Leg) words ∩ Mouth/Face > (Hand/Arm + Foot/Leg) nonwords
  Left precentral gyrus              −22 −24 70          3.81       82
  Left middle frontal gyrus          −42 14 48           3.07       22
  Left IFG                           −56 22 16           3.09       40

Hand/Arm < (Mouth/Face + Foot/Leg) words ∩ Hand/Arm < (Mouth/Face + Foot/Leg) nonwords
  Left precentral gyrus              −20 −22 68          3.35       67
  Left middle frontal gyrus          −40 14 40           3.46       27

Foot/Leg > (Mouth/Face + Hand/Arm) words ∩ Foot/Leg > (Mouth/Face + Hand/Arm) nonwords
  Left superior frontal gyrus        −24 10 56           3.12       23

p < .005 and > 20 voxels (K).

Figure 4. Rendering of the left hemisphere cortical surface showing peak maxima from the conjunction analyses of effector-specific contrasts (e.g., Mouth/Face vs. [Hand/Arm + Foot/Leg] words ∩ Mouth/Face vs. [Hand/Arm + Foot/Leg] nonwords), superimposed on the HMAT (in yellow) and IFG (in orange) ROIs and color coded according to effector type.

Figure 5. Mean percent BOLD signal responses (y-axis) extracted from significant clusters in the conjunction analyses of effector-specific contrasts (e.g., Mouth/Face > [Hand/Arm + Foot/Leg] words ∩ Mouth/Face > [Hand/Arm + Foot/Leg] nonwords).

PT Values Regressor Analysis

The PT values were positively correlated with BOLD signal responses, eliciting significant linear graded activity in the HMAT in a cluster comprising 427 voxels with a Z score of 4.47, the peak of which was in the postcentral gyrus at −44, −32, and 60 (see Figure 6). Thus, more negative PT values were associated with BOLD signal reductions and vice versa. We did not observe any significant linear responses for the opposite contrast (i.e., we did not observe any negative correlations with the PT values).

Figure 6. (A) Rendering of the left hemisphere cortical surface showing significant linear graded activity elicited by the PT values for the different action words (thresholded at p < .005, uncorrected for visualization purposes), and (B) mean percent BOLD signal responses extracted from the cluster, color coded according to effector word type.

DISCUSSION

Corpus and psycholinguistic evidence show that people can and do use statistical regularities in language as probabilistic cues during comprehension. In this study, we investigated whether the engagement of motor cortical areas during action word comprehension might reflect processing of these statistical regularities, instead of grounding of conceptual content or simulation as claimed by advocates of embodied cognition. In particular, we tested whether statistical regularities in sublexical phonological features could be responsible for action words engaging PMC, M1, and IFG. This hypothesis was supported via two approaches. In the first, we showed that nonwords matched to effector-specific action words according to phonotactic probabilities activated common areas. In the second, we showed that a regressor comprising just the variance from the PT values of the action words, a measure of the degree to which a word's phonology is typical of other words in its grammatical category, accounted for a significant portion of the activation in M1 and PMC during their comprehension.

The majority of embodied cognition neuroimaging studies targeting semantic somatotopy have employed contrasts of action words with low-level baselines, interpreting the resulting motor area activation selectively in terms of grounding of conceptual content or simulation (see Kemmerer, 2015; de Zubicaray et al., 2013; Carota et al., 2012, for reviews), despite the fact that comprehension involves multiple processes (lexical, semantic, morphological, phonological). Our conjunction analyses of action words and nonwords indicate both types of stimuli when matched for phonotactic probabilities activate common voxel space in the HMAT and IFG. This was despite the use of the conjunction null that Friston et al. (2005) consider an overly conservative procedure, particularly in the context of multiple comparisons. This finding is consistent with prior work in perception and production showing nonword processing reliably engages these regions (for a review, see de Zubicaray et al., 2013). It seems that processing of probabilistic sublexical phonotactic information is sufficient to engage these cortical areas during action word comprehension, without the need to additionally propose processing of action semantics or any other mechanism. Indeed, we were unable to find any significant differences in BOLD signal responses when directly contrasting each effector-specific word set with their matched nonwords. However, when we contrasted the combined stimuli, we observed BOLD signal increases for the nonwords in the IFG. Studies of auditory lexical decision contrasting nonwords with concrete nouns have likewise reported IFG signal increases, so this result cannot be considered selective for verb semantics, and is often interpreted as reflecting the additional processing demands of parsing unfamiliar word forms (e.g., Heim, Eickhoff, Ischebeck, Supp, & Amunts, 2007).

Following previous studies investigating semantic somatotopy (e.g., Raposo et al., 2009; Tettamanti et al., 2005; Hauk et al., 2004), we next employed conjunctions of contrasts designed to elicit “effector-specific activity” by comparing each effector type against the others within our word and nonword stimuli. Notably, the peak coordinates for the overlapping BOLD signal responses in the IFG for mouth/face word and nonword stimuli (−56, 22, 16) are within 10 mm of those reported in multiple fMRI studies of action word comprehension, including Hauk et al. (2004; −50, 10, 20); Kemmerer, Gonzalez-Castillo, Talavage, Patterson, and Wiley (2008; −50, 18, 20); Pulvermüller, Kherif, Hauk, Mohr, and Nimmo-Smith (2009; −49, 11, 16); Pulvermüller, Cook, and Hauk (2012; −50, 20, 16); and Tettamanti et al. (2005; −56, 12, 12), where the activity was invariably interpreted as selectively reflecting mouth/face semantic content. Similarly, two peak coordinates for the conjunction of the contrasts of hand/arm versus other effector types were consistent with prior reports. The first peak at −20, −22, and 68 is within 10 mm of those reported by Kemmerer et al. (2008; −28, −30, 62) and Willems et al. (2010; −20, −29, 58), and the second peak at −40, 14, and 40 is within 10 mm of those reported by Kemmerer et al. (2008; −46, 10, 40) and Boulenger et al. (2009; −54, 4, 44). Finally, the conjunction of foot/leg relative to the other effectors revealed a significant peak, the coordinates of which (−24, 10, 56) are again within 10 mm of those reported by Tettamanti et al. (2005; −26, 4, 64). Overall, these findings indicate that processing of probabilistic sublexical phonotactic information is sufficient to explain the activation observed during action word comprehension for these types of effector-specific contrasts within the HMAT and IFG ROIs, without the need to propose an additional semantic component.

Might there be a way to reconcile the above results with the embodied account? Unlike our previous investigation using disyllabic nonwords, in the current study, the action words and matched nonwords did constitute similarity neighborhoods because of their corresponding phonotactic probabilities. Could nonwords partially activating the meanings of their action word phonological neighbors have produced the BOLD signal overlaps in motor area and IFG (e.g., Pulvermüller, 2018; Lin et al., 2015)? We consider this inferential leap unlikely for several reasons. For example, there is broad consensus that spoken word recognition involves two processes, the first being activation of a phonological similarity neighborhood and, the second, competition for recognition among activated similar-sounding form-based representations. The major challenge to the engagement of “action–semantic links” in spoken word recognition is that current computational models show that it can be accomplished efficiently via merging of sublexical and lexical information without the need for semantic access (see Magnuson, Mirman, & Myers, 2013, for a review; Vitevitch & Luce, 2016). In addition, it is well established that when lexical word forms are not strongly activated because of the presentation of a nonword, the sublexical segmental and biphone frequencies control processing as reflected by the processing efficiencies observed with increased probabilistic phonotactics (Vitevitch & Luce, 2016). Even if one were to accept the premise that nonwords automatically activate lexical items from their phonological neighborhoods, despite participants being highly accurate in rejecting them as valid lexical forms, this would involve activating a variety of words/concepts, not just action meanings, and so induce additional lexical-semantic competition (e.g., along with sing, the lexical concepts for singlet, single, singular, singularity, Singapore, etc., would be activated). 
Thus, relying on a linguistic cognitive approach when performing a linguistic task such as auditory lexical decision is more efficient than relying on sensorimotor simulation or neural re-enactment approaches. Finally, given that nonwords are able to activate motor cortical areas reliably in the absence of any action words in the experimental context (see the review by de Zubicaray et al., 2013), action–semantic links are clearly unnecessary to explain nonword motor activation because, by definition, nonwords do not have a specific meaning (or any meaning) in the lexicon, despite conforming to English phonotactic rules. We therefore recommend that future studies seeking to demonstrate semantic somatotopy incorporate nonwords matched on probabilistic phonotactics in their design to test an alternative, nonsemantic explanation for motor cortex activation during action word comprehension.

The direct source of evidence that motor cortical engagement observed during action word comprehension reflects processing of statistical regularities comes from the regression analysis using the PT values of the effector-specific action words themselves. This analysis allowed us to separate the contribution of PT to motor cortex activation from other processing (e.g., lexical, semantic) during word comprehension. The PT values were positively correlated with BOLD signal responses in a significant portion of both M1 and PMC. Interestingly, although we selected the effector-specific action word stimuli from prior studies without reference to PT values, they differed significantly in terms of their typicality for the category of verbs. As more negative PT values are indicative of stronger typicality to grammatical class, the mouth/face words were the most verb-typical, followed by hand/arm words, and foot/leg words the least. Hence, processing of more typical verbs was associated with reduced BOLD signal responses and vice versa. The mouth/face words also showed the highest response accuracy in the lexical decision task, consistent with prior work indicating more typical lexical items show a processing advantage (e.g., Monaghan et al., 2010). Whether this pattern of typicality is limited to the effector-specific verb stimuli in the current study or is a more general property of these classes of action words warrants further investigation at the corpus level. Regardless, our findings clearly demonstrate the necessity for future embodied cognition studies to either match action word effector types according to their PT values or include PT values as covariates, particularly when the aim is to selectively attribute motor area involvement to action word semantics. This also applies to studies with lesion patients and patients with motor system disorders such as Parkinson's disease.

Our findings are consistent with prior work showing PMC and IFG are sensitive to statistical regularities in spoken language. For example, multiple neuroimaging studies of word segmentation in continuous speech in both children and adults have demonstrated these regions are engaged when syllable sequences vary in their transitional probabilities (e.g., Karuza et al., 2013; Tremblay, Baroni, & Hasson, 2012; McNealy, Mazziotta, & Dapretto, 2010; Cunillera et al., 2009). Why might motor areas and the IFG show sensitivity to probabilistic cues in language? Considerable evidence shows these regions are critical to statistical learning more generally (e.g., Wilkinson et al., 2017; Clerget, Poncin, Fadiga, & Olivier, 2012; Tobia, Iacovella, & Hasson, 2012). Hence, it might be more appropriate to characterize roles for these areas during action word comprehension in terms of the operation of a domain general rather than language-specific mechanism. Other researchers have noted the heterogeneity of language-specific and domain general processing regions within the IFG (e.g., Fedorenko, Duncan, & Kanwisher, 2012).

More broadly, our findings have implications for all contemporary neurobiological perspectives of language that assume, either implicitly or explicitly, that there is an arbitrary relationship between the sound of a word and its meaning (e.g., Lambon Ralph et al., 2017; Glenberg & Gallese, 2012; Meteyard et al., 2012; Barsalou, 2008; Gallese & Lakoff, 2005; Pulvermüller, 2005; Zwaan, 2004). In addition to action semantics, these perspectives typically emphasize roles for brain regions in processing meaning based on data from comparisons of abstract versus concrete words and words with positive or negative valence. Yet, statistical regularities in sound-to-meaning correspondences also distinguish these classes of words across languages (Adelman et al., 2018; Tillman & Louwerse, 2018; Louwerse & Qu, 2017; Reilly et al., 2012). Accounts that explicitly assume an arbitrary relationship therefore need amending (e.g., Glenberg & Kaschak, 2002), whereas those that omit roles for systematic sound-to-meaning correspondences might be considered underspecified at best (e.g., Lambon Ralph et al., 2017; Meteyard et al., 2012; Barsalou, 2008; Pulvermüller, 2005; Zwaan, 2004). It would also be helpful for future perspectives to explicitly acknowledge language statistics elicit equal or larger effect sizes in language processing than sensorimotor simulation (e.g., Hutchinson et al., 2015) and that statistical learning plays an important role in oral and written language acquisition (Arciuli, 2017, 2018; Erickson & Thiessen, 2015).

Author Contributions

Greig de Zubicaray: Conceptualization; Data curation; Formal analysis; Funding acquisition; Investigation; Methodology; Project administration. Katie L. McMahon: Conceptualization; Data curation; Formal analysis; Funding acquisition; Investigation; Methodology; Project administration. Joanne Arciuli: Conceptualization; Methodology; Resources; Writing - Original Draft; Writing - Review & Editing.

Funding Information

Greig de Zubicaray, Australian Research Council (http://dx.doi.org/10.13039/501100000923), Grant number: FT0991634.

Acknowledgments

We are grateful to Kori Johnson and Samuel Hansen for their help acquiring the data and Padraic Monaghan for providing us with the PT values from Monaghan et al. (2010).

Reprint requests should be sent to Greig de Zubicaray, Faculty of Health, Queensland University of Technology, 110544 Victoria Park Road, 4059, Brisbane, Australia, or via e-mail: greig.dezubicaray@qut.edu.au.

Notes

1. Pulvermüller (2018) reported a reanalysis of Postle et al.'s (2018) data that entailed combining PM and M1 ROIs and omitting the execution localizer condition, apparently finding a significant interaction for the observation localizer condition. We were unable to replicate this result using the same data.

2. As the verbs were rated in written rather than auditory form, one of the hand/arm action words “tear” was phonologically and semantically ambiguous. Although the intended meaning was “to rend,” it was likely interpreted as “teer” as in “teardrop” by some participants, accounting for its 50:50 ratings across face and hand.

3. We acknowledge an arbitrary thresholding approach does not allow a principled determination of the false-positive risk in a given data set or across studies with different imaging parameters, processing, and so forth (see Eklund et al., 2016; Woo et al., 2014). However, the approach is used consistently by fMRI studies of semantic somatotopy.

REFERENCES

Adelman
,
J. S.
,
Estes
,
Z.
, &
Cossu
,
M.
(
2018
).
Emotional sound symbolism: Languages rapidly signal valence via phonemes
.
Cognition
,
175
,
122
130
.
Arciuli
,
J.
(
2017
).
The multi-component nature of statistical learning
.
Philosophical Transactions of the Royal Society B: Biological Sciences
,
372
,
20160058
.
Arciuli
,
J.
(
2018
).
Reading as statistical learning
.
Language, Speech, and Hearing Services in Schools
,
49
,
634
643
.
Arciuli
,
J.
, &
Cupples
,
L.
(
2006
).
The processing of lexical stress during visual word recognition: Typicality effects and orthographic correlates
.
Quarterly Journal of Experimental Psychology
,
59
,
920
948
.
Arciuli
,
J.
, &
Cupples
,
L.
(
2007
).
Would you rather “embert a cudsert” or “cudsert an embert”? How spelling patterns at the beginning of English disyllables can cue grammatical category
. In
A.
Schalley
&
D.
Khlentzos
(Eds.),
Language and cognitive structure. Mental states
(
Vol. 2
, pp.
213
237
).
Amsterdam, The Netherlands
:
John Benjamins Publishing
.
Arciuli, J., & Monaghan, P. (2009). Probabilistic cues to grammatical category in English orthography and their influence during reading. Scientific Studies of Reading, 13, 73–93.
Argiris, G., Budai, R., Maieron, M., Ius, T., Skrap, M., & Tomasino, B. (2020). Neurosurgical lesions to sensorimotor cortex do not impair action verb processing. Scientific Reports, 10, 523.
Ashburner, J. (2007). A fast diffeomorphic image registration algorithm. Neuroimage, 38, 95–113.
Baayen, R. H., Piepenbrock, R., & Gulikers, L. (1995). The CELEX lexical database [CD-ROM]. Philadelphia: University of Pennsylvania, Linguistic Data Consortium.
Balota, D. A., Yap, M. J., Cortese, M. J., Hutchison, K. A., Kessler, B., Loftis, B., et al. (2007). The English lexicon project. Behavior Research Methods, 39, 445–459.
Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617–645.
Barsalou, L. W. (2016). On staying grounded and avoiding Quixotic dead ends. Psychonomic Bulletin & Review, 23, 1122–1142.
Boulenger, V., Hauk, O., & Pulvermüller, F. (2009). Grasping ideas with the motor system: Semantic somatotopy in idiom comprehension. Cerebral Cortex, 19, 1905–1914.
Brett, M., Romain Valabregue, J.-L. A., & Poline, J.-P. (2002). Region of interest analysis using an SPM toolbox [abstract]. Paper presented at the 8th International Conference on Functional Mapping of the Human Brain, June 2–6, 2002, Sendai, Japan. Available on CD-ROM in NeuroImage, Vol 16, No 2.
Brysbaert, M., & New, B. (2009). Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods, 41, 977–990.
Brysbaert, M., New, B., & Keuleers, E. (2012). Adding Part-of-Speech information to the SUBTLEX-US word frequencies. Behavior Research Methods, 44, 991–997.
Caramazza, A., Anzellotti, S., Strnad, L., & Lingnau, A. (2014). Embodied cognition and mirror neurons: A critical assessment. Annual Review of Neuroscience, 37, 1–15.
Carota, F., Moseley, R., & Pulvermüller, F. (2012). Body-part-specific representations of semantic noun categories. Journal of Cognitive Neuroscience, 24, 1492–1509.
Cassidy, K. W., & Kelly, M. H. (1991). Phonological information for grammatical category assignments. Journal of Memory and Language, 30, 348–369.
Clerget, E., Poncin, W., Fadiga, L., & Olivier, E. (2012). Role of Broca's area in implicit motor skill learning: Evidence from continuous theta-burst magnetic stimulation. Journal of Cognitive Neuroscience, 24, 80–92.
Cortese, M. J., & Fugett, A. (2004). Imageability ratings for 3,000 monosyllabic words. Behavior Research Methods, Instruments, & Computers, 36, 384–387.
Courson, M., & Tremblay, P. (2020). Neural correlates of manual action language: Comparative review, meta-analyses and ROI analysis. Neuroscience and Biobehavioral Reviews, 116, 221–238.
Cunillera, T., Camara, E., Toro, J. M., Marco-Pallares, J., Sebastian-Galles, N., Ortiz, H., et al. (2009). Time course and functional neuroanatomy of speech segmentation in adults. Neuroimage, 48, 541–553.
De Deyne, S., Navarro, D. J., Perfors, A., Brysbaert, M., & Storms, G. (2019). The “small world of words” English word association norms for over 12,000 cue words. Behavior Research Methods, 51, 987–1006.
de Zubicaray, G., Arciuli, J., & McMahon, K. (2013). Putting an “end” to the motor cortex representations of action words. Journal of Cognitive Neuroscience, 25, 1957–1974.
Eklund, A., Nichols, T. E., & Knutsson, H. (2016). Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates. Proceedings of the National Academy of Sciences, U.S.A., 113, 7900–7905.
Erickson, L. C., & Thiessen, E. D. (2015). Statistical learning of language: Theory, validity, and predictions of a statistical learning account of language acquisition. Developmental Review, 37, 66–108.
Fedorenko, E., Duncan, J., & Kanwisher, N. (2012). Language-selective and domain-general regions lie side by side within Broca's area. Current Biology, 22, 2059–2062.
Freire, L., Roche, A., & Mangin, J. F. (2002). What is the best similarity measure for motion correction in fMRI time series? IEEE Transactions on Medical Imaging, 21, 470–484.
Friston, K. J., Glaser, D. E., Henson, R. N. A., Kiebel, S., Phillips, C., & Ashburner, J. (2002). Classical and Bayesian inference in neuroimaging: Applications. Neuroimage, 16, 484–512.
Friston, K. J., Penny, W. D., & Glaser, D. E. (2005). Conjunction revisited. Neuroimage, 25, 661–667.
Gallese, V., & Lakoff, G. (2005). The brain's concepts: The role of the sensory-motor system in reason and language. Cognitive Neuropsychology, 22, 455–479.
Glenberg, A. M. (2015). Few believe the world is flat: How embodiment is changing the scientific understanding of cognition. Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Expérimentale, 69, 165–171.
Glenberg, A. M., & Gallese, V. (2012). Action-based language: A theory of language acquisition, comprehension, and production. Cortex, 48, 905–922.
Glenberg, A. M., & Kaschak, M. P. (2002). Grounding language in action. Psychonomic Bulletin & Review, 9, 558–565.
Goldinger, S. D. (1996). Auditory lexical decision. Language and Cognitive Processes, 11, 559–567.
Hammers, A., Allom, R., Koepp, M. J., Free, S. L., Myers, R., Lemieux, L., et al. (2003). Three-dimensional maximum probability atlas of the human brain, with particular reference to the temporal lobe. Human Brain Mapping, 19, 224–247.
Hauk, O., Johnsrude, I., & Pulvermüller, F. (2004). Somatotopic representation of action words in the motor and premotor cortex. Neuron, 41, 301–307.
Hauk, O., & Pulvermüller, F. (2011). The lateralization of motor cortex activation to action-words. Frontiers in Human Neuroscience, 5, 149.
Heim, S., Eickhoff, S. B., Ischebeck, A. K., Supp, G., & Amunts, K. (2007). Modality-independent involvement of the left BA 44 during lexical decision making. Brain Structure & Function, 212, 95–106.
Karuza, E. A., Newport, E. L., Aslin, R. N., Starling, S. J., Tivarus, M. E., & Bavelier, D. (2013). The neural correlates of statistical learning in a word segmentation task: An fMRI study. Brain & Language, 127, 46–54.
Kelly, M. H. (1992). Using sound to solve syntactic problems: The role of phonology in grammatical category assignments. Psychological Review, 99, 349–364.
Kemmerer, D. (2015). Are the motor features of verb meanings represented in the precentral motor cortices? Yes, but within the context of a flexible, multilevel architecture for conceptual knowledge. Psychonomic Bulletin & Review, 22, 1068–1075.
Kemmerer, D., Gonzalez-Castillo, J., Talavage, T., Patterson, S., & Wiley, C. (2008). Neuroanatomical distribution of five semantic components of verbs: Evidence from fMRI. Brain and Language, 107, 16–43.
Köhler, W. (1947). Gestalt psychology (2nd ed.). New York: Liveright.
Kuperman, V., Stadthagen-Gonzalez, H., & Brysbaert, M. (2012). Age-of-acquisition ratings for 30 thousand English words. Behavior Research Methods, 44, 978–990.
Lambon Ralph, M., Jefferies, E., Patterson, K., & Rogers, T. (2017). The neural and computational bases of semantic cognition. Nature Reviews Neuroscience, 18, 42–55.
Lichtheim, L. (1885). On aphasia. Brain, 7, 433–484.
Lin, N., Wang, X., Zhao, Y., Liu, Y., Li, X., & Bi, Y. (2015). Premotor cortex activation elicited during word comprehension relies on access of specific action concepts. Journal of Cognitive Neuroscience, 27, 2051–2062.
Louwerse, M. M., Hutchinson, S., Tillman, R., & Recchia, G. (2015). Effect size matters: The role of language statistics and perceptual simulation in conceptual processing. Language, Cognition and Neuroscience, 30, 430–447.
Louwerse, M., & Qu, Z. (2017). Estimating valence from the sound of a word: Computational, experimental, and cross-linguistic evidence. Psychonomic Bulletin & Review, 24, 849–855.
Magnuson, J. S., Mirman, D., & Myers, E. (2013). Spoken word recognition. In D. Reisberg (Ed.), The Oxford handbook of cognitive psychology (pp. 412–441). New York: Oxford University Press.
Maurer, D., Pathman, T., & Mondloch, C. J. (2006). The shape of boubas: Sound-shape correspondences in toddlers and adults. Developmental Science, 9, 316–322.
Mayka, M. A., Corcos, D. M., Leurgans, S. E., & Vaillancourt, D. E. (2006). Three-dimensional locations and boundaries of motor and premotor cortices as defined by functional brain imaging: A meta-analysis. Neuroimage, 31, 1453–1474.
McNealy, K., Mazziotta, J. C., & Dapretto, M. (2010). The neural basis of speech parsing in children and adults. Developmental Science, 13, 385–406.
Meteyard, L., Cuadrado, S. R., Bahrami, B., & Vigliocco, G. (2012). Coming of age: A review of embodiment and the neuroscience of semantics. Cortex, 48, 788–804.
Minicucci, D., Guediche, S., & Blumstein, S. E. (2013). An fMRI examination of the effects of acoustic-phonetic and lexical competition on access to the lexical-semantic network. Neuropsychologia, 51, 1980–1988.
Monaghan, P., Chater, N., & Christiansen, M. (2005). The differential contribution of phonological and distributional cues in grammatical categorization. Cognition, 96, 143–182.
Monaghan, P., Christiansen, M. H., & Chater, N. (2007). The phonological distributional coherence hypothesis: Cross-linguistic evidence in language acquisition. Cognitive Psychology, 55, 259–305.
Monaghan, P., Christiansen, M. H., Farmer, T. A., & Fitneva, S. A. (2010). Measures of phonological typicality: Robust coherence and psychological validity. Mental Lexicon, 5, 281–299.
Moseley, R. L., Pulvermüller, F., & Shtyrov, Y. (2013). Sensorimotor semantics on the spot: Brain activity dissociates between conceptual categories within 150 ms. Scientific Reports, 3, 1928.
Mumford, J. A., Poline, J.-B., & Poldrack, R. A. (2015). Orthogonalization of regressors in fMRI models. PLoS One, 10, e0126255.
Nichols, T., Brett, M., Andersson, J., Wager, T., & Poline, J. B. (2005). Valid conjunction inference with the minimum statistic. Neuroimage, 25, 653–660.
Nuckolls, J. B. (1999). The case for sound symbolism. Annual Review of Anthropology, 28, 225–252.
Papeo, L., Vallesi, A., Isaja, A., & Rumiati, R. I. (2009). Effects of TMS on different stages of motor and non-motor verb processing in the primary motor cortex. PLoS One, 4, e4508.
Postle, N., McMahon, K. L., Ashton, R., Meredith, M., & de Zubicaray, G. I. (2008). Action word meaning representations in cytoarchitectonically defined primary and premotor cortices. Neuroimage, 43, 634–644.
Pritchett, B. L., Hoeflin, C., Koldewyn, K., Dechter, E., & Fedorenko, E. (2018). High-level language processing regions are not engaged in action observation or imitation. Journal of Neurophysiology, 120, 2555–2570.
Pulvermüller, F. (2005). Brain mechanisms linking language and action. Nature Reviews Neuroscience, 6, 576–582.
Pulvermüller, F. (2018). Neural reuse of action perception circuits for language, concepts and communication. Progress in Neurobiology, 160, 1–44.
Pulvermüller, F., Cook, C., & Hauk, O. (2012). Inflection in action: Semantic motor system activation to noun and verb-containing phrases is modulated by the presence of overt grammatical markers. Neuroimage, 60, 1367–1379.
Pulvermüller, F., Hauk, O., Nikulin, V. V., & Ilmoniemi, R. J. (2005). Functional links between motor and language systems. European Journal of Neuroscience, 21, 793–797.
Pulvermüller, F., Kherif, F., Hauk, O., Mohr, B., & Nimmo-Smith, I. (2009). Cortical cell assemblies for general lexical and category-specific semantic processing as revealed by fMRI cluster analysis. Human Brain Mapping, 30, 3837–3850.
Pulvermüller, F., Lutzenberger, W., & Preissl, H. (1999). Nouns and verbs in the intact brain: Evidence from event-related potentials and high-frequency cortical responses. Cerebral Cortex, 9, 498–508.
Raposo, A., Moss, H. E., Stamatakis, E. A., & Tyler, L. K. (2009). Modulation of motor and premotor cortices by actions, action words and action sentences. Neuropsychologia, 47, 388–396.
Reilly, M., & Desai, R. H. (2017). Effects of semantic neighborhood density in abstract and concrete words. Cognition, 169, 46–53.
Reilly, J., Westbury, C., Kean, J., & Peele, J. (2012). Arbitrary symbolism in natural language revisited: When word forms carry meaning. PLoS One, 7, e42286.
Rüschemeyer, S. A., Brass, M., & Friederici, A. D. (2007). Comprehending prehending: Neural correlates of processing verbs with motor systems. Journal of Cognitive Neuroscience, 19, 855–865.
Schuil, K. D. I., Smits, M., & Zwaan, R. A. (2013). Sentential context modulates the involvement of the motor cortex in action language processing: An fMRI study. Frontiers in Human Neuroscience, 7, 100.
Sereno, J. A., & Jongman, A. (1990). Phonological and form class relations in the lexicon. Journal of Psycholinguistic Research, 19, 387–404.
Tettamanti, M., Buccino, G., Saccuman, M. C., Gallese, V., Danna, M., Scifo, P., et al. (2005). Listening to action-related sentences activates fronto-parietal motor circuits. Journal of Cognitive Neuroscience, 17, 273–281.
Thompson, P. D., & Estes, Z. (2011). Sound symbolic naming of novel objects is a graded function. Quarterly Journal of Experimental Psychology, 64, 2392–2404.
Tillman, R., & Louwerse, M. (2018). Estimating emotions through language statistics and embodied cognition. Journal of Psycholinguistic Research, 47, 159–167.
Tobia, M. J., Iacovella, V., & Hasson, U. (2012). Multiple sensitivity profiles to diversity and transition structure in non-stationary input. Neuroimage, 60, 991–1005.
Tremblay, P., Baroni, M., & Hasson, U. (2012). Processing of speech and non-speech sounds in the supratemporal plane: Auditory input preference does not predict sensitivity to statistical structure. Neuroimage, 66C, 318–332.
Vaden, K. I., Halpin, H. R., & Hickok, G. S. (2009). Irvine phonotactic online dictionary (Version 2.0) [Data file]. Retrieved from http://www.iphod.com
van Casteren, M., & Davis, M. H. (2006). Mix, a program for pseudorandomization. Behavior Research Methods, 38, 584–589.
Vitevitch, M. S., & Luce, P. A. (2004). A web-based interface to calculate phonotactic probability for words and nonwords in English. Behavior Research Methods, Instruments, & Computers, 36, 481–487.
Vitevitch, M. S., & Luce, P. A. (2016). Phonological neighborhood effects in spoken word perception and production. Annual Review of Linguistics, 2, 75–94.
Watson, C. E., Cardillo, E. R., Ianni, G. R., & Chatterjee, A. (2013). Action concepts in the brain: An activation likelihood estimation meta-analysis. Journal of Cognitive Neuroscience, 25, 1191–1205.
Wilkinson, L., Koshy, P. J., Steel, A., Bageac, D., Schintu, S., & Wassermann, E. M. (2017). Motor cortex inhibition by TMS reduces cognitive non-motor procedural learning when immediate incentives are present. Cortex, 97, 70–80.
Willems, R., Toni, I., Hagoort, P., & Casasanto, D. (2010). Neural dissociations between action verb understanding and motor imagery. Journal of Cognitive Neuroscience, 22, 2387–2400.
Woo, C. W., Krishnan, A., & Wager, T. D. (2014). Cluster-extent based thresholding in fMRI analyses: Pitfalls and recommendations. Neuroimage, 91, 412–419.
Zaitsev, M., Hennig, J., & Speck, O. (2004). Point spread function mapping with parallel imaging techniques and high acceleration factors: Fast, robust, and flexible method for echo-planar imaging distortion correction. Magnetic Resonance in Medicine, 52, 1156–1166.
Zhang, Z., Sun, Y., & Wang, Z. (2018). Representation of action semantics in the motor cortex and Broca's area. Brain and Language, 179, 33–41.
Zhuang, J., Tyler, L. K., Randall, B., Stamatakis, E. A., & Marslen-Wilson, W. D. (2014). Optimally efficient neural systems for processing spoken language. Cerebral Cortex, 24, 908–918.
Zwaan, R. A. (2004). The immersed experiencer: Toward an embodied theory of language comprehension. In B. H. Ross (Ed.), The psychology of learning and motivation (pp. 35–62). New York: Academic Press.