Abstract

Several accounts of speech perception propose that the areas involved in producing language are also involved in perceiving it. In line with this view, neuroimaging studies show activation of premotor cortex (PMC) during phoneme judgment tasks; however, there is debate about whether speech perception necessarily involves motor processes, across all task contexts, or whether the contribution of PMC is restricted to tasks requiring explicit phoneme awareness. Some aspects of speech processing, such as mapping sounds onto meaning, may proceed without the involvement of motor speech areas if PMC specifically contributes to the manipulation and categorical perception of phonemes. We applied TMS to three sites—PMC, posterior superior temporal gyrus, and occipital pole—and for the first time within the TMS literature, directly contrasted two speech perception tasks that required explicit phoneme decisions and mapping of speech sounds onto semantic categories, respectively. TMS to PMC disrupted explicit phonological judgments but not access to meaning for the same speech stimuli. TMS to two further sites confirmed that this pattern was site specific and did not reflect a generic difference in the susceptibility of our experimental tasks to TMS: stimulation of pSTG, a site involved in auditory processing, disrupted performance in both language tasks, whereas stimulation of occipital pole had no effect on performance in either task. These findings demonstrate that, although PMC is important for explicit phonological judgments, crucially, PMC is not necessary for mapping speech onto meanings.

INTRODUCTION

A key controversy within the neuroscience of language concerns whether speech perception relies on purely auditory mechanisms or on sensorimotor processing. The motor theory of speech perception states that processes involved in producing speech also participate in understanding spoken language under normal circumstances (Liberman & Mattingly, 1985; Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967). Within cognitive neuroscience, researchers have suggested that brain areas involved in the production of speech, such as aspects of premotor cortex (PMC), also contribute to speech perception (Galantucci, Fowler, & Turvey, 2006; Rizzolatti & Craighero, 2004; Liberman & Mattingly, 1985). On the other hand, many current models of the neurobiology of language propose two parallel processing streams: one that runs dorsally for auditory–motor integration and a second, ventral route within the temporal lobes for “comprehension” of spoken words (Rauschecker & Scott, 2009; Hickok & Poeppel, 2004, 2007). Tasks involving speech perception differentially recruit these two routes depending on the extent to which they involve access to articulatory representations and concepts. According to some authors, the ventral route may be sufficient for the comprehension of clear auditory input, in the absence of a contribution from motor speech areas within PMC (Osnes, Hugdahl, & Specht, 2011; Hickok, 2009; Scott, McGettigan, & Eisner, 2009; Spitsyna, Warren, Scott, Turkheimer, & Wise, 2006).

Neuroimaging studies have provided strong support for the engagement of motor speech areas in speech perception: many studies have reported dorsal PMC recruitment during tasks involving phonemic judgments (e.g., Zatorre, Meyer, Gjedde, & Evans, 1996; Zatorre, Evans, Meyer, & Gjedde, 1992), passive speech listening for meaningless monosyllables (e.g., Pulvermuller et al., 2006; Uppenkamp, Johnsrude, Norris, Marslen-Wilson, & Patterson, 2006; Wilson, Saygin, Sereno, & Iacoboni, 2004), and contrasts of synthetic vowel sounds over nonspeech stimuli (musical rain; Uppenkamp et al., 2006). Nevertheless, when participants listen to naturalistic sentences as opposed to meaningless auditory stimuli with matched acoustic complexity, the neural activity is confined to temporal lobe areas within the ventral route (Spitsyna et al., 2006; Scott, Blank, Rosen, & Wise, 2000). These findings suggest that motor speech areas in PMC may be recruited in demanding task contexts and/or when explicit perception or manipulation of phonemes is required.

Moreover, functional neuroimaging studies cannot determine whether the PMC activation seen in many studies is essential to speech perception, and the neuropsychological literature largely contradicts this view. Patients with expressive aphasia have severe deficits of language production following lesions to left frontal cortex, and the motor theory would predict that these patients should also be impaired on auditory comprehension; however, this is often not the case (e.g., Miceli, Gainotti, Caltagirone, & Masullo, 1980). They do show impairments on explicit perceptual categorization and phoneme awareness tasks (e.g., identifying the boundary between two phonemes; performing explicit phoneme segmentation), which require access to explicit/categorical phonological representations, but these impairments are not reflected in general comprehension (Rogalsky, Love, Driscoll, Anderson, & Hickok, 2011; Moineau, Dronkers, & Bates, 2005; Bishop, Brown, & Robson, 1990; Miceli et al., 1980; Basso, Casati, & Vignolo, 1977; Blumstein, Cooper, Zurif, & Caramazza, 1977). This dissociation is captured by studies in which patients were impaired on judgments based on perceptual features (same/different judgments) but showed no deficit for spoken word–picture judgments based on the semantic content of the word (Rogalsky et al., 2011; Bishop et al., 1990).

Recent TMS research has confirmed a focal role for dorsal PMC in some speech perception tasks, including speech discrimination of syllables embedded in noise (D'Ausilio et al., 2009; Meister, Wilson, Deblieck, Wu, & Iacoboni, 2007), categorical perception (M1, measured by motor-evoked potentials; Mottonen & Watkins, 2009), and phoneme discrimination of nonsense syllables (Sato, Tremblay, & Gracco, 2009). Additionally, some TMS studies have demonstrated motor recruitment for certain aspects of speech perception in the absence of an explicit task (Mottonen, Dutton, & Watkins, 2013; Watkins, Strafella, & Paus, 2003; Fadiga, Craighero, Buccino, & Rizzolatti, 2002). For example, a recent study combined repetitive TMS with EEG recordings and found that TMS to the lip area of M1, but not the hand area of M1, suppressed responses for phonetic discrimination, but not for piano tones (Mottonen et al., 2013). Furthermore, Roy, Craighero, Fabbri-Destro, and Fadiga (2008) found larger motor-evoked potentials in motor cortex, following TMS, for both pseudowords and rare words, compared with frequent words (see also Fadiga et al., 2002); therefore, this site may make a specific contribution to phonological processing for rare/new speech stimuli, which are not strongly supported by the activation of meaning within the ventral route. Although the aforementioned TMS studies indicate that PMC plays an important role in phoneme discrimination, an important caveat remains: other aspects of speech processing, such as mapping sounds onto meaning (Marslen-Wilson & Warren, 1994; Morais & Kolinsky, 1994), may proceed without the involvement of motor speech areas.

Recent behavioral studies (e.g., McMurray, Tanenhaus, & Aslin, 2009; Davis, Marslen-Wilson, & Gaskell, 2002) have shown that spoken word recognition does not operate on categorical phonemic representations. Instead, phonetic details and ambiguities in the signal are cascaded through the recognition system, such that they can provide as much information as possible about the nature of the input during word recognition (Hawkins, 2003). Equally, there is good evidence that listeners generate categorical representations of phonemes automatically during the course of spoken word recognition (Gaskell, Quinlan, Tamminen, & Cleland, 2008). Gaskell et al. (2008) suggested that these two properties can be reconciled using a model of language perception in which detailed auditory representations are mapped simultaneously onto two systems. One of these systems deals with word recognition and extraction of meaning, whereas the other generates categorical representations, perhaps facilitating links with the production system (Gaskell & Marslen-Wilson, 1997). Thus, PMC may be involved in the mechanism that generates categorical representations rather than the main process that extracts word meanings from detailed (noncategorical) representations of speech. This view finds support in studies where “naturalistic” speech comprehension (e.g., listening to intelligible natural sentences) does not recruit motor areas (e.g., Spitsyna et al., 2006; Scott et al., 2000); however, when perceptual and also semantic difficulty increases (i.e., acoustically degraded speech where the degree of semantic relatedness between words is weak), activation can be seen in frontal and parietal areas (e.g., Sharp et al., 2010).

The current study used TMS to examine the contribution of left dorsal PMC to speech perception in different task contexts, providing a test of these conflicting claims about its role. We used an inhibitory TMS paradigm, in which low-frequency repetitive trains of TMS were used to transiently disrupt neural processing: This has been shown to produce subsequent behavioral interference in tasks that rely on the stimulated region of cortex (Whitney, Kirk, O'Sullivan, Lambon Ralph, & Jefferies, 2012; Hoffman, Pobric, Drakesmith, & Lambon Ralph, 2011; Devlin & Watkins, 2007; Pobric, Jefferies, & Lambon Ralph, 2007; Walsh & Cowey, 2000). The two speech perception tasks used an identical two-alternative forced-choice task format but evaluated either participants' judgments about phoneme categories or their access to word meanings for the same auditory speech stimuli. Therefore, for the first time in the TMS literature, we were able to directly contrast two tasks that required explicit phoneme decisions and the mapping of speech sounds onto semantic categories. If left dorsal PMC plays a critical role in speech processing irrespective of task context, these judgment types should show equivalent disruption. If, in contrast, ventral stream activity is sufficient for access to the meanings of spoken words and left dorsal PMC makes a selective contribution to explicit phoneme decisions, stimulation of this region with TMS should produce a dissociation between our experimental tasks.

The effects of TMS to dorsal PMC were compared with two additional stimulation sites. We applied TMS to posterior superior temporal gyrus (pSTG), a site that is uncontroversially recruited during normal auditory processing (Hickok & Poeppel, 2007; Scott, 2005; Seghier et al., 2004; Scott & Johnsrude, 2003; Buchsbaum, Hickok, & Humphries, 2001) and was expected to produce equivalent disruption for the two speech perception tasks. The comparison of PMC and pSTG can therefore be used to confirm that any differential effects of TMS across tasks are not explicable in terms of differing sensitivity to TMS-induced disruption. A third site, the occipital pole (OP), was not expected to disrupt any of our experimental tasks and therefore allowed us to characterize any nonspecific effects of stimulation.

METHODS

Design

A within-subject 2 × 3 × 3 factorial design was employed, with TMS (no stimulation vs. stimulation), task (phonological, semantic, visual control), and site (OP, PMC, pSTG) as factors. We delivered a low-frequency (1 Hz) train of rTMS pulses offline. Participants performed the tasks immediately after stimulation, allowing us to rule out the possibility that the loud clicks associated with each pulse, jaw contractions, or eye blinks following peripheral nerve stimulation disrupted performance on the behavioral tasks. Baseline testing (without TMS) was completed either before stimulation or 30 min after stimulation (by which time the effects should no longer be present; Whitney, Kirk, O'Sullivan, Lambon Ralph, & Jefferies, 2011; Lambon Ralph, Pobric, & Jefferies, 2009; Pobric, Lambon Ralph, & Jefferies, 2009; Pobric et al., 2007). The order of baseline testing was counterbalanced across sessions for each participant. The study made use of a nonlinguistic control task (scrambled pictures; to ensure that disruption was task specific) and a control stimulation site (OP; to ensure that the effects did not reflect nonspecific effects of TMS).

Participants

Fifteen right-handed, native English speakers, recruited from the University of York, were examined in the study (nine men; mean age = 21.8 years, SD = 2.4 years). All participants were reimbursed £30 for their time. Four participants were replaced because of difficulties coregistering brain images with scalp locations, and one because of technical problems during testing. One participant from our final sample, who was identified as an outlier in the phonological and semantic conditions for PMC and OP, was excluded from further analysis. All participants passed safety screening for MRI and TMS, were free from any history of neurological disease or mental illness, and were not taking any medication. Each participant gave their informed consent before each TMS testing session began, and the experiment was reviewed and approved by the research ethics committee of the York Neuroimaging Centre.

Tasks

The probe words for the phonological and semantic tasks were presented auditorily, with the targets presented visually. A two-alternative forced-choice (2AFC) format was used across all three tasks (phonological, semantic, visual control; see Figure 1). In the phonological task, participants had to decide which phoneme they had heard at the end of a word (e.g., auditory probe “cart,” with the answer choices “t” and “p” on the left- and right-hand sides of the screen; both response options produced real words). The types of contrasting phoneme decisions were “k”–“t,” “p”–“t,” “p”–“k,” “d”–“g,” “b”–“g,” and “b”–“d.” In the semantic task, participants had to make a decision about which semantic category the auditory probe word belonged to (e.g., auditory probe “cart,” with choices “man-made” and “natural”). There were six types of semantic decision within the experiment (concrete/abstract, man-made/natural, nice/nasty, hear/see, large/small, and action/object). In the visual control task, a probe image of a scrambled face appeared at the top of the screen, and participants were asked which of two scrambled figures below was identical to the probe. The nonidentical figures were produced by rotating the target image through 90°.

Figure 1. 

Task conditions and procedure. The target item is underlined.

Stimuli

The auditory stimuli were cross-spliced spoken words taken from a previous study (Gaskell et al., 2008); these were modified to increase their sensitivity to TMS effects. The stimuli were constructed from word pairs (such as job–jog): The final phoneme from one word (i.e., /b/) was attached to the onset and vowel of the second word (i.e., /jo/ of “jog”), and the final phoneme was then attenuated to increase task difficulty when making explicit phoneme judgments. In pilot testing, task performance at different levels of attenuation (12.5%, 25%, 50%) was examined for each item, and the final level of attenuation was selected to maximize difficulty while ensuring that participants could perceive the stimulus (given that our primary dependent measure was response time [RT]; median level of attenuation = 12.5%). The same materials were used across tasks but were never repeated within one testing session: For example, items presented in the phonological task in Week 1 were not presented in the semantic task in Week 1 but could occur in the opposite order (semantic/phonological) in Week 2. The stimuli in the visual control task were pictures of faces, scrambled into 100 blocks, rendering them unrecognizable.

Procedure

A PC running E-Prime software (Psychology Software Tools, Inc., Pittsburgh, PA) was used to present the tasks and record accuracy and RT. Responses were given with left and right index fingers corresponding to the positions of the two response options on the screen. The language tasks started with a fixation screen for 250 msec followed by the presentation of the target and distractor (e.g., for “carp,” “p” is the target and “t” is the distractor) for 500 msec, followed by the auditory probe, after which participants were required to make a response. The participant's response triggered the next trial. For the visual control task, the probe and targets appeared on screen simultaneously. The experiment began with a practice block to familiarize participants with the tasks (six trials per task type). There were 30 experimental trials per task (semantic, phonological, control), with participants performing 90 trials per condition (baseline, post-TMS). No trials were repeated within a session, but some trials (fewer than 20%) were repeated across sessions (i.e., 1 week later). The order in which the trials occurred was randomized, and the order in which the tasks were presented was pseudorandomized across participants. Each task block was preceded by a screen that informed participants of the new task type, and participants pressed the space bar to continue. The different categories within the semantic task were presented in miniblocks, and again, there was an instruction screen at the start of each one, indicating the type of decision participants would be making (e.g., concrete or abstract).

Selection of TMS Sites

Structural T1-weighted MRI scans were used to identify sites for stimulation in each participant's brain. Sites were identified from previous functional neuroimaging and TMS studies of speech perception, and an average peak coordinate was taken. The coordinates contributing to the left dorsal PMC site came from D'Ausilio et al. (2009), Sato et al. (2009), Meister et al. (2007), Vigneau et al. (2006), and Wilson et al. (2004), allowing us to be confident that we were targeting a site that makes a necessary contribution to speech perception. This produced the following coordinates: −52.67, −6.67, 43 (Montreal Neurological Institute). The left pSTG site was taken from Meister et al. (2007), Okada and Hickok (2006), Dehaene-Lambertz et al. (2005), and Zevin and McCandliss (2005) producing the following coordinates: −59.56, −30.53, 7.08 (Montreal Neurological Institute; Figure 2). These sites were then transformed into each participant's individual brain space. The left OP was measured as 20 mm superior and 10 mm left of the inion, as in previous TMS studies (e.g., Ishibashi, Lambon Ralph, Saito, & Pobric, 2011).
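
The averaging step described above can be sketched as follows. This is a minimal illustration only: the function name is hypothetical, and the example peaks are placeholder values, not the coordinates reported in the cited studies, so the result is not the stimulation site used here.

```python
from statistics import fmean

def average_peak(peaks):
    """Return the component-wise mean of (x, y, z) MNI peaks, rounded to 2 decimals."""
    xs, ys, zs = zip(*peaks)
    return tuple(round(fmean(axis), 2) for axis in (xs, ys, zs))

# Hypothetical placeholder peaks (NOT taken from the cited studies).
example_peaks = [(-54, -4, 44), (-50, -8, 40), (-52, -6, 46)]
target = average_peak(example_peaks)
print(target)
```

The averaged MNI coordinate would then still need to be transformed into each participant's native space, as described in the text.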

Figure 2. 

Coordinates contributing to stimulation peaks for PMC (blue) and pSTG (green). The averaged coordinate stimulated in our study is indicated in red. Image created using DataViewer3D (Gouws, Woods, Millman, Morland, & Green, 2009).

For 11 participants, the MRI structural image was coregistered to the participant's scalp using an Ascension Minibird magnetic tracking device (www.ascension-tech.com) in conjunction with MRIreg software (www.mricro.com/mrireg.html). Five anatomical landmarks were identified for coregistration (tip of nose, bridge of nose, vertex, left/right tragus). Stimulation coordinates were transformed into individual participant space using the transformation matrix from the “segment” function in SPM5. For the remaining participants, Brainsight 2 (Rogue Research, Montreal, Canada, www.rogue-research.com/) was used to coregister participant brains and to identify stimulation sites before rTMS administration. Four landmarks were used for coregistering the participant's head to their brain image (tip of the nose, bridge of the nose, left/right tragus).

Stimulation Parameters

Before TMS testing began, the individual active motor threshold was established in each testing session. This was determined by the lowest stimulation intensity required to elicit visible contraction of the first dorsal interosseous muscle in the contralateral hand. Motor thresholds ranged between 38% and 65% of maximum stimulator output, with an average of 49% of stimulator output. A 70-mm figure-of-eight coil, attached to a MagStim Rapid2 stimulator, was used to deliver the magnetic pulses. Repetitive trains of TMS were applied at 1 Hz for 10 min; participants were stimulated at 120% of their motor threshold. We used a coil orientation established as the least uncomfortable for participants before stimulation, as it has been shown that orientation does not reliably influence behavioral effects (Niyazov, Butler, Kadah, Epstein, & Hu, 2005).
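
To make the dosing arithmetic concrete, the sketch below derives the number of pulses and the stimulation intensity from the parameters stated above. It assumes the 10-min train was continuous; the function and parameter names are illustrative.

```python
def rtms_protocol(motor_threshold_pct, frequency_hz=1.0, duration_min=10.0,
                  intensity_factor=1.2):
    """Return (total pulses, intensity as % of maximum stimulator output)."""
    # Assumption: an uninterrupted train, so pulses = rate x duration in seconds.
    total_pulses = int(round(frequency_hz * duration_min * 60))
    intensity_pct = round(motor_threshold_pct * intensity_factor, 1)
    return total_pulses, intensity_pct

# Lowest, mean, and highest motor thresholds reported in the text.
for mt in (38, 49, 65):
    print(mt, rtms_protocol(mt))
```

Under these assumptions, each session delivers 600 pulses, and a participant at the mean threshold of 49% would be stimulated at roughly 59% of maximum stimulator output.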

Data Analysis

TMS disruption was expected to manifest itself in delayed RT rather than a decline in accuracy (Whitney et al., 2011; Pobric et al., 2007; Devlin, Matthews, & Rushworth, 2003; Walsh & Cowey, 2000), especially given that the accuracy for the behavioral task was high, allowing us to maximize the number of trials used in the RT analysis. The analyses therefore examined RT for correct responses, within 1.5 SDs of the mean (accuracy data are provided in Table 1). The predictions of this study were tested using planned paired t tests to examine if the predicted TMS effects were significant at each site (one-tailed). These tests were supplemented with a series of within-participant ANOVAs (all two-tailed) to test for interactions between TMS and Task and between TMS and Site. All significant TMS effects are reported below.
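
The trimming rule and the planned comparison can be sketched as below. Several details are assumptions rather than reported facts: the text does not state whether trimming was applied per participant and per condition, the mean and SD are computed here on correct trials only, and the sign convention (baseline minus TMS, so that slowing yields a negative t) is inferred from the reported statistics. All names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def trim_correct_rts(rts, correct, n_sd=1.5):
    """Keep RTs from correct trials lying within n_sd SDs of their mean."""
    kept = [rt for rt, ok in zip(rts, correct) if ok]
    m, sd = mean(kept), stdev(kept)
    return [rt for rt in kept if abs(rt - m) <= n_sd * sd]

def paired_t(baseline, tms):
    """Paired t statistic on (baseline - TMS); negative means slowing after TMS."""
    diffs = [b - t for b, t in zip(baseline, tms)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical single-participant trial data (msec): one error, one slow outlier.
trimmed = trim_correct_rts([600, 620, 640, 610, 1500, 630],
                           [True, True, True, False, True, True])
print(trimmed)

# Hypothetical per-participant mean RTs for one task at one site.
print(paired_t([700, 720, 690, 710], [720, 735, 700, 740]))
```

A one-tailed p value would then be read from the t distribution with n − 1 degrees of freedom, for example with a statistics package such as SciPy.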

Table 1. 

Accuracy Data


               PMC                           pSTG                          OP
               Baseline       TMS            Baseline       TMS            Baseline       TMS
Control        96.99 (.92)    94.84 (2.01)   96.99 (1.2)    96.77 (1.0)    97.20 (.82)    96.77 (1.09)
Phonological   94.62 (1.77)   92.26 (1.86)   90.54 (2.1)    91.61 (2.51)   94.62 (1.16)   92.69 (1.56)
Semantic       87.96 (2.15)   85.16 (1.88)   88.82 (1.66)   82.15 (2.33)   87.96 (2.17)   87.53 (2.2)

Average accuracy, with standard error in parentheses. The only paired comparison that reached significance was between the TMS and no-TMS conditions for pSTG and the semantic task (t(14) = 2.981, p = .01).

RESULTS

Premotor Cortex

Paired sample t tests confirmed our prediction that PMC is involved in phoneme judgments but not in semantic judgments: Phonological judgments were significantly slowed by TMS to this site (t(14) = −2.03, p < .05), whereas, crucially, the semantic task was unaffected (t(14) = 1.07, p > .1). There was also no disruption of the control task after TMS to PMC (t(14) < 1). A within-participant ANOVA was used to confirm that the two language tasks were affected differently by TMS: This analysis revealed a significant main effect of Task (F(1, 14) = 34.67, p < .001) and a significant interaction of Task × TMS (F(1, 14) = 4.66, p < .05; see Figure 3).
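
For readers less familiar with this analysis, the Task × TMS interaction in a 2 × 2 within-participant design is equivalent to a one-sample t test on each participant's difference of differences, with F(1, n − 1) = t². The sketch below illustrates that equivalence; the data are hypothetical and the function name is illustrative, so the output does not reproduce the values reported above.

```python
from math import sqrt
from statistics import mean, stdev

def interaction_f(phon_base, phon_tms, sem_base, sem_tms):
    """F(1, n-1) for the Task x TMS interaction in a 2 x 2 within-participant design."""
    # Per participant: phonological TMS effect minus semantic TMS effect.
    dd = [(pt - pb) - (st - sb)
          for pb, pt, sb, st in zip(phon_base, phon_tms, sem_base, sem_tms)]
    t = mean(dd) / (stdev(dd) / sqrt(len(dd)))
    return t * t

# Hypothetical per-participant mean RTs (msec) for four participants.
f_value = interaction_f([700, 720, 690, 710], [730, 745, 700, 750],
                        [800, 820, 790, 810], [805, 815, 795, 812])
print(round(f_value, 2))
```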

Figure 3. 

PMC: TMS to PMC produced significant slowing of the phonological task but not the semantic task. pSTG: TMS to pSTG shows significant slowing for both phonological and semantic tasks. OP: TMS to OP shows no effect for any of the tasks. Error bars represent SEM. Stars represent significant slowing after TMS (p < .05); n = 15. Phon = phonological; Sem = semantic.

One potential concern relating to the previous analysis is that anatomical landmarks might not be a good guide to localization of function in specific individuals, and therefore, TMS may have been applied to a nonrelevant site in at least some of the participants (potentially masking its effect on both tasks). To confirm that TMS failed to disrupt the semantic task, even when it was applied to a site confirmed to be functionally relevant, we selected those participants (n = 11) who showed the expected inhibition (slowing of 1 msec or more) in the phoneme judgment task following TMS to PMC. We were then able to establish if there were TMS effects on the other two tasks. When the analysis was restricted to these participants, the phonological task did, unsurprisingly, show a significant disruption after TMS to PMC (t(10) = −3.78, p < .01). More importantly, both control and semantic tasks showed no hint of an effect of TMS to PMC (t(10) < 1), and a direct comparison of the two language tasks confirmed a significant interaction of Task × TMS (F(1, 10) = 7.36, p < .05).
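
The participant selection used in this follow-up analysis can be sketched as a simple filter on the phonological TMS effect. The participant IDs and RTs below are hypothetical, and the function name is illustrative.

```python
def select_responders(baseline_rt, tms_rt, min_slowing_ms=1.0):
    """Return IDs whose mean RT slowed by at least min_slowing_ms after TMS."""
    return [pid for pid in baseline_rt
            if tms_rt[pid] - baseline_rt[pid] >= min_slowing_ms]

# Hypothetical mean phonological RTs (msec) after TMS to PMC.
baseline = {"p01": 700.0, "p02": 710.0, "p03": 695.0}
post_tms = {"p01": 730.0, "p02": 705.0, "p03": 695.5}
responders = select_responders(baseline, post_tms)
print(responders)
```

Because participants are selected on the phonological effect itself, the phonological test in this subset is expected to be significant; the informative comparisons are those for the semantic and control tasks, which were not used for selection.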

Posterior Superior Temporal Gyrus

Paired sample t tests confirmed our prediction that pSTG is involved in both phonological and semantic judgments to spoken words. TMS had a significant effect on both phoneme judgments (t(14) = −1.77, p < .05) and semantic judgments (t(14) = −2.40, p < .05), but there was no effect on the control task (t(14) < 1). A within-participant ANOVA confirmed that the two language tasks were equally sensitive to disruption by TMS: There was a significant main effect of Task (F(1, 14) = 42.54, p < .001) and TMS (F(1, 14) = 5.47, p < .05) but no interaction (F(1, 14) = 2.93, p > .1; see Figure 3).

Occipital Pole

As predicted, there was no disruption to any task after TMS to OP: Paired t tests were nonsignificant for all tasks (t(14) < 1 in all cases). A direct comparison between the two language tasks showed a significant main effect of task (F(1, 14) = 63.62, p < .001), no effect of TMS (F(1, 14) < 1), and no interaction (F(1, 14) < 1; see Figure 3).

Between-Sites Comparison

As the control task revealed no significant TMS effects for any of the sites, it was not included in this analysis. A 3 × 2 × 2 within-participant ANOVA exploring the interactions between site, task, and TMS revealed a significant Site × TMS interaction (F(2, 28) = 5.61, p < .01), confirming that the TMS effects were site specific (i.e., disruption following stimulation of PMC and pSTG, not OP). There was also a significant three-way interaction (F(2, 28) = 3.69, p = .038), confirming that the interaction of task and TMS was site specific (i.e., phonological task disruption for PMC, both language tasks disrupted by TMS to pSTG). Furthermore, there was no Site × Task interaction in the absence of TMS (F(2, 28) = 1.492, p = .242), confirming that these effects were specific to TMS disruption and did not reflect a global difference in RT between sites.

DISCUSSION

This study reveals that PMC makes a contribution to the perception of spoken language, which is critically dependent on task context. We explored the effects of TMS stimulation on phoneme judgments and semantic decisions to the same spoken words: Both involved auditory–verbal processing, but the phoneme judgment task required access to explicit phoneme categories, whereas the semantic task involved matching auditory words to meaning. TMS to PMC disrupted explicit phonological judgments but not semantic access for the same auditory verbal stimuli. Stimulation of a second region, pSTG, containing auditory association cortex, produced disruption of both tasks. Given that TMS effects at this site were equivalent for phonological and semantic decisions, we can be confident that the selective effects of PMC stimulation do not reflect general susceptibility of the phoneme judgment task to interference. A control site, OP, confirmed that the TMS effects were site specific: TMS to OP did not affect performance on any of the tasks. Moreover, there were no effects of TMS on the visual control task at any of the sites, confirming that the effects we observed were specific to the auditory domain.

The key contribution of this study is to provide novel evidence that, although PMC makes a necessary contribution to speech perception in some circumstances, these effects do not extend to situations where spoken words must be perceived to allow comprehension; rather, PMC appears to play a critical role only in tasks requiring explicit access to phoneme categories, such as deciding if a /k/ or /p/ was presented. In contrast, some theories advocate a necessary and automatic role for motor speech representations in speech perception more generally, an idea which has received support from the discovery of mirror neurons (Rizzolatti & Craighero, 2004; but see Gallese, Gernsbacher, Heyes, Hickok, & Iacoboni, 2011) and neuroimaging studies showing PMC activation during speech perception (Pulvermuller et al., 2006; Uppenkamp et al., 2006; Wilson et al., 2004). As functional neuroimaging methods cannot confirm that this activity plays a necessary role in speech perception, TMS has been used in several studies to show that stimulation of PMC does disrupt speech perception tasks (D'Ausilio et al., 2009; Mottonen & Watkins, 2009; Sato et al., 2009; Meister et al., 2007; Watkins & Paus, 2004; Watkins et al., 2003; Fadiga et al., 2002). However, all of these TMS studies, as well as the majority of fMRI studies, have used tasks that require explicit access to and/or manipulation of phonemes (e.g., Pulvermuller et al., 2006; Uppenkamp et al., 2006; Wilson et al., 2004). This research cannot demonstrate, therefore, that PMC plays a vital role in speech perception for comprehension. Additionally, evidence from patient studies suggests that motor areas may only be crucial for tasks that require overt segmentation or explicit phoneme awareness and not for speech comprehension (e.g., Rogalsky et al., 2011; Bishop et al., 1990; Basso et al., 1977). However, patients typically have large and variable lesions, and consequently, these studies lack spatial resolution. Neither functional neuroimaging nor neuropsychological methods are ideally placed to confirm an essential role for a specific region such as PMC in aspects of speech recognition. In the current study, we overcame these limitations through the use of TMS to produce relatively focal disruption of processing within PMC in healthy participants.

The current findings are consistent with previous TMS findings by confirming the role of the PMC in explicit phoneme judgment tasks (e.g., D'Ausilio et al., 2009; Mottonen & Watkins, 2009; Meister et al., 2007), but our study reports a novel interaction with task and crucially reveals that PMC is not necessary for mapping sound to meaning. The dissociation that we observed between auditory comprehension and explicit phoneme discrimination tasks fits well with a current model of spoken word recognition, which suggests that ambiguities present in auditory input are cascaded to downstream lexical/semantic areas but that phonemic categorization recruits an additional mechanism that does not play a central role in language understanding (Gaskell et al., 2008). Understood in this way, the effects of TMS to pSTG are to increase the ambiguity of auditory input to the system, which necessarily impacts on processing at all levels (cf. Marslen-Wilson & Warren, 1994). In contrast, TMS to PMC affects only one peripheral route in the network, implying that access to meaning is unaffected, despite poorer performance in speech categorization. Note that any ambiguities in the spoken input must still be resolved to access the appropriate meaning, but the resolution of these ambiguities presumably takes place at a purely lexical or semantic level. This dissociation has clear parallels in the neuropsychological literature (Miceli et al., 1980; Blumstein et al., 1977). For example, Miceli et al. (1980) reported 19 patients whose performance on phonological discrimination tasks was pathological but whose performance on word (or sentence level) comprehension tasks was normal. Patient studies in which the same stimuli are used across phonological discrimination and comprehension tasks have also confirmed this discrimination/comprehension dissociation (e.g., Rogalsky et al., 2011; Bishop et al., 1990). Patients with impaired speech production performed more poorly than controls on syllable discrimination (e.g., same or different? “boy”–“voy”) but, crucially, not on picture–syllable matching (e.g., shown a picture of a boy and asked “Is this a voy?” or “Is this a boy?”; Bishop et al., 1990). The current study shows a similar dissociation but with higher anatomical specificity, confirming that this pattern follows stimulation of PMC in healthy participants.

There is strong connectivity between pSTG and PMC (Osnes et al., 2011; Pulvermuller & Fadiga, 2010; Saur et al., 2008; Jacquemot & Scott, 2006; Catani, Jones, & Ffytche, 2005); what, then, might account for the selective recruitment of motor areas within this large-scale distributed language network? (1) PMC may be involved in strategic modulation of the speech perception process during the repetition and learning of new words, when it is necessary to generate and maintain a novel sequence of articulatory gestures (Hickok, 2009; Burton, Small, & Blumstein, 2000; Demonet et al., 1992). (2) PMC could also provide a backup mechanism for processing degraded auditory stimuli. Recent support for this explanation comes from Osnes et al. (2011), who observed a decrease in PMC activation as speech became less distorted (see also Devlin & Aydelott, 2009; Scott et al., 2009). (3) As noted above, PMC may be recruited when explicit knowledge of phoneme segments is required (Rogalsky et al., 2011; Hickok, 2009; Sato et al., 2009; Hickok & Poeppel, 2000), for example, in explicit phoneme judgment tasks, where access to categorical representations of speech sounds is used to guide phoneme segmentation and manipulation (Rogalsky et al., 2011). Early support for this account comes from Zatorre et al. (1992), who found that syllable judgments, but not passive listening, elicited activation in Broca's area bordering PMC (a finding corroborated by Burton et al., 2000). Difficult explicit judgments about the constituent sounds of words may be aided by mental simulation within action systems. To establish that there is a /t/ and not a /p/ at the end of “cart,” for example, participants may generate the motor plan for “cart” and decide whether it overlaps with the articulation of /t/ (Yuen, Davis, Brysbaert, & Rastle, 2009; Halle & Stevens, 1962). In contrast, when listening to “cart” and deciding whether it names a natural or man-made object, auditory representations may be mapped to meaning more directly along the ventral language route (Hickok & Poeppel, 2007).

In most circumstances, task difficulty and the requirement to employ explicit phoneme knowledge are correlated. The TMS study of Sato et al. (2009) revealed that PMC was not recruited for simple phoneme and syllable discriminations; it was essential only for difficult phoneme discrimination tasks requiring segmentation. Although the results of that study are consistent with ours, difficult judgments are often thought to be more vulnerable to TMS effects across a variety of tasks (Devlin & Watkins, 2007), and Sato et al. (2009) did not include a control site to demonstrate that disruption of the difficult phonological task was specific to PMC. The current findings address both issues: the selective pattern of interference seen for PMC was not reproduced following TMS to another site involved in auditory processing (pSTG) or to a nonlanguage control site (OP).

In summary, the current study used two auditory language tasks to examine whether PMC recruitment is necessary for all speech perception processes, given the discrepant views in the literature (e.g., Gallese et al., 2011; Hickok & Poeppel, 2007; Galantucci et al., 2006; Rizzolatti & Craighero, 2004). We found that, although previous research has implicated PMC in speech perception, its role is confined to explicit phoneme judgment tasks and does not extend to semantic access.

Acknowledgments

This research was supported by a BBSRC studentship to K.K.-R. We would like to thank Jeroen Visser for his help with the preparation of the scrambled images.

Reprint requests should be sent to Beth Jefferies, Department of Psychology, University of York, Heslington, York, YO10 5DD, United Kingdom, or via e-mail: beth.jefferies@york.ac.uk.

REFERENCES

Basso, A., Casati, G., & Vignolo, L. A. (1977). Phonemic identification defect in aphasia. Cortex, 13, 85–95.
Bishop, D. V. M., Brown, B. B., & Robson, J. (1990). The relationship between phoneme discrimination, speech production, and language comprehension in cerebral-palsied individuals. Journal of Speech and Hearing Research, 33, 210–219.
Blumstein, S. E., Cooper, W. E., Zurif, E. B., & Caramazza, A. (1977). The perception and production of voice-onset time in aphasia. Neuropsychologia, 15, 371–372.
Buchsbaum, B., Hickok, G., & Humphries, C. (2001). Role of left posterior superior temporal gyrus in phonological processing for speech perception and production. Cognitive Science, 25, 663–678.
Burton, M. W., Small, S. L., & Blumstein, S. E. (2000). The role of segmentation in phonological processing: An fMRI investigation. Journal of Cognitive Neuroscience, 12, 679–690.
Catani, M., Jones, D. K., & Ffytche, D. H. (2005). Perisylvian language networks of the human brain. Annals of Neurology, 57, 8–16.
D'Ausilio, A., Pulvermuller, F., Salmas, P., Bufalari, I., Begliomini, C., & Fadiga, L. (2009). The motor somatotopy of speech perception. Current Biology, 19, 381–385.
Davis, M. H., Marslen-Wilson, W. D., & Gaskell, M. G. (2002). Leading up the lexical garden path: Segmentation and ambiguity in spoken word recognition. Journal of Experimental Psychology: Human Perception and Performance, 28, 218–244.
Dehaene-Lambertz, G., Pallier, C., Serniclaes, W., Sprenger-Charolles, L., Jobert, A., & Dehaene, S. (2005). Neural correlates of switching from auditory to speech perception. Neuroimage, 24, 21–33.
Demonet, J. F., Chollet, F., Ramsay, S., Cardebat, D., Nespoulous, J. L., Wise, R., et al. (1992). The anatomy of phonological and semantic processing in normal subjects. Brain, 115, 1753–1768.
Devlin, J. T., & Aydelott, J. (2009). Speech perception: Motoric contributions versus the motor theory. Current Biology, 19, R198–R200.
Devlin, J. T., Matthews, P. M., & Rushworth, M. F. S. (2003). Semantic processing in the left inferior prefrontal cortex: A combined functional magnetic resonance imaging and transcranial magnetic stimulation study. Journal of Cognitive Neuroscience, 15, 71–84.
Devlin, J. T., & Watkins, K. E. (2007). Stimulating language: Insights from TMS. Brain, 130, 610–622.
Fadiga, L., Craighero, L., Buccino, G., & Rizzolatti, G. (2002). Speech listening specifically modulates the excitability of tongue muscles: A TMS study. European Journal of Neuroscience, 15, 399–402.
Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin & Review, 13, 361–377.
Gallese, V., Gernsbacher, M. A., Heyes, C., Hickok, G., & Iacoboni, M. (2011). Mirror neuron forum. Perspectives on Psychological Science, 6, 369–407.
Gaskell, M. G., & Marslen-Wilson, W. D. (1997). Integrating form and meaning: A distributed model of speech perception. Language and Cognitive Processes, 12, 613–656.
Gaskell, M. G., Quinlan, P. T., Tamminen, J., & Cleland, A. A. (2008). The nature of phoneme representation in spoken word recognition. Journal of Experimental Psychology, 137, 282–302.
Gouws, A., Woods, W., Millman, R., Morland, A., & Green, G. (2009). DataViewer3D: An open-source, cross-platform multi-modal neuroimaging data visualization tool. Front Neuroinformatics, 3, 9.
Halle, M., & Stevens, K. N. (1962). Speech recognition: A model and a program for research. IRE Transactions of the Professional Group on Information Theory, 8, 155–159.
Hawkins, S. (2003). Roles and representations of systematic fine phonetic detail in speech understanding. Journal of Phonetics, 31, 373–405.
Hickok, G. (2009). The functional neuroanatomy of language. Physics of Life Reviews, 6, 121–143.
Hickok, G., & Poeppel, D. (2000). Towards a functional neuroanatomy of speech perception. Trends in Cognitive Sciences, 4, 131–138.
Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: A framework for understanding aspects of the functional anatomy of language. Cognition, 92, 67–99.
Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8, 393–402.
Hoffman, P., Pobric, G., Drakesmith, M., & Lambon Ralph, M. A. (2011). Posterior middle temporal gyrus is involved in verbal and non-verbal semantic cognition: Evidence from rTMS. Aphasiology, 26, 1119–1130.
Ishibashi, R., Lambon Ralph, M. A., Saito, S., & Pobric, G. (2011). Different roles of lateral anterior temporal lobe and inferior parietal lobule in coding function and manipulation tool knowledge: Evidence from an rTMS study. Neuropsychologia, 49, 1128–1135.
Jacquemot, C., & Scott, S. K. (2006). What is the relationship between phonological short-term memory and speech processing? Trends in Cognitive Sciences, 10, 480–486.
Lambon Ralph, M. A., Pobric, G., & Jefferies, E. (2009). Conceptual knowledge is underpinned by the temporal pole bilaterally: Convergent evidence from rTMS. Cerebral Cortex, 19, 832–838.
Liberman, A. M., Cooper, F. S., Shankwei, D. P., & Studdert, M. (1967). Perception of speech code. Psychological Review, 74, 431–461.
Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech-perception revised. Cognition, 21, 1–36.
Marslen-Wilson, W., & Warren, P. (1994). Levels of perceptual representation and process in lexical access: Words, phonemes, and features. Psychological Review, 101, 653–675.
McMurray, B., Tanenhaus, M. K., & Aslin, R. N. (2009). Within-category VOT affects recovery from “lexical” garden-paths: Evidence against phoneme-level inhibition. Journal of Memory and Language, 60, 65–91.
Meister, I. G., Wilson, S. M., Deblieck, C., Wu, A. D., & Iacoboni, M. (2007). The essential role of premotor cortex in speech perception. Current Biology, 17, 1692–1696.
Miceli, G., Gainotti, G., Caltagirone, C., & Masullo, C. (1980). Some aspects of phonological impairment in aphasia. Brain and Language, 11, 159–169.
Moineau, S., Dronkers, N. F., & Bates, E. (2005). Exploring the processing continuum of single-word comprehension in aphasia. Journal of Speech, Language, and Hearing Research, 48, 884–896.
Morais, J., & Kolinsky, R. (1994). Perception and awareness in phonological processing: The case of the phoneme. Cognition, 50, 287–297.
Mottonen, R., Dutton, R., & Watkins, K. E. (2013). Auditory–motor processing of speech sounds. Cerebral Cortex, 23, 1190–1197.
Mottonen, R., & Watkins, K. E. (2009). Motor representations of articulators contribute to categorical perception of speech sounds. Journal of Neuroscience, 29, 9819–9825.
Niyazov, D. M., Butler, A. J., Kadah, Y. M., Epstein, C. M., & Hu, X. P. (2005). Functional magnetic resonance imaging and transcranial magnetic stimulation: Effects of motor imagery, movement and coil orientation. Clinical Neurophysiology, 116, 1601–1610.
Okada, K., & Hickok, G. (2006). Left posterior auditory-related cortices participate both in speech perception and speech production: Neural overlap revealed by fMRI. Brain and Language, 98, 112–117.
Osnes, B., Hugdahl, K., & Specht, K. (2011). Effective connectivity analysis demonstrates involvement of premotor cortex during speech perception. Neuroimage, 54, 2437–2445.
Pobric, G., Jefferies, E., & Lambon Ralph, M. A. (2007). Anterior temporal lobes mediate semantic representation: Mimicking semantic dementia by using rTMS in normal participants. Proceedings of the National Academy of Sciences, U.S.A., 104, 20137–20141.
Pobric, G., Lambon Ralph, M. A., & Jefferies, E. (2009). The role of the anterior temporal lobes in the comprehension of concrete and abstract words: rTMS evidence. Cortex, 45, 1104–1110.
Pulvermuller, F., & Fadiga, L. (2010). Active perception: Sensorimotor circuits as a cortical basis for language. Nature Reviews Neuroscience, 11, 351–360.
Pulvermuller, F., Huss, M., Kherif, F., Moscoso del Prado Martin, F., Hauk, O., & Shtyrov, Y. (2006). Motor cortex maps articulatory features of speech sounds. Proceedings of the National Academy of Sciences, U.S.A., 103, 7865–7870.
Rauschecker, J. P., & Scott, S. K. (2009). Maps and streams in the auditory cortex: Nonhuman primates illuminate human speech processing. Nature Neuroscience, 12, 718–724.
Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27, 169–192.
Rogalsky, C., Love, T., Driscoll, D., Anderson, S. W., & Hickok, G. (2011). Are mirror neurons the basis of speech perception? Evidence from five cases with damage to the purported human mirror system. Neurocase, 17, 178–187.
Roy, A. C., Craighero, L., Fabbri-Destro, M., & Fadiga, L. (2008). Phonological and lexical motor facilitation during speech listening: A transcranial magnetic stimulation study. Journal of Physiology-Paris, 102, 101–105.
Sato, M., Tremblay, P., & Gracco, V. L. (2009). A mediating role of the premotor cortex in phoneme segmentation. Brain and Language, 111, 1–7.
Saur, D., Kreher, B. W., Schnell, S., Kümmerer, D., Kellmeyer, P., Vry, M. S., et al. (2008). Ventral and dorsal pathways for language. Proceedings of the National Academy of Sciences, U.S.A., 105, 18035–18040.
Scott, S. K. (2005). Auditory processing: Speech, space and auditory objects. Current Opinion in Neurobiology, 15, 197–201.
Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. S. (2000). Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400–2406.
Scott, S. K., & Johnsrude, I. S. (2003). The neuroanatomical and functional organization of speech perception. Trends in Neurosciences, 26, 100–107.
Scott, S. K., McGettigan, C., & Eisner, F. (2009). A little more conversation, a little less action: Candidate roles for the motor cortex in speech perception. Nature Reviews Neuroscience, 10, 295–302.
Seghier, M. L., Lazeyras, F., Pegna, A. J., Annoni, J.-M., Zimine, I., Mayer, E., et al. (2004). Variability of fMRI activation during a phonological and semantic language task in healthy subjects. Human Brain Mapping, 23, 140–155.
Sharp, D. J., Awad, M., Warren, J. E., Wise, R. J. S., Vigliocco, G., & Scott, S. K. (2010). The neural response to changing semantic and perceptual complexity during language processing. Human Brain Mapping, 31, 365–377.
Spitsyna, G., Warren, J. E., Scott, S. K., Turkheimer, F. E., & Wise, R. J. (2006). Converging language streams in the human temporal lobe. Journal of Neuroscience, 26, 7328–7336.
Uppenkamp, S., Johnsrude, I. S., Norris, D., Marslen-Wilson, W., & Patterson, R. D. (2006). Locating the initial stages of speech-sound processing in human temporal cortex. Neuroimage, 31, 1284–1296.
Vigneau, M., Beaucousin, V., Herve, P. Y., Duffau, H., Crivello, F., Houde, O., et al. (2006). Meta-analyzing left hemisphere language areas: Phonology, semantics, and sentence processing. Neuroimage, 30, 1414–1432.
Walsh, V., & Cowey, A. (2000). Transcranial magnetic stimulation and cognitive neuroscience. Nature Reviews Neuroscience, 1, 73–80.
Watkins, K. E., & Paus, T. (2004). Modulation of motor excitability during speech perception: The role of Broca's area. Journal of Cognitive Neuroscience, 16, 978–987.
Watkins, K. E., Strafella, A. P., & Paus, T. (2003). Seeing and hearing speech excites the motor system involved in speech production. Neuropsychologia, 41, 989–994.
Whitney, C., Kirk, M., O'Sullivan, J., Lambon Ralph, M. A., & Jefferies, E. (2011). The neural organization of semantic control: TMS evidence for a distributed network in left inferior frontal and posterior middle temporal gyrus. Cerebral Cortex, 21, 1066–1075.
Whitney, C., Kirk, M., O'Sullivan, J., Lambon Ralph, M. A., & Jefferies, E. (2012). Executive semantic processing is underpinned by a large-scale neural network: Revealing the contribution of left prefrontal, posterior temporal, and parietal cortex to controlled retrieval and selection using TMS. Journal of Cognitive Neuroscience, 24, 133–147.
Wilson, S. M., Saygin, A. P., Sereno, M. I., & Iacoboni, M. (2004). Listening to speech activates motor areas involved in speech production. Nature Neuroscience, 7, 701–702.
Yuen, I., Davis, M. H., Brysbaert, M., & Rastle, K. (2009). Activation of articulatory information in speech perception. Proceedings of the National Academy of Sciences, U.S.A., 107, 592–597.
Zatorre, R. J., Evans, A. C., Meyer, E., & Gjedde, A. (1992). Lateralization of phonetic and pitch discrimination in speech processing. Science, 256, 846–849.
Zatorre, R. J., Meyer, E., Gjedde, A., & Evans, A. C. (1996). PET studies of phonetic processing of speech: Review, replication, and reanalysis. Cerebral Cortex, 6, 21–30.
Zevin, J. D., & McCandliss, B. D. (2005). Dishabituation of the BOLD response to speech sounds. Behavioral and Brain Functions, 1, 1–12.