Abstract

Models of speech production posit a role for the motor system, predominantly the posterior inferior frontal gyrus, in encoding complex phonological representations for speech production, at the phonemic, syllable, and word levels [Roelofs, A. A dorsal-pathway account of aphasic language production: The WEAVER++/ARC model. Cortex, 59(Suppl. C), 33–48, 2014; Hickok, G. Computational neuroanatomy of speech production. Nature Reviews Neuroscience, 13, 135–145, 2012; Guenther, F. H. Cortical interactions underlying the production of speech sounds. Journal of Communication Disorders, 39, 350–365, 2006]. However, phonological theory posits subphonemic units of representation, namely phonological features [Chomsky, N., & Halle, M. The sound pattern of English, 1968; Jakobson, R., Fant, G., & Halle, M. Preliminaries to speech analysis. The distinctive features and their correlates. Cambridge, MA: MIT Press, 1951], that specify independent articulatory parameters of speech sounds, such as place and manner of articulation. Therefore, motor brain systems may also incorporate phonological features into speech production planning units. Here, we add support for such a role with an fMRI experiment of word sequence production using a phonemic similarity manipulation. We adapted and modified the experimental paradigm of Oppenheim and Dell [Oppenheim, G. M., & Dell, G. S. Inner speech slips exhibit lexical bias, but not the phonemic similarity effect. Cognition, 106, 528–537, 2008; Oppenheim, G. M., & Dell, G. S. Motor movement matters: The flexible abstractness of inner speech. Memory & Cognition, 38, 1147–1160, 2010]. Participants silently articulated words cued by sequential visual presentation that varied in degree of phonological feature overlap in consonant onset position: high overlap (two shared phonological features; e.g., /r/ and /l/) or low overlap (one shared phonological feature, e.g., /r/ and /b/). We found a significant repetition suppression effect in the left posterior inferior frontal gyrus, with increased activation for phonologically dissimilar words compared with similar words. These results suggest that phonemes, particularly phonological features, are part of the planning units of the motor speech system.

INTRODUCTION

Speech production in everyday conversation involves planning over multiple hierarchical levels of analysis, including phonemic, syllabic, morphemic/lexical, phrasal/sentential, and discourse levels (Levelt, Roelofs, & Meyer, 1999; Dell & O'Seaghdha, 1992; Garrett, 1975; Fromkin, 1971). Much of the work on the neural basis of speech production has focused on the phonemic level, aiming to identify where and how phonemes or sequences of phonemes (syllables) are represented (Dell, 2013; Peeva et al., 2010; Bohland & Guenther, 2006; Wilson, Saygin, Sereno, & Iacoboni, 2004). According to standard linguistic accounts, however, phonemes are not holistic units but rather sets of phonological features (Chomsky & Halle, 1968; Jakobson, Fant, & Halle, 1951) with independent articulatory parameters such as place and manner of articulation. Phonological features are integral to phonological theory because they define relevant groupings of speech sounds for various phonological processes. Our aim here was to use fMRI to investigate the neural representation of phonological features during speech production, with particular emphasis on determining where in the cortical hierarchy they might be found, if at all.

Previous research on phonological processes more broadly has identified a set of candidate regions where one might expect to find more fine-grained neural codes of phonological features. The broader network includes the inferior frontal gyrus, premotor cortex, (pre-)SMA, ventral motor cortex, anterior insula, and the temporal–parietal junction, including area Spt (Deschamps, Baum, & Gracco, 2015; Roelofs, 2014; Dell, 2013; Tourville & Guenther, 2011; Guenther, 2006; Indefrey & Levelt, 2004; Hickok, Buchsbaum, Humphries, & Muftuler, 2003; Buchsbaum, Hickok, & Humphries, 2001; Dronkers, 1996). Some studies have identified regions in primary sensorimotor cortex corresponding to lip, tongue, jaw, and larynx effectors using direct cortical recordings and noninvasive functional activation methods (Strijkers, Costa, & Pulvermüller, 2017; Conant, Bouchard, & Chang, 2014; Mesgarani, Cheung, Johnson, & Chang, 2014; Bouchard, Mesgarani, Johnson, & Chang, 2013; Simonyan & Horwitz, 2011; Chang et al., 2010; Pulvermüller et al., 2006), but it is not entirely clear whether these maps represent purely incidental sensory or motor activations associated with producing and perceiving the sensory effects of articulation (cf. Chang et al., 2010, who identified categorical responses in auditory cortex). Crucially, phonological features are theorized to be abstract computational units of speech, rather than isolated, continuous patterns of sensory stimulation or motor configurations. To illustrate this, in English there is a continuum of possible motor tract configurations associated with the sound /t/, each with slight variations that might produce distinct motor/somatosensory activation, yet all of these variations correspond to the same set of phonemic features. Thus, it is important to more clearly identify neural responses belonging to these more abstract computational units.

Here we approach the task of trying to identify phonological features using repetition suppression during fMRI recording (rsFMRI). Repetition suppression is the reduced BOLD response in fMRI studies due to the overlap of information across repeated stimuli (Grill-Spector, Henson, & Martin, 2006). If a particular brain region shows a repetition suppression effect for a stimulus characteristic, this is taken as evidence that this brain region encodes information about this characteristic (Barron, Garvert, & Behrens, 2016). Repetition suppression designs allow for precise targeting of particular representational levels without changing task demands. In language research, there have been several rsFMRI studies targeting speech perception (Vaden, Muftuler, & Hickok, 2010; Hasson, Skipper, Nusbaum, & Small, 2007).

In speech production, we are not aware of any explicitly designed rsFMRI experiments. However, two studies (Tremblay, Sato, & Deschamps, 2017; Bohland & Guenther, 2006) employed a design that included a kind of repetition suppression as part of a manipulation of syllable sequence complexity. They compared the production of three identical syllables (e.g., ta-ta-ta) to three different syllables (e.g., ka-ru-ti) and report similar results: effects of syllable sequence complexity in several frontal-parietal regions, including the left posterior inferior frontal gyrus (pIFG). Presumably, some of this effect is being driven by repetition suppression in the same syllable condition relative to the different syllables condition. However, it is impossible in these studies to distinguish the repetition effect from the effect of sequence complexity, that is, the difficulty involved in sequencing distinct syllables as compared with identical syllables.

All of these studies involve repetition of identical speech sounds across conditions. However, given that phonemes are theoretically composed of sets of articulatory features (Chomsky & Halle, 1968; Jakobson et al., 1951), it should be possible to observe subphonemic effects of repetition suppression in the motor system. For instance, the speech sounds /p/ and /b/, although distinct phonemes, share two features: place (bilabial) and manner (stop). Therefore, these speech sounds are more phonologically similar to each other than /p/ and /g/, which only share the manner feature (stop). This suggests that phonological features themselves, as components of higher-order phonological representations, should be susceptible to repetition effects and that repetition suppression should be seen in brain areas encoding representations that incorporate phonological features. Identifying repetition effects for phonemic features was the goal of the present rsFMRI study.
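The notion of feature overlap in the /p/, /b/, /g/ example can be made concrete with a small sketch. The three-feature table below (voicing, place, manner) is a simplified assumption for illustration only; it is not the feature system or the stimulus metrics of Oppenheim and Dell (2008), and the function name is hypothetical.

```python
# Simplified, illustrative feature table (an assumption, not the study's metrics).
FEATURES = {
    "p": {"voicing": "voiceless", "place": "bilabial", "manner": "stop"},
    "b": {"voicing": "voiced", "place": "bilabial", "manner": "stop"},
    "g": {"voicing": "voiced", "place": "velar", "manner": "stop"},
}


def shared_features(a, b):
    """Return the sorted names of features whose values match for two phonemes."""
    fa, fb = FEATURES[a], FEATURES[b]
    return sorted(name for name in fa if fa[name] == fb[name])


print(shared_features("p", "b"))  # ['manner', 'place']
print(shared_features("p", "g"))  # ['manner']
```

Under this toy table, /p/ and /b/ share two features and /p/ and /g/ share one, which mirrors the high-overlap versus low-overlap distinction described above.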

Although repetition of perceptual features affects performance (Huber, Tian, Curran, O'Reilly, & Woroch, 2008; Henson & Rugg, 2003; Schacter & Buckner, 1998), underlying the neural repetition suppression effect (Grill-Spector et al., 2006), additional evidence from speech production indicates that feature similarity can induce speech errors. Spoonerisms and tongue twisters illustrate this, as in "she sells sea shells down by the seashore." Tongue twisters are difficult not because of the actual motor difficulty of the sounds in question but because of phonological similarity among the speech sounds. In the example above, the sounds /s/ and /ʃ/ (the “sh” sound) cause such difficulty because they share high feature overlap (voice and manner features) and occupy onset position in each word. The articulatory difficulty caused by phonological similarity arises during predictive planning of the utterance, when consonants intended to occupy the same positions in different words are confused (Dell, 1995; Fromkin, 1971). Thus, the phonological similarity effect elicited by tongue twisters occurs because speech sounds that are related in their phonological representations interfere with speech planning. The possibility of speech errors due to phonological similarity complicates potential experiments on phonological feature repetition suppression, as the propensity to produce speech errors or to correct articulation results in increased activation in motor speech regions (Tourville, Reilly, & Guenther, 2008; Bohland & Guenther, 2006; Guenther, 2006). The propensity to produce speech errors due to phonological feature repetition might cancel or even reverse effects of repetition suppression.

To demonstrate phonological feature repetition suppression but avoid inducing speech errors, we adapted the materials of Oppenheim and Dell (2008, 2010) and modified their stimulus presentation. The goal of the Oppenheim and Dell studies was to induce different amounts of speech errors through different degrees of phonological feature similarity across four-word tongue twister stimuli. Crucially, their experiment relied on the simultaneous presentation of all four words to induce interfering effects of phonological planning across words. To retain the phonological feature overlap manipulation for repetition suppression but prevent interference in phonological planning, we modified their experimental design and presented each word in isolation, asking participants to produce each word before the presentation of subsequent words. Because each word had to be produced in isolation, all stages of encoding, planning, and production were completed in isolation from the presentation of future words, preventing potential interfering effects of phonological similarity. This design allowed us to investigate which regions would exhibit repetition suppression for phonological features with little risk of speech errors due to interfering effects of phonological similarity.

We used silent articulation (rather than overt articulation) because we did not want to inadvertently obtain activation in the temporal lobe due to participants hearing their own speech. We were confident that silent articulation would adequately activate phonological representations given the results of Oppenheim and Dell (2010), who observed an equivalent phonemic similarity effect for overt and silent articulation, but no such effect for purely imagined speech without any articulation. This suggests that phonological representations are equally active during production, whether vocalized or not, as long as participants move their vocal tract effectors. In addition, we did not attempt to distinguish among different phonemic feature parameters such as voicing or place of articulation. Given that the feature contrast in the Oppenheim and Dell studies was extremely subtle, that is, 1 feature different (similar condition) vs. 2 features different (dissimilar condition), we decided to maximize statistical power and broadly investigate the neuroanatomy of phonemic features. The framework of Hickok (2012, 2014a) posits that phonemic features are mostly in the domain of the motor system; in accord with this hypothesis, we predicted that phonemic feature repetition suppression would produce effects in the motor system, particularly the pIFG and premotor cortex.

METHODS

Participants

Twenty-five participants (17 women) aged between 18 and 40 years (mean age = 21.2 years, SD = 4 years) were recruited from the University of California, Irvine, community and received monetary compensation for their time. The volunteers were right-handed, native English speakers with normal or corrected-to-normal vision, no known history of neurological disease, and no other contraindications for MRI. Informed consent was obtained from each participant before the study in accordance with guidelines from the University of California, Irvine, institutional review board, which approved this study. Four participants were omitted from data analysis: one participant made excessive errors on the task (>47% error rate and >22% no responses), one participant failed to follow directions, and images from two participants contained streaking artifacts. The remaining 21 participants were included in the final analyses.

Stimuli and Task

Participants viewed sequences of tongue twister stimuli, presented in rapid succession one word at a time, and were asked to silently articulate each word. Thirty-two sets of tongue twisters known to behaviorally elicit a phonemic similarity effect were used in the study (see Table 1). Each set consisted of a four-word sequence. Oppenheim and Dell (2008) created lists of words in which the onsets of the words either shared two features, for example, voicing and manner of articulation (similar condition), or shared only one feature, for example, voicing only (dissimilar condition). The specific metrics of the stimuli are described elsewhere (Oppenheim & Dell, 2008).

Table 1. 

List of Tongue Twister Stimuli Used in the Experiment

Similar | Dissimilar
nod mod mock knob | jog mod mock job
yore wan watt yawn | gore wan watt gone
log rob rock lot | jog rob rock jot
daft gab gash dam | raft gab gash ram
zoom nab nap czar | goon nab nap gar
nab match mat nerve | van match mat verve
dull budge buck dove | lull budge buck love
pub bust bun puff | tub bust bun tough
gun bulb but gull | nun bulb but null
pen bunk bus pair | than bunk bus there
lust rum rug lump | just rum rug jump
mine bikes bite mice | nine bikes bite nice
pies tile tine pyre | shies tile tine shire
bike wild wise bile | guide wild wise guile
zing that then zed | ding that then dead
weld yell yet when | meld yell yet men
paid cage cane pace | chafe cage cane chase
bane gave gang bait | rein gave gang rate
name make mail nag | vane make mail vague
sage tame take sale | beige tame take bale
wing bib bit whip | zing bib bit zip
six finch fill sin | chicks finch fill chin
singe fib fit sip | hinge fib fit hip
sing hitch hill sick | king hitch hill kick
chip jilt gin chill | bib jilt gin bill
size tide tight sign | five tide tight fine
jail cheek cheap jean | kale cheek cheap keen
zinc niece kneel zest | verb niece kneel vest
lean reed reef leech | bean reed reef beech
bees wheat week big | geese wheat week gig
home sole soap hone | loam sole soap lone
poke tote toll pose | folk tote toll foes
nod mod mop knob | jog mod mop job
yore wan wok yawn | gore wan wok gone
log rob rod lot | jog rob rod jot
daft gab gas dam | raft gab gas ram
zoom nab knack czar | goon nab knack gar
nab match mass nerve | van match mass verve
dull budge but dove | lull budge but love
pub bust bum puff | tub bust bum tough
gun bulb buck gull | nun bulb buck null
pen bunk buzz pair | than bunk buzz there
lust rum rub lump | just rum rub jump
mine bikes bide mice | nine bikes bide nice
pies tile time pyre | shies tile time shire
bike wild wine bile | guide wild wine guile
zing that them zed | ding that them dead
weld yell yep when | meld yell yep men
paid cage came pace | chafe cage came chase
bane gave game bait | rein gave game rate
name make maid nag | vane make maid vague
sage tame tape sale | beige tame tape bale
wing bib bid whip | zing bib bid zip
six finch fizz sin | chicks finch fizz chin
singe fib fish sip | hinge fib fish hip
sing hitch his sick | king hitch his kick
chip jilt gym chill | bib jilt gym bill
size tide type sign | five tide type fine
jail cheek cheat jean | kale cheek cheat keen
zinc niece need zest | verb niece need vest
lean reed wreath leech | bean reed wreath beech
bees wheat wheel big | geese wheat wheel gig
home sole soak hone | loam sole soak lone
poke tote towed pose | folk tote towed foes

In the present experiment, words in each list were presented to participants one at a time at a rate of 400 msec per word separated by 100 msec of a blank screen. Participants were instructed to rapidly articulate each monosyllabic word as soon as it appeared on screen. After articulating each sequence, the participant indicated whether or not he or she correctly articulated the sequence using a button box. Word lists were separated by a fixation that varied in length with an SOA of 3.5–9 sec. An example of a trial is illustrated in Figure 1.
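The timing of a single trial can be sketched as follows. The word and blank durations and the 3.5–9 sec fixation range come from the text; the uniform sampling of the fixation duration, the function names, and the random seed are assumptions made for illustration, because the distribution of the jitter is not reported.

```python
import random

WORD_MS = 400   # word display duration (from the text)
BLANK_MS = 100  # blank screen between words (from the text)


def word_onsets_ms(n_words=4):
    """Onset of each word relative to the start of the sequence, in msec."""
    return [i * (WORD_MS + BLANK_MS) for i in range(n_words)]


def jittered_fixation_s(rng, low=3.5, high=9.0):
    """Draw one fixation duration in seconds; uniform sampling is an assumption."""
    return rng.uniform(low, high)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
print(word_onsets_ms())  # [0, 500, 1000, 1500]
fixation_s = jittered_fixation_s(rng)
```

With these values, the four words of a list begin 500 msec apart, so the last word appears 1.5 sec after the first.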

Figure 1. 

Example of a single trial. Words were presented in rapid succession separated by 100-msec intervals. Participants articulated the words as they streamed on screen. Each sequence was separated by a fixation that remained on screen for 3.5–9 sec.

Each run comprised 64 trials, and the study comprised six runs. Half of the word lists were from the similar condition, and half were from the dissimilar condition. The word lists were presented in random order to each participant. The study started with a short practice run of eight trials to familiarize participants with the task. Participants were scanned during the practice run to acclimatize them to the fMRI environment. The study ended with a high-resolution structural scan, and the entire experiment was 1 hr in length. Stimulus presentation and timing were controlled using Cogent software (www.vislab.ucl.ac.uk/cogent_2000.php) implemented in Matlab 7.1 (Mathworks, Inc.) running on a dual-core IBM Thinkpad laptop.

Imaging

MR images were obtained in a Philips Achieva 3T (Philips Medical Systems) fitted with an eight-channel radio frequency receiver head coil at the Research Imaging Center scanning facility at the University of California, Irvine. Images during the experimental runs were collected using Fast Echo EPI (sense reduction factor = 1.7, matrix = 112 × 112, repetition time = 2.5 sec, echo time = 25 msec, voxel size = 2.5 mm × 2.5 mm × 2.5 mm, flip angle = 90°). A total of 840 EPIs were collected over six runs, and 40 slices provided whole-brain coverage. After the functional scans, a high-resolution T1-weighted anatomical image was acquired with an MPRAGE pulse sequence in the axial plane (matrix = 256 × 256, repetition time = 8 msec, echo time = 3.6 msec, flip angle = 8°, voxel size = 1 × 1 × 1 mm).
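As a quick check on the acquisition figures, the number of volumes per run and the approximate run duration can be worked out from the totals above. This sketch assumes the 840 volumes were divided equally across the six runs, which the text does not state explicitly.

```python
TOTAL_VOLUMES = 840  # EPI volumes across all experimental runs (from the text)
N_RUNS = 6           # number of experimental runs (from the text)
TR_S = 2.5           # repetition time in seconds (from the text)

volumes_per_run = TOTAL_VOLUMES // N_RUNS  # assumes equal-length runs
run_duration_s = volumes_per_run * TR_S

print(volumes_per_run, run_duration_s)  # 140 350.0
```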

Data Analysis

Data preprocessing and analyses were performed using AFNI software (Cox, 1996). First, motion correction was performed by creating a mean image from all of the volumes in the experiment and then realigning all volumes to that mean image using a six-parameter rigid body model. The images were then high-pass filtered at 0.008 Hz and spatially smoothed with an isotropic 8 mm FWHM Gaussian kernel. The anatomical image for each participant was coregistered to his or her mean EPI image. Data analysis proceeded in two steps. First, regression analysis was performed at the single participant level, parameter estimates for events of interest were obtained, and then these data were transformed into standardized space for group-level analysis.

First-level analysis was performed on the time course of each voxel's BOLD response for each participant using AFNI software (Cox, 1996). Regression analysis was performed using AFNI's 3dDeconvolve function, and the regressors were created by convolving the predictor variables representing the time course of stimulus presentation with a gamma variate function. A total of nine regressors were entered in the model. The first regressor corresponded to the presentation of phonemically similar words. The second regressor corresponded to presentation of phonemically dissimilar words. The third regressor corresponded to all of the trials in which the participants reported making an error. Thus, the regressors modeling the articulation of phonemically similar and dissimilar words only include trials in which participants report accurately articulating the sequence of words. An additional six regressors representing the motion parameters determined during the realignment stage of processing were entered into the model. Parameter estimates for the events of interest were obtained, and statistical maps were created.
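How a condition regressor of this kind is built can be illustrated with a minimal sketch. It is not AFNI's implementation. The gamma variate shape parameters (p = 8.6, q = 0.547), the 2-sec event duration, the 0.1-sec sampling grid, the onsets, and all function names are assumptions for illustration; the paper does not report these details.

```python
import math

TR_S = 2.5  # repetition time from the Imaging section


def gamma_variate(t, p=8.6, q=0.547):
    """Gamma variate response scaled to peak at 1 when t == p * q.

    The p and q values are assumed; the paper does not report them.
    """
    if t <= 0:
        return 0.0
    return (t / (p * q)) ** p * math.exp(p - t / q)


def make_regressor(onsets_s, n_volumes, duration_s=2.0, dt=0.1):
    """Convolve a boxcar of events with the gamma variate, then sample once per TR."""
    n_fine = int(round(n_volumes * TR_S / dt))
    boxcar = [0.0] * n_fine
    for onset in onsets_s:
        start = int(round(onset / dt))
        stop = int(round((onset + duration_s) / dt))
        for i in range(start, min(stop, n_fine)):
            boxcar[i] = 1.0
    # Response kernel evaluated over 20 seconds on the fine time grid.
    kernel = [gamma_variate(k * dt) for k in range(int(round(20.0 / dt)))]
    fine = [0.0] * n_fine
    for i, value in enumerate(boxcar):
        if value:
            for k, h in enumerate(kernel):
                if i + k < n_fine:
                    fine[i + k] += value * h * dt
    step = int(round(TR_S / dt))
    return fine[::step][:n_volumes]


# Hypothetical onsets (in seconds) for a few "similar" trials in one run.
similar = make_regressor([10.0, 40.0, 75.0], n_volumes=140)
```

In a full model, one such column per condition (similar, dissimilar, error trials) would sit alongside the six motion parameters, giving the nine regressors described above.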

For group-level analysis, these statistical maps for each participant were transformed into standardized space (Talairach & Tournoux, 1988) using a Talairach template and resampled into 2 mm³ voxels. First, a group-level t test was performed to identify regions activated during articulation across both conditions. This was done to ensure that our activation maps associated with speech articulation were consistent with previous studies. To identify brain regions showing a phonological repetition suppression effect, group-level voxel-wise t tests were performed to identify areas significantly more activated for dissimilar > similar, that is, regions showing significantly less activation for increased phonological repetition. To identify potential effects of repetition enhancement, the opposite contrast (similar > dissimilar) was performed, that is, regions showing significantly more activation for increased phonological repetition. Multiple testing correction was performed at the cluster level: 3dFWHMx was used to estimate the smoothness of the noise, and 3dClustSim (Cox & Jesmanowicz, 1999) was used to calculate the cluster-wise significance level, yielding a combined minimum cluster size (27 voxels) and p threshold (p < .001).
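The final step, keeping only clusters that reach the minimum size, can be illustrated with a toy sketch. It does not reproduce the simulation that determines the 27-voxel minimum; it only shows how a size threshold is applied to voxels that already pass the p threshold. Face-connectivity between voxels and the function names are assumptions, because the neighbor definition is not reported.

```python
from collections import deque

# The six face-adjacent neighbor offsets (an assumed neighbor definition).
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def find_clusters(voxels):
    """Group suprathreshold voxel coordinates into face-connected clusters."""
    remaining = set(voxels)
    found = []
    while remaining:
        seed = remaining.pop()
        cluster, queue = [seed], deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBORS:
                neighbor = (x + dx, y + dy, z + dz)
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    cluster.append(neighbor)
                    queue.append(neighbor)
        found.append(cluster)
    return found


def surviving_clusters(voxels, min_size=27):
    """Keep only clusters with at least min_size voxels."""
    return [c for c in find_clusters(voxels) if len(c) >= min_size]


# Toy data: a 3 x 3 x 3 block (27 voxels) plus one isolated voxel.
block = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
kept = surviving_clusters(block + [(10, 10, 10)])
print(len(kept), len(kept[0]))  # 1 27
```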

RESULTS

Consistent with previous research, subvocal speech articulation yielded activation in a peri-sylvian network of bilateral inferior frontal cortex, superior and middle temporal gyri, and left insula. These regions are consistently reported in speech production studies generally, both covert and overt (Tremblay et al., 2017; Deschamps et al., 2015; Roelofs, 2014; McGettigan et al., 2011; Tourville & Guenther, 2011; Bohland & Guenther, 2006; Indefrey & Levelt, 2004; Hickok et al., 2003; Buchsbaum et al., 2001). Although there is some indication that subvocal speech production does not activate all areas to the same extent as overt speech in fMRI, particularly the motor and premotor cortex, the temporal lobe, the insula, and subcortical structures (Shuster & Lemieux, 2005), our experimental task activated the main speech production network identified by most researchers (Hickok, Houde, & Rong, 2011; Price, 2010; Rauschecker & Scott, 2009; Guenther, 2006). Figure 2 illustrates the group activation map, and Table 2 lists the peak Talairach coordinates of the significant clusters.

Figure 2. 

Rapid articulation. Map of regions significantly activated during rapid articulation of similar and dissimilar word lists (p < .05, corrected) compared with rest. Significant activations include the bilateral IFG, superior and middle temporal cortex, cerebellum, insula, occipital cortex, inferior parietal cortex (in the vicinity of area Spt), and dorsal and ventral precentral gyrus. n = 21.

Table 2. 

Talairach Coordinates for Regions Significantly Activated during Speech Articulation

Region | Peak Voxel x | y | z | Cluster Size | T Score
Bilateral inferior and middle occipital gyri includes bilateral cerebellum and cuneus/precuneus | −47 | −71 | −16 | 30057 | 5.43
Left superior temporal gyrus includes left frontal, precentral, and postcentral gyri | −59 | −2 | | 2587 | 7.36
Right precentral and postcentral gyri | 49 | −33 | 58 | 1167 | 3.83
Right middle temporal gyrus | 49 | −37 | | 574 | 5.43
Left precentral gyrus | −45 | −3 | 56 | 204 | 3.87
Bilateral cingulate | −1 | 23 | 26 | 186 | 3.83
Right middle frontal gyrus | 41 | 39 | 24 | 145 | 3.94
Left middle frontal gyrus | −39 | 45 | 22 | 144 | 3.95
Left supramarginal and angular gyri | −33 | −51 | 36 | 127 | 4.36
Right inferior parietal cortex | 41 | −63 | 54 | 110 | 3.96
Right superior frontal gyrus | 19 | 51 | −14 | 37 | 4.02
Left inferior frontal gyrus | −35 | 27 | −2 | 36 | 3.96
Right superior frontal gyrus | 37 | 45 | 30 | 36 | 4.03

Reported are peak coordinates for the significant clusters.

The contrast dissimilar > similar, which tests for repetition suppression, yielded one significant cluster (k = 29, T = 3.84) in the pIFG (Figure 3). Peak activation was observed in the sulcus on the border between the pars opercularis and pars triangularis ([−53, 13, 18]). The contrast similar > dissimilar yielded no significant voxels, indicating that there were no effects of phonological repetition enhancement. Our behavioral data are consistent with these results. A dependent samples t test was performed on the self-reported error rates, and it revealed a significant difference between phonologically similar and dissimilar sequences (p < .05). Participants were generally accurate on this task (overall correct response rate = 93.4%), but they made significantly more errors on phonologically dissimilar sequences (7.3%) than on similar sequences (5.9%).
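The dependent samples t test on error rates follows the standard paired formula, t = mean(d) / (SD(d) / sqrt(n)), where d is the per-participant difference between conditions. The sketch below uses made-up error rates for five hypothetical participants; they are not the study's data, and the function name is an assumption.

```python
import math
import statistics


def paired_t(a, b):
    """Dependent-samples t statistic and degrees of freedom for paired lists."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical per-participant error rates (percent); not the study's data.
dissimilar = [8.0, 6.5, 7.0, 9.0, 6.0]
similar = [6.0, 5.5, 6.5, 7.0, 5.0]
t_stat, df = paired_t(dissimilar, similar)
```

The resulting statistic would then be compared against the t distribution with n − 1 degrees of freedom to obtain a p value.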

Figure 3. 

Repetition suppression effect. Percent signal change of the one significant cluster for the contrast of [DISSIMILAR > SIMILAR] in the left IFG (p < .05, corrected) on the border of the pars opercularis and pars triangularis. Left: sagittal view; right: axial view. n = 21.

DISCUSSION

We found a repetition suppression effect for phonological features of consonants in onset position of words during silent speech production in one brain area: the pIFG. This finding supports the notion that the pIFG represents articulatory phonological features, consistent with previous experiments finding effects for the repetition of whole phonemes/syllables in this area (Vaden et al., 2010; Hasson et al., 2007; Bohland & Guenther, 2006). Our findings point toward a role for the pIFG in coding phonological feature parameters as part of articulatory planning units, rather than responding to the repetition of whole speech sounds or syllables without incorporating features. Note that this does not imply that the planning unit in the pIFG is limited to phonological features—rather that feature information is part of the relevant planning units. This supports the hypothesis of Hickok and colleagues (Hickok, 2012, 2014b) that articulatory phonological features are a component of the speech planning system in the motor system.

One potential limitation of our study is that the repetition of phonemic features is confounded with the repetition of lower-level phonetic and/or articulatory features, that is, more similarity of phonemic representations entails more similarity at lower levels of phonetic and articulatory representation. Thus, in principle, the repetition effect we observed in pIFG may reflect these lower levels of representations. Similarly, repetition of phonemic features is also confounded with reduced motor complexity of the speech sequences. However, we can constrain our interpretation of these results by noting that previous research strongly supports a higher-level function of the IFG. For instance, Basilakos, Rorden, Bonilha, Moser, and Fridriksson (2015) showed that apraxia of speech, a low-level articulatory speech motor disorder, involves damage to the pre- and postcentral gyrus and not the IFG, whereas aphasia, that is, a higher-level language disorder, is associated with the IFG and not the pre- and postcentral gyrus. Similarly, an electrocorticography study by Flinker et al. (2015) showed that the IFG is not active during speech articulation, as would be expected for articulatory features, but is rather activated before the onset of speech, suggestive of a higher-level speech planning role at the phonemic level of representation. Consistent with this, other studies have shown that the IFG does not respond to motor complexity itself (Tremblay et al., 2017; McGettigan et al., 2011; Tremblay & Small, 2011; Bohland & Guenther, 2006). Interestingly, we did not observe any repetition suppression effects in the premotor cortex that might be expected for the repetition of these lower levels of phonetic and/or articulatory representations. It may be the case that such articulatory features are always necessarily activated for speech articulation and are therefore less susceptible to repetition effects. Alternatively, it may be that these levels of representation were not strongly activated because participants articulated the words silently rather than aloud. In more naturalistic contexts when participants overtly vocalize, clearer effects in these regions might be observed.

The lack of a suppression effect in the temporal lobe in this study differs from previous studies of phonological processing in the STS. As mentioned above, a previous perception study of phonological repetition suppression (Vaden et al., 2010) found effects for repeated phonemes in bilateral STS in addition to the pIFG and a wider frontal–parietal network. The obvious difference between these studies is the task: listening in Vaden et al. (2010) and speaking in this study. Thus, unless phonological features were uniquely coded in the STS, an effect of feature repetition in the auditory system is not necessarily expected. Our results are also consistent with speech production experiments that found that repetition of three identical syllables during production resulted in repetition suppression in a broad network of motor brain regions as well as the postcentral gyrus and superior parietal cortex, but few or no effects in the superior or middle temporal lobe (Tremblay et al., 2017; Bohland & Guenther, 2006).

Two previous studies (Tremblay et al., 2017; Bohland & Guenther, 2006) implicitly identified phonological repetition suppression effects due to their sequence complexity manipulation for whole phonemes/syllables in a wide frontal motor and parietal network, whereas we found phonological feature repetition suppression effects only in the pIFG. The broader network identified in these studies likely reflects the wide range of similarity traversed, from identical to completely different syllables. Such a manipulation will modulate speech planning networks across a correspondingly broad range of levels from features to phonemes to syllables to sequences of syllables. Our study, conversely, spanned a much more restricted similarity range in that what was varied was only onset feature similarity of otherwise different syllables. This likely explains our much more restricted finding. In addition, the participants in these previous studies produced speech sounds aloud; this may have induced additional planning at lower-level phonetic representations that were not activated in our study because the participants articulated silently. This is broadly consistent with the results of Oppenheim and Dell (2008), who identified distinct error patterns associated with imagined versus articulated speech, suggesting that the degree of planning at different representational levels is influenced by whether participants imagine speech, articulate silently, or articulate out loud.

It is perhaps surprising that we did not find repetition suppression in lower-level motor cortices (e.g., M1), given that phonological features correspond to fairly low-level articulatory states. But in the context of this study where an effect would be observed only if a neural code has some memory trace for previous planned utterances, it perhaps makes sense that a higher-order region that has been implicated in sequence-level processes (Meyer, Obleser, Anwander, & Friederici, 2012; Gelfand & Bookheimer, 2003) would show sensitivity to feature similarity across distinct syllable utterances. This line of reasoning suggests that our study is not tapping into a level of planning that is coding individual phonological features, but a higher-level plan that incorporates feature-relevant information.

Conclusions

Phonological linguistic theory, particularly the concept of abstract phoneme representations, has played an important role in guiding behavioral and neuroscience research for decades. Therefore, the question of how to incorporate phonological theory into psychological models of speech processing is an important one. The Hickok model (Hickok, 2012, 2014b) treats phonemes as an important level of representation for speech production, that is, as part of a system that maps articulatory gestures onto speech sound targets. Here we have shown that a crucial component of phonological theory, phonological features, appears to be a part of motor planning units. These results are consistent with the Hickok model and suggest that it might be possible to explore deeper connections between phonological theory and motor control research. Future research could test sets of articulatory parameters other than those tested here, including attempts to identify whether specific phonemic parameters (e.g., voicing, manner, and place) are localized in distinct cortical locations, and could test phonological repetition suppression effects that cross syllable onset and coda positions.

Reprint requests should be sent to Gregory Hickok, Department of Cognitive Sciences, University of California, Irvine, Irvine, CA 92697-3800, or via e-mail: greg.hickok@uci.edu.

REFERENCES

Barron, H. C., Garvert, M. M., & Behrens, T. E. J. (2016). Repetition suppression: A means to index neural representations using BOLD? Philosophical Transactions of the Royal Society, Series B, Biological Sciences, 371, 20150355.

Basilakos, A., Rorden, C., Bonilha, L., Moser, D., & Fridriksson, J. (2015). Patterns of post-stroke brain damage that predict speech production errors in apraxia of speech and aphasia dissociate. Stroke, 46, 1561–1566.

Bohland, J. W., & Guenther, F. H. (2006). An fMRI investigation of syllable sequence production. Neuroimage, 32, 821–841.

Bouchard, K. E., Mesgarani, N., Johnson, K., & Chang, E. F. (2013). Functional organization of human sensorimotor cortex for speech articulation. Nature, 495, 327–332.

Buchsbaum, B. R., Hickok, G., & Humphries, C. (2001). Role of left posterior superior temporal gyrus in phonological processing for speech perception and production. Cognitive Science, 25, 663–678.

Chang, E. F., Rieger, J. W., Johnson, K., Berger, M. S., Barbaro, N. M., & Knight, R. T. (2010). Categorical speech representation in human superior temporal gyrus. Nature Neuroscience, 13, 1428–1432.

Chomsky, N., & Halle, M. (1968). The sound pattern of English. New York: Harper & Row.

Conant, D., Bouchard, K. E., & Chang, E. F. (2014). Speech map in the human ventral sensory-motor cortex. Current Opinion in Neurobiology, 24, 63–67.

Cox, R. W. (1996). AFNI: Software for analysis and visualization of functional magnetic resonance neuroimages. Computers in Biomedical Research, 29, 162–173.

Cox, R. W., & Jesmanowicz, A. (1999). Real-time 3D image registration for functional MRI. Magnetic Resonance in Medicine, 42, 1014–1018.

Dell, G. S. (1995). Speaking and misspeaking. In L. Gleitman & M. Liberman (Eds.), Invitation to cognitive science, part I, language (pp. 183–208). Cambridge, MA: MIT Press.

Dell, G. S. (2013). Cascading and feedback in interactive models of production: A reflection of forward modeling? Behavioral and Brain Sciences, 36, 351–352.

Dell, G. S., & O'Seaghdha, P. G. (1992). Stages of lexical access in language production. Cognition, 42, 287–314.

Deschamps, I., Baum, S. R., & Gracco, V. L. (2015). Phonological processing in speech perception: What do sonority differences tell us? Brain and Language, 149(Suppl. C), 77–83.

Dronkers, N. F. (1996). A new brain region for coordinating speech articulation. Nature, 384, 159–161.

Flinker, A., Korzeniewska, A., Shestyuk, A. Y., Franaszczuk, P. J., Dronkers, N. F., Knight, R. T., et al. (2015). Redefining the role of Broca's area in speech. Proceedings of the National Academy of Sciences, U.S.A., 112, 2871–2875.

Fromkin, V. A. (1971). The non-anomalous nature of anomalous utterances. Language, 47, 27–52.

Garrett, M. F. (1975). The analysis of sentence production. In G. H. Bower (Ed.), Psychology of learning and motivation (Vol. 9, pp. 133–177). New York: Academic Press.

Gelfand, J. R., & Bookheimer, S. Y. (2003). Dissociating neural mechanisms of temporal sequencing and processing phonemes. Neuron, 38, 831–842.

Grill-Spector, K., Henson, R., & Martin, A. (2006). Repetition and the brain: Neural models of stimulus-specific effects. Trends in Cognitive Sciences, 10, 14–23.

Guenther, F. H. (2006). Cortical interactions underlying the production of speech sounds. Journal of Communication Disorders, 39, 350–365.

Hasson, U., Skipper, J. I., Nusbaum, H. C., & Small, S. L. (2007). Abstract coding of audiovisual speech: Beyond sensory representation. Neuron, 56, 1116–1126.

Henson, R. N. A., & Rugg, M. D. (2003). Neural response suppression, haemodynamic repetition effects, and behavioural priming. Neuropsychologia, 41, 263–270.

Hickok, G. (2012). Computational neuroanatomy of speech production. Nature Reviews Neuroscience, 13, 135–145.

Hickok, G. (2014a). The architecture of speech production and the role of the phoneme in speech processing. Language and Cognitive Processes, 29, 2–20.

Hickok, G. (2014b). Toward an integrated psycholinguistic, neurolinguistic, sensorimotor framework for speech production. Language and Cognitive Processes, 29, 52–59.

Hickok, G., Buchsbaum, B., Humphries, C., & Muftuler, T. (2003). Auditory-motor interaction revealed by fMRI: Speech, music, and working memory in area Spt. Journal of Cognitive Neuroscience, 15, 673–682.

Hickok, G., Houde, J., & Rong, F. (2011). Sensorimotor integration in speech processing: Computational basis and neural organization. Neuron, 69, 407–422.

Huber, D. E., Tian, X., Curran, T., O'Reilly, R. C., & Woroch, B. (2008). The dynamics of integration and separation: ERP, MEG, and neural network studies of immediate repetition effects. Journal of Experimental Psychology: Human Perception and Performance, 34, 1389–1416.

Indefrey, P., & Levelt, W. J. (2004). The spatial and temporal signatures of word production components. Cognition, 92, 101–144.

Jakobson, R., Fant, G., & Halle, M. (1951). Preliminaries to speech analysis. The distinctive features and their correlates. Cambridge, MA: MIT Press.

Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1–38.

McGettigan, C., Warren, J. E., Eisner, F., Marshall, C. R., Shanmugalingam, P., & Scott, S. K. (2011). Neural correlates of sublexical processing in phonological working memory. Journal of Cognitive Neuroscience, 23, 961–977.

Mesgarani, N., Cheung, C., Johnson, K., & Chang, E. F. (2014). Phonetic feature encoding in human superior temporal gyrus. Science, 343, 1006–1010.

Meyer, L., Obleser, J., Anwander, A., & Friederici, A. D. (2012). Linking ordering in Broca's area to storage in left temporo-parietal regions: The case of sentence processing. Neuroimage, 62, 1987–1998.

Oppenheim, G. M., & Dell, G. S. (2008). Inner speech slips exhibit lexical bias, but not the phonemic similarity effect. Cognition, 106, 528–537.

Oppenheim, G. M., & Dell, G. S. (2010). Motor movement matters: The flexible abstractness of inner speech. Memory & Cognition, 38, 1147–1160.

Peeva, M. G., Guenther, F. H., Tourville, J. A., Nieto-Castanon, A., Anton, J.-L., Nazarian, B., et al. (2010). Distinct representations of phonemes, syllables, and supra-syllabic sequences in the speech production network. Neuroimage, 50, 626–638.

Price, C. J. (2010). The anatomy of language: A review of 100 fMRI studies published in 2009. Annals of the New York Academy of Sciences, 1191, 62–88.

Pulvermüller, F., Huss, M., Kherif, F., Moscoso del Prado Martin, F., Hauk, O., & Shtyrov, Y. (2006). Motor cortex maps articulatory features of speech sounds. Proceedings of the National Academy of Sciences, U.S.A., 103, 7865–7870.

Rauschecker, J. P., & Scott, S. K. (2009). Maps and streams in the auditory cortex: Nonhuman primates illuminate human speech processing. Nature Neuroscience, 12, 718–724.

Roelofs, A. (2014). A dorsal-pathway account of aphasic language production: The WEAVER++/ARC model. Cortex, 59(Suppl. C), 33–48.

Schacter, D. L., & Buckner, R. L. (1998). Priming and the brain. Neuron, 20, 185–195.

Shuster, L. I., & Lemieux, S. K. (2005). An fMRI investigation of covertly and overtly produced mono- and multisyllabic words. Brain and Language, 93, 20–31.

Simonyan, K., & Horwitz, B. (2011). Laryngeal motor cortex and control of speech in humans. The Neuroscientist, 17, 197–208.

Strijkers, K., Costa, A., & Pulvermüller, F. (2017). The cortical dynamics of speaking: Lexical and phonological knowledge simultaneously recruit the frontal and temporal cortex within 200 ms. Neuroimage, 163, 206–219.

Talairach, J., & Tournoux, P. (1988). Co-planar stereotaxic atlas of the human brain. 3-Dimensional proportional system: An approach to cerebral imaging. New York: Thieme.

Tourville, J. A., & Guenther, F. H. (2011). The DIVA model: A neural theory of speech acquisition and production. Language and Cognitive Processes, 26, 952–981.

Tourville, J. A., Reilly, K. J., & Guenther, F. H. (2008). Neural mechanisms underlying auditory feedback control of speech. Neuroimage, 39, 1429–1443.

Tremblay, P., Sato, M., & Deschamps, I. (2017). Age differences in the motor control of speech: An fMRI study of healthy aging. Human Brain Mapping, 38, 2751–2771.

Tremblay, P., & Small, S. L. (2011). On the context-dependent nature of the contribution of the ventral premotor cortex to speech perception. Neuroimage, 57, 1561–1571.

Vaden, K. I., Muftuler, L. T., & Hickok, G. (2010). Phonological repetition-suppression in bilateral superior temporal sulci. Neuroimage, 49, 1018–1023.

Wilson, S. M., Saygin, A. P., Sereno, M. I., & Iacoboni, M. (2004). Listening to speech activates motor areas involved in speech production. Nature Neuroscience, 7, 701–702.

Author notes

* These authors contributed equally to this work.