Abstract
Models of speech production posit a role for the motor system, predominantly the posterior inferior frontal gyrus, in encoding complex phonological representations for speech production, at the phonemic, syllable, and word levels [Roelofs, A. A dorsal-pathway account of aphasic language production: The WEAVER++/ARC model. Cortex, 59(Suppl. C), 33–48, 2014; Hickok, G. Computational neuroanatomy of speech production. Nature Reviews Neuroscience, 13, 135–145, 2012; Guenther, F. H. Cortical interactions underlying the production of speech sounds. Journal of Communication Disorders, 39, 350–365, 2006]. However, phonological theory posits subphonemic units of representation, namely phonological features [Chomsky, N., & Halle, M. The sound pattern of English, 1968; Jakobson, R., Fant, G., & Halle, M. Preliminaries to speech analysis. The distinctive features and their correlates. Cambridge, MA: MIT Press, 1951], that specify independent articulatory parameters of speech sounds, such as place and manner of articulation. Therefore, motor brain systems may also incorporate phonological features into speech production planning units. Here, we add support for such a role with an fMRI experiment of word sequence production using a phonemic similarity manipulation. We adapted and modified the experimental paradigm of Oppenheim and Dell [Oppenheim, G. M., & Dell, G. S. Inner speech slips exhibit lexical bias, but not the phonemic similarity effect. Cognition, 106, 528–537, 2008; Oppenheim, G. M., & Dell, G. S. Motor movement matters: The flexible abstractness of inner speech. Memory & Cognition, 38, 1147–1160, 2010]. Participants silently articulated words cued by sequential visual presentation that varied in degree of phonological feature overlap in consonant onset position: high overlap (two shared phonological features; e.g., /r/ and /l/) or low overlap (one shared phonological feature, e.g., /r/ and /b/). 
We found a significant repetition suppression effect in the left posterior inferior frontal gyrus, with increased activation for phonologically dissimilar words compared with similar words. These results suggest that phonemes, particularly phonological features, are part of the planning units of the motor speech system.
INTRODUCTION
Speech production in everyday conversation involves planning over multiple hierarchical levels of analysis, including phonemic, syllabic, morphemic/lexical, phrasal/sentential, and discourse levels (Levelt, Roelofs, & Meyer, 1999; Dell & O'Seaghdha, 1992; Garrett, 1975; Fromkin, 1971). Much of the work on the neural basis of speech production has focused on the phonemic level, aiming to identify where and how phonemes or sequences of phonemes (syllables) are represented (Dell, 2013; Peeva et al., 2010; Bohland & Guenther, 2006; Wilson, Saygin, Sereno, & Iacoboni, 2004). But according to standard linguistic accounts, phonemes are not holistic units but rather sets of phonological features (Chomsky & Halle, 1968; Jakobson, Fant, & Halle, 1951) with independent articulatory parameters such as place and manner of articulation. Phonological features are integral to phonological theory, because they define relevant groupings of speech sounds for various phonological processes. Our aim here was to use fMRI to investigate the neural representation of phonological features during speech production, with particular emphasis on determining where in the cortical hierarchy they might be found, if at all.
Previous research on phonological processes more broadly has identified a set of candidate regions where one might expect to find more fine-grained neural codes of phonological features. The broader network includes the inferior frontal gyrus, premotor cortex, (pre-)SMA, ventral motor cortex, anterior insula, and the temporal–parietal junction, including area Spt (Deschamps, Baum, & Gracco, 2015; Roelofs, 2014; Dell, 2013; Tourville & Guenther, 2011; Guenther, 2006; Indefrey & Levelt, 2004; Hickok, Buchsbaum, Humphries, & Muftuler, 2003; Buchsbaum, Hickok, & Humphries, 2001; Dronkers, 1996). Some studies have identified regions in primary sensorimotor cortex corresponding to lip, tongue, jaw, and larynx effectors using direct cortical recordings and noninvasive functional activation methods (Strijkers, Costa, & Pulvermüller, 2017; Conant, Bouchard, & Chang, 2014; Mesgarani, Cheung, Johnson, & Chang, 2014; Bouchard, Mesgarani, Johnson, & Chang, 2013; Simonyan & Horwitz, 2011; Chang et al., 2010; Pulvermüller et al., 2006), but it is not entirely clear whether these maps represent purely incidental sensory or motor activations associated with producing and perceiving the sensory effects of articulation (cf. Chang et al., 2010, who identified categorical responses in auditory cortex). Crucially, phonological features are theorized to be abstract computational units of speech, rather than isolated, continuous patterns of sensory stimulation or motor configurations. To illustrate this, in English there is a continuum of possible motor tract configurations associated with the sound /t/, each with slight variations that might produce distinct motor/somatosensory activation, yet all of these variations correspond to the same set of phonemic features. Thus, it is important to more clearly identify neural responses belonging to these more abstract computational units.
Here we approach the task of trying to identify phonological features using repetition suppression during fMRI recording (rsFMRI). Repetition suppression is the reduced BOLD response in fMRI studies due to the overlap of information across repeated stimuli (Grill-Spector, Henson, & Martin, 2006). If a particular brain region shows a repetition suppression effect for a stimulus characteristic, this is taken as evidence that this brain region encodes information about this characteristic (Barron, Garvert, & Behrens, 2016). Repetition suppression designs allow for precise targeting of particular representational levels without changing task demands. In language research, there have been several rsFMRI studies targeting speech perception (Vaden, Muftuler, & Hickok, 2010; Hasson, Skipper, Nusbaum, & Small, 2007).
In speech production, we are not aware of any explicitly designed rsFMRI experiments. However, two studies (Tremblay, Sato, & Deschamps, 2017; Bohland & Guenther, 2006) employed a design that included a kind of repetition suppression as part of a manipulation of syllable sequence complexity. They compared the production of three identical syllables (e.g., ta-ta-ta) to three different syllables (e.g., ka-ru-ti) and report similar results: effects of syllable sequence complexity in several frontal-parietal regions, including the left posterior inferior frontal gyrus (pIFG). Presumably, some of this effect is being driven by repetition suppression in the same syllable condition relative to the different syllables condition. However, it is impossible in these studies to distinguish the repetition effect from the effect of sequence complexity, that is, the difficulty involved in sequencing distinct syllables as compared with identical syllables.
All of these studies involve repetition of identical speech sounds across conditions. However, given that phonemes are theoretically composed of sets of articulatory features (Chomsky & Halle, 1968; Jakobson et al., 1951), it should be possible to observe subphonemic effects of repetition suppression in the motor system. For instance, the speech sounds /p/ and /b/, although distinct phonemes, share two features: place (bilabial) and manner (stop). Therefore, these speech sounds are more phonologically similar to each other than /p/ and /g/, which only share the manner feature (stop). This suggests that phonological features themselves, as components of higher-order phonological representations, should be susceptible to repetition effects and that repetition suppression should be seen in brain areas encoding representations that incorporate phonological features. Identifying repetition effects for phonemic features was the goal of the present rsFMRI study.
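The feature-overlap logic in the /p/, /b/, /g/ example can be made concrete with a minimal sketch. It assumes a simplified three-way coding of each consonant (voicing, place, manner); the dictionary and the function name `shared_features` are hypothetical and are not the coding scheme used to construct the stimuli.

```python
# Simplified three-way feature coding for the consonants discussed above.
# This coding is an illustrative assumption, not the stimulus-construction scheme.
FEATURES = {
    "p": {"voicing": "voiceless", "place": "bilabial", "manner": "stop"},
    "b": {"voicing": "voiced", "place": "bilabial", "manner": "stop"},
    "g": {"voicing": "voiced", "place": "velar", "manner": "stop"},
}


def shared_features(a, b):
    """Return the sorted names of the features on which two consonants agree."""
    return sorted(name for name in FEATURES[a] if FEATURES[a][name] == FEATURES[b][name])


print(shared_features("p", "b"))  # place and manner are shared
print(shared_features("p", "g"))  # only manner is shared
```

Under this coding, /p/ and /b/ overlap on two features and /p/ and /g/ on one, which mirrors the high-overlap versus low-overlap distinction used in the experiment.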
Although repetition of perceptual features affects performance (Huber, Tian, Curran, O'Reilly, & Woroch, 2008; Henson & Rugg, 2003; Schacter & Buckner, 1998) and underlies the neural repetition suppression effect (Grill-Spector et al., 2006), additional evidence from speech production indicates that feature similarity can induce speech errors. Spoonerisms and tongue twisters illustrate this, for example, she sells sea shells down by the seashore. Tongue twisters are difficult not because of the actual motor difficulty of the sounds in question but because of phonological similarity among the speech sounds. In the example above, the sounds /s/ and /ʃ/ (the “sh” sound) cause such difficulty because they have high feature overlap (sharing voice and manner features) and occupy onset position in each word. The articulatory difficulty caused by phonological similarity arises during predictive planning of the utterance, when consonants intended to occupy the same positions in different words are confused (Dell, 1995; Fromkin, 1971). Thus, the phonological similarity effect elicited by tongue twisters occurs because of interference on speech planning by speech sounds that are related in their phonological representations. The possibility of speech errors due to phonological similarity complicates potential experiments of phonological feature repetition suppression, as the propensity to produce speech errors or to correct articulation results in increased activation in motor speech regions (Tourville, Reilly, & Guenther, 2008; Bohland & Guenther, 2006; Guenther, 2006). The propensity to produce speech errors due to phonological feature repetition might cancel or even reverse effects of repetition suppression.
To demonstrate phonological feature repetition suppression but avoid inducing speech errors, we adapted the materials of Oppenheim and Dell (2008, 2010) and modified their stimulus presentation. The goal of the Oppenheim and Dell studies was to induce different amounts of speech errors through different degrees of phonological feature similarity across four-word tongue twister stimuli. Crucially, their experiment relied on the simultaneous presentation of all four words to induce interfering effects of phonological planning across words. To retain the phonological feature overlap manipulation for repetition suppression but prevent interference in phonological planning, we modified their experimental design and presented each word in isolation, asking participants to produce each word before the presentation of subsequent words. Because each word had to be produced in isolation, all stages of encoding, planning, and production were completed in isolation from the presentation of future words, preventing potential interfering effects of phonological similarity. This design allowed us to investigate which regions would exhibit repetition suppression for phonological features with little risk of speech errors due to interfering effects of phonological similarity.
We used silent articulation (rather than overt articulation) because we did not want to inadvertently obtain activation in the temporal lobe due to participants hearing their own speech. We were confident that silent articulation would adequately activate phonological representations given the results of Oppenheim and Dell (2010), who observed an equivalent phonemic similarity effect for overt and silent articulation, but no such effect for purely imagined speech without any articulation. This suggests that phonological representations are equally active during production, whether vocalized or not, as long as participants move their vocal tract effectors. In addition, we did not attempt to distinguish among different phonemic feature parameters such as voicing or place of articulation. Given that the feature contrast in the Oppenheim and Dell studies was extremely subtle, that is, 1 feature different (similar condition) vs. 2 features different (dissimilar condition), we decided to maximize statistical power and broadly investigate the neuroanatomy of phonemic features. The framework of Hickok (2012, 2014a) posits that phonemic features are mostly in the domain of the motor system; in accord with this hypothesis, we predicted that phonemic feature repetition suppression would produce effects in the motor system, particularly the pIFG and premotor cortex.
METHODS
Participants
Twenty-one participants were included in the final analyses. Twenty-five participants (17 women) aged between 18 and 40 years (mean age = 21.2 years, SD = 4 years) were recruited from the University of California, Irvine, community and received monetary compensation for their time. The volunteers were right-handed, native English speakers with normal or corrected-to-normal vision, no known history of neurological disease, and no other contraindications for MRI. Informed consent was obtained from each participant before the study in accordance with guidelines from the University of California, Irvine, institutional review board, which approved this study. Four participants were omitted from data analysis: one participant made excessive errors on the task (>47% error rate and >22% no responses), one participant failed to follow directions, and images from two participants contained streaking artifacts.
Stimuli and Task
Participants viewed tongue twister sequences presented in rapid succession, one word at a time, and were asked to silently articulate each word. Thirty-two sets of tongue twisters known to behaviorally elicit a phonemic similarity effect were used in the study (see Table 1). Each set consisted of a four-word sequence. Oppenheim and Dell (2008) created lists of words in which the onset of the words either shared two features, for example, voicing and manner of articulation (similar condition), or shared only one feature, for example, voicing only (dissimilar condition). The specific metrics of the stimuli are described elsewhere (Oppenheim & Dell, 2008).
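The structure of a stimulus set can be illustrated with a short sketch using three rows taken from Table 1. In these rows the second and third words are identical across conditions, whereas the first and fourth words differ. The function name `split_row` is hypothetical, and the check is only an observation about the rows included here.

```python
# Three (similar, dissimilar) pairs copied from Table 1.
ROWS = [
    ("log rob rock lot", "jog rob rock jot"),
    ("pub bust bun puff", "tub bust bun tough"),
    ("lean reed reef leech", "bean reed reef beech"),
]


def split_row(similar, dissimilar):
    """Return word positions shared across conditions and positions that vary."""
    sim_words, dis_words = similar.split(), dissimilar.split()
    shared = [i for i in range(len(sim_words)) if sim_words[i] == dis_words[i]]
    varied = [i for i in range(len(sim_words)) if sim_words[i] != dis_words[i]]
    return shared, varied


for similar, dissimilar in ROWS:
    shared, varied = split_row(similar, dissimilar)
    print(similar, "|", dissimilar, "-> shared:", shared, "varied:", varied)
```

Positions are zero-indexed, so the shared positions 1 and 2 correspond to the two middle words of each sequence.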
List of Tongue Twister Stimuli Used in the Experiment
Similar | Dissimilar |
---|---|
nod mod mock knob | jog mod mock job |
yore wan watt yawn | gore wan watt gone |
log rob rock lot | jog rob rock jot |
daft gab gash dam | raft gab gash ram |
zoom nab nap czar | goon nab nap gar |
nab match mat nerve | van match mat verve |
dull budge buck dove | lull budge buck love |
pub bust bun puff | tub bust bun tough |
gun bulb but gull | nun bulb but null |
pen bunk bus pair | than bunk bus there |
lust rum rug lump | just rum rug jump |
mine bikes bite mice | nine bikes bite nice |
pies tile tine pyre | shies tile tine shire |
bike wild wise bile | guide wild wise guile |
zing that then zed | ding that then dead |
weld yell yet when | meld yell yet men |
paid cage cane pace | chafe cage cane chase |
bane gave gang bait | rein gave gang rate |
name make mail nag | vane make mail vague |
sage tame take sale | beige tame take bale |
wing bib bit whip | zing bib bit zip |
six finch fill sin | chicks finch fill chin |
singe fib fit sip | hinge fib fit hip |
sing hitch hill sick | king hitch hill kick |
chip jilt gin chill | bib jilt gin bill |
size tide tight sign | five tide tight fine |
jail cheek cheap jean | kale cheek cheap keen |
zinc niece kneel zest | verb niece kneel vest |
lean reed reef leech | bean reed reef beech |
bees wheat week big | geese wheat week gig |
home sole soap hone | loam sole soap lone |
poke tote toll pose | folk tote toll foes |
nod mod mop knob | jog mod mop job |
yore wan wok yawn | gore wan wok gone |
log rob rod lot | jog rob rod jot |
daft gab gas dam | raft gab gas ram |
zoom nab knack czar | goon nab knack gar |
nab match mass nerve | van match mass verve |
dull budge but dove | lull budge but love |
pub bust bum puff | tub bust bum tough |
gun bulb buck gull | nun bulb buck null |
pen bunk buzz pair | than bunk buzz there |
lust rum rub lump | just rum rub jump |
mine bikes bide mice | nine bikes bide nice |
pies tile time pyre | shies tile time shire |
bike wild wine bile | guide wild wine guile |
zing that them zed | ding that them dead |
weld yell yep when | meld yell yep men |
paid cage came pace | chafe cage came chase |
bane gave game bait | rein gave game rate |
name make maid nag | vane make maid vague |
sage tame tape sale | beige tame tape bale |
wing bib bid whip | zing bib bid zip |
six finch fizz sin | chicks finch fizz chin |
singe fib fish sip | hinge fib fish hip |
sing hitch his sick | king hitch his kick |
chip jilt gym chill | bib jilt gym bill |
size tide type sign | five tide type fine |
jail cheek cheat jean | kale cheek cheat keen |
zinc niece need zest | verb niece need vest |
lean reed wreath leech | bean reed wreath beech |
bees wheat wheel big | geese wheat wheel gig |
home sole soak hone | loam sole soak lone |
poke tote towed pose | folk tote towed foes |
In the present experiment, words in each list were presented to participants one at a time at a rate of 400 msec per word separated by 100 msec of a blank screen. Participants were instructed to rapidly articulate each monosyllabic word as soon as it appeared on screen. After articulating each sequence, the participant indicated whether or not he or she correctly articulated the sequence using a button box. Word lists were separated by a fixation that varied in length with an SOA of 3.5–9 sec. An example of a trial is illustrated in Figure 1.
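The within-trial timing described above can be worked through in a brief sketch. It assumes that the first word appears at time zero and that each 400-msec word is followed by a 100-msec blank; the constant and function names are hypothetical.

```python
WORD_MS = 400   # duration of each word on screen
BLANK_MS = 100  # blank screen between consecutive words
N_WORDS = 4     # words per tongue twister sequence


def word_onsets_ms(n_words=N_WORDS, word_ms=WORD_MS, blank_ms=BLANK_MS):
    """Onset of each word relative to the start of the trial, in msec."""
    return [i * (word_ms + blank_ms) for i in range(n_words)]


onsets = word_onsets_ms()
last_word_offset = onsets[-1] + WORD_MS  # time at which the final word disappears

print(onsets)
print(last_word_offset)
```

Under these assumptions, a new word appears every 500 msec and the final word leaves the screen 1900 msec after trial onset.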
Example of a single trial. Words were presented in rapid succession separated by 100-msec intervals. Participants articulated the words as they streamed on screen. Each sequence was separated by a fixation that remained on screen for 3.5–9 sec.
Each of the six runs in the study comprised 64 trials. Half of the word lists were from the similar condition, and half of the word lists were from the dissimilar condition. The word lists were randomly presented to each participant. The study started with a short practice run with eight trials to familiarize participants with the task. Participants were scanned during the practice run to acclimatize them to the fMRI environment. The study ended with a high-resolution structural scan, and the entire experiment was 1 hr in length. Stimulus presentation and timing were controlled using Cogent software (www.vislab.ucl.ac.uk/cogent_2000.php) implemented in Matlab 7.1 (Mathworks, Inc.) running on a dual-core IBM Thinkpad laptop.
Imaging
MR images were obtained in a Philips Achieva 3T (Philips Medical Systems) fitted with an eight-channel radio frequency receiver head coil at the Research Imaging Center scanning facility at the University of California, Irvine. Images during the experimental runs were collected using Fast Echo EPI (sense reduction factor = 1.7, matrix = 112 × 112 mm, repetition time = 2.5 sec, echo time = 25 msec, voxel size = 2.5 mm × 2.5 mm × 2.5 mm, flip angle = 90°). A total of 840 EPIs were collected over six runs, and 40 slices provided whole-brain coverage. After the functional scans, a high-resolution T1-weighted anatomical image was acquired with an MPRAGE pulse sequence in axial plane (matrix = 256 × 256 mm, repetition time = 8 msec, echo time = 3.6 msec, flip angle = 8°, voxel size = 1 × 1 × 1 mm).
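The acquisition figures can be related to the trial structure with a small worked calculation. The sketch assumes that the 840 volumes were divided evenly across the six runs and that no volumes were discarded; neither assumption is stated in the text, and the variable names are hypothetical.

```python
TOTAL_VOLUMES = 840   # EPIs collected over all runs
N_RUNS = 6
TR_SEC = 2.5          # repetition time
TRIALS_PER_RUN = 64

# Assumption: volumes are split evenly across runs.
volumes_per_run = TOTAL_VOLUMES // N_RUNS
run_sec = volumes_per_run * TR_SEC
mean_soa_sec = run_sec / TRIALS_PER_RUN

print(volumes_per_run, run_sec, round(mean_soa_sec, 2))
```

Under these assumptions, each run lasts 350 sec, which implies a mean trial-to-trial interval of about 5.5 sec, within the reported 3.5–9 sec range.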
Data Analysis
Data preprocessing and analyses were performed using AFNI software (Cox, 1996). First, motion correction was performed by creating a mean image from all of the volumes in the experiment and then realigning all volumes to that mean image using a six-parameter rigid body model. The images were then high-pass filtered at 0.008 Hz and spatially smoothed with an isotropic 8 mm FWHM Gaussian kernel. The anatomical image for each participant was coregistered to his or her mean EPI image. Data analysis proceeded in two steps. First, regression analysis was performed at the single participant level, parameter estimates for events of interest were obtained, and then these data were transformed into standardized space for group-level analysis.
First-level analysis was performed on the time course of each voxel's BOLD response for each participant using AFNI software (Cox, 1996). Regression analysis was performed using AFNI's 3dDeconvolve function, and the regressors were created by convolving the predictor variables representing the time course of stimulus presentation with a gamma variate function. A total of nine regressors were entered in the model. The first regressor corresponded to the presentation of phonemically similar words. The second regressor corresponded to presentation of phonemically dissimilar words. The third regressor corresponded to all of the trials in which the participants reported making an error. Thus, the regressors modeling the articulation of phonemically similar and dissimilar words only include trials in which participants report accurately articulating the sequence of words. An additional six regressors representing the motion parameters determined during the realignment stage of processing were entered into the model. Parameter estimates for the events of interest were obtained, and statistical maps were created.
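As a rough illustration of how one such task regressor can be built, the sketch below convolves a boxcar of trial events with a gamma variate function and samples the result once per TR. The shape parameters (p = 8.6, q = 0.547), the 2-sec event duration, the onset times, and the function names are assumptions made for illustration; the values used in the actual analysis are not reported here, and this is not AFNI's implementation.

```python
import math

TR = 2.5  # repetition time in seconds, from the Imaging section


def gamma_variate(t, p=8.6, q=0.547):
    """Gamma variate response, scaled so that it peaks at 1 when t = p * q."""
    if t <= 0:
        return 0.0
    return (t / (p * q)) ** p * math.exp(p - t / q)


def make_regressor(onsets_sec, n_volumes, dt=0.1, duration=2.0):
    """Convolve a boxcar of events with the gamma variate, then sample once per TR."""
    n_fine = int(round(n_volumes * TR / dt))
    boxcar = [0.0] * n_fine
    for onset in onsets_sec:
        start = int(round(onset / dt))
        stop = min(n_fine, int(round((onset + duration) / dt)))
        for i in range(start, stop):
            boxcar[i] = 1.0
    # Kernel covers the first 15 sec of the response on the fine time grid.
    kernel = [gamma_variate(k * dt) for k in range(int(round(15.0 / dt)))]
    fine = [0.0] * n_fine
    for i, value in enumerate(boxcar):
        if value == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k < n_fine:
                fine[i + k] += value * h * dt
    step = int(round(TR / dt))
    return [fine[v * step] for v in range(n_volumes)]


# Hypothetical onsets (in seconds) for two trials of one condition in a 140-volume run.
regressor = make_regressor([10.0, 40.0], n_volumes=140)
print(len(regressor), round(max(regressor), 3))
```

In the full model, one such regressor would be built for each of the three event types (similar, dissimilar, and error trials) and entered alongside the six motion parameters, giving the nine regressors described above.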
For group-level analysis, these statistical maps for each participant were transformed into standardized space (Talairach & Tournoux, 1988) using a Talairach template and resampled into 2 mm³ voxels. First, a group-level t test was performed to identify regions activated during articulation across both conditions. This was done to ensure that our activation maps associated with speech articulation were consistent with previous studies. To identify brain regions showing a phonological repetition suppression effect, group-level voxel-wise t tests were performed to identify areas significantly more activated for dissimilar > similar, that is, regions showing significantly less activation for increased phonological repetition. To identify potential effects of repetition enhancement, the opposite contrast (similar > dissimilar) was performed, that is, regions showing significantly more activation for increased phonological repetition. Multiple testing correction was performed with a cluster-wise significance level calculated using 3dFWHMx, to estimate the smoothness of the noise, and 3dClustSim (Cox & Jesmanowicz, 1999), to determine the combined minimum cluster size (27 voxels) and voxel-wise p threshold (p < .001).
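The statistic behind a voxel-wise contrast such as dissimilar > similar can be sketched as a one-sample t test on per-participant differences in parameter estimates at a single voxel. The function name and the example values below are made up for illustration; this is a sketch of the statistic, not the AFNI routine used in the analysis.

```python
import math


def one_sample_t(diffs):
    """t statistic testing whether the mean of the differences is zero."""
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(variance / n)


# Made-up differences (dissimilar minus similar) for three hypothetical participants.
example_diffs = [1.0, 2.0, 3.0]
t_value = one_sample_t(example_diffs)
print(round(t_value, 4))
```

With 21 participants, the same computation would be carried out at every voxel with 20 degrees of freedom, after which the cluster-size criterion is applied to the thresholded map.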
RESULTS
Consistent with previous research, subvocal speech articulation yielded activation in a peri-sylvian network of bilateral inferior frontal cortex, superior and middle temporal gyri, and left insula. These regions are consistently reported in speech production studies generally, both covert and overt (Tremblay et al., 2017; Deschamps et al., 2015; Roelofs, 2014; McGettigan et al., 2011; Tourville & Guenther, 2011; Bohland & Guenther, 2006; Indefrey & Levelt, 2004; Hickok et al., 2003; Buchsbaum et al., 2001). Although there is some indication that subvocal speech production does not activate all areas to the same extent as overt speech in fMRI, particularly the motor and premotor cortex, the temporal lobe, the insula, and subcortical structures (Shuster & Lemieux, 2005), our experimental task activated the main speech production network identified by most researchers (Hickok, Houde, & Rong, 2011; Price, 2010; Rauschecker & Scott, 2009; Guenther, 2006). Figure 2 illustrates the group activation map, and Table 2 lists the peak Talairach coordinates of the significant clusters.
Rapid articulation. Map of regions significantly activated during rapid articulation of similar and dissimilar word lists (p < .05, corrected) compared with rest. Significant activations include the bilateral IFG, superior and middle temporal cortex, cerebellum, insula, occipital cortex, inferior parietal cortex (in the vicinity of area Spt), and dorsal and ventral precentral gyrus. n = 21.
Talairach Coordinates for Regions Significantly Activated during Speech Articulation
Region | Peak x | Peak y | Peak z | Cluster Size | T Score |
---|---|---|---|---|---|
Bilateral inferior and middle occipital gyri includes bilateral cerebellum and cuneus/precuneus | −47 | −71 | −16 | 30057 | 5.43 |
Left superior temporal gyrus includes left frontal, precentral, and postcentral gyri | −59 | 9 | −2 | 2587 | 7.36 |
Right precentral and postcentral gyri | 49 | −33 | 58 | 1167 | 3.83 |
Right middle temporal gyrus | 49 | −37 | 4 | 574 | 5.43 |
Left precentral gyrus | −45 | −3 | 56 | 204 | 3.87 |
Bilateral cingulate | −1 | 23 | 26 | 186 | 3.83 |
Right middle frontal gyrus | 41 | 39 | 24 | 145 | 3.94 |
Left middle frontal gyrus | −39 | 45 | 22 | 144 | 3.95 |
Left supramarginal and angular gyri | −33 | −51 | 36 | 127 | 4.36 |
Right inferior parietal cortex | 41 | −63 | 54 | 110 | 3.96 |
Right superior frontal gyrus | 19 | 51 | −14 | 37 | 4.02 |
Left inferior frontal gyrus | −35 | 27 | −2 | 36 | 3.96 |
Right superior frontal gyrus | 37 | 45 | 30 | 36 | 4.03 |
Reported are peak coordinates for the significant clusters.
The contrast dissimilar > similar, which indexes repetition suppression, yielded one significant cluster (k = 29, T = 3.84) in the pIFG (Figure 3). Peak activation was observed in the sulcus on the border between the pars opercularis and pars triangularis ([−53, 13, 18]). The contrast similar > dissimilar yielded no significant voxels, indicating that there were no effects of phonological repetition enhancement. Our behavioral data are consistent with these results. A dependent samples t test on the self-reported error rates revealed a significant difference between phonologically similar and dissimilar sequences (p < .05). Participants were generally accurate on this task, with an overall correct response rate of 93.4%, but they made significantly more errors on phonologically dissimilar sequences (7.3%) than on similar sequences (5.9%).
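The reported behavioral figures can be checked against one another with a short calculation. It assumes equal numbers of similar and dissimilar trials, as described in the Stimuli and Task section; the variable names are hypothetical.

```python
ERROR_DISSIMILAR = 7.3  # percent self-reported errors, dissimilar sequences
ERROR_SIMILAR = 5.9     # percent self-reported errors, similar sequences

# With equal trial counts per condition, the overall error rate is the simple mean.
overall_error = (ERROR_DISSIMILAR + ERROR_SIMILAR) / 2
overall_correct = 100 - overall_error

print(round(overall_error, 1), round(overall_correct, 1))
```

The resulting overall correct rate agrees with the 93.4% reported above.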
Repetition suppression effect. Percent signal change of the one significant cluster for the contrast of [DISSIMILAR > SIMILAR] in the left IFG (p < .05, corrected) on the border of the pars opercularis and pars triangularis. Left: sagittal view; right: axial view. n = 21.
DISCUSSION
We found a repetition suppression effect for phonological features of consonants in onset position of words during silent speech production in one brain area: the pIFG. This finding supports the notion that the pIFG represents articulatory phonological features, consistent with previous experiments finding effects for the repetition of whole phonemes/syllables in this area (Vaden et al., 2010; Hasson et al., 2007; Bohland & Guenther, 2006). Our findings point toward a role for the pIFG in coding phonological feature parameters as part of articulatory planning units, rather than responding to the repetition of whole speech sounds or syllables without incorporating features. Note that this does not imply that the planning unit in the pIFG is limited to phonological features—rather that feature information is part of the relevant planning units. This supports the hypothesis of Hickok and colleagues (Hickok, 2012, 2014b) that articulatory phonological features are a component of the speech planning system in the motor system.
One potential limitation of our study is that the repetition of phonemic features is confounded with the repetition of lower-level phonetic and/or articulatory features, that is, more similarity of phonemic representations entails more similarity at lower levels of phonetic and articulatory representation. Thus, in principle, the repetition effect we observed in pIFG may reflect these lower levels of representations. Similarly, repetition of phonemic features is also confounded with reduced motor complexity of the speech sequences. However, we can constrain our interpretation of these results by noting that previous research strongly supports a higher-level function of the IFG. For instance, Basilakos, Rorden, Bonilha, Moser, and Fridriksson (2015) showed that apraxia of speech, a low-level articulatory speech motor disorder, involves damage to the pre- and postcentral gyrus and not the IFG, whereas aphasia, that is, a higher-level language disorder, is associated with the IFG and not the pre- and postcentral gyrus. Similarly, an electrocorticography study by Flinker et al. (2015) showed that the IFG is not active during speech articulation, as would be expected for articulatory features, but is instead activated before the onset of speech, suggestive of a higher-level speech planning role at the phonemic level of representation. Consistent with this, other studies have shown that the IFG does not respond to motor complexity itself (Tremblay et al., 2017; McGettigan et al., 2011; Tremblay & Small, 2011; Bohland & Guenther, 2006). Interestingly, we did not observe any repetition suppression effects in the premotor cortex that might be expected for the repetition of these lower levels of phonetic and/or articulatory representations. It may be the case that such articulatory features are always necessarily activated for speech articulation and are therefore less susceptible to repetition effects.
Alternatively, it may be that these levels of representation were not strongly activated because participants articulated the words silently rather than aloud. In more naturalistic contexts in which participants overtly vocalize, clearer effects in these regions might be observed.
The lack of a suppression effect in the temporal lobe in this study differs from previous studies of phonological processing in the STS. As mentioned above, a previous phonological repetition suppression perception study (Vaden et al., 2010) found effects for repeated phonemes in bilateral STS in addition to the pIFG and a wider frontal–parietal network. The obvious difference between these studies is the task, listening in Vaden et al. (2010) and speaking in this study. Thus, unless phonological features were uniquely coded in the STS, an effect of feature repetition in the auditory system is not necessarily expected. Our results are also consistent with the speech production experiments that found that repetition of three identical syllables during production resulted in repetition suppression in a broad network of motor brain regions as well as postcentral gyrus and superior parietal cortex, but few or no effects in the superior or middle temporal lobe (Tremblay et al., 2017; Bohland & Guenther, 2006).
Two previous studies (Tremblay et al., 2017; Bohland & Guenther, 2006) implicitly identified phonological repetition suppression effects for whole phonemes/syllables, as a by-product of their sequence complexity manipulations, in a wide frontal motor and parietal network, whereas we found phonological feature repetition suppression effects only in the pIFG. The broader network identified in these studies likely reflects the wide range of similarity traversed, from identical to completely different syllables. Such a manipulation will modulate speech planning networks across a correspondingly broad range of levels from features to phonemes to syllables to sequences of syllables. Our study, conversely, spanned a much more restricted similarity range in that only the onset feature similarity of otherwise different syllables was varied. This likely explains our much more restricted finding. In addition, the participants in these previous studies produced speech sounds aloud; this may have induced additional planning at lower-level phonetic representations that were not activated in our study because the participants articulated silently. This is broadly consistent with the results of Oppenheim and Dell (2008), who identified distinct error patterns associated with imagined versus articulated speech, suggesting that the degree of planning at different representational levels is influenced by whether participants imagine speech, articulate silently, or articulate out loud.
It is perhaps surprising that we did not find repetition suppression in lower-level motor cortices (e.g., M1), given that phonological features correspond to fairly low-level articulatory states. But in the context of this study, where an effect would be observed only if a neural code has some memory trace for previously planned utterances, it perhaps makes sense that a higher-order region that has been implicated in sequence-level processes (Meyer, Obleser, Anwander, & Friederici, 2012; Gelfand & Bookheimer, 2003) would show sensitivity to feature similarity across distinct syllable utterances. This line of reasoning suggests that our study is not tapping into a level of planning that codes individual phonological features, but rather a higher-level plan that incorporates feature-relevant information.
Conclusions
Phonological linguistic theory, particularly the concept of abstract phoneme representations, has played an important role in guiding behavioral and neuroscience research for decades. Therefore, the question of how to incorporate phonological theory into psychological models of speech processing is an important one. The Hickok model (Hickok, 2012, 2014b) treats phonemes as an important level of representation for speech production, that is, as part of a system that maps articulatory gestures onto speech sound targets. Here we have shown that a crucial component of phonological theory, phonological features, appears to be a part of motor planning units. These results are consistent with the Hickok model and suggest that it might be possible to explore deeper connections between phonological theory and motor control research. Future research could test different sets of articulatory parameters than those tested here, including attempts to identify whether specific phonemic parameters (e.g., voicing, manner, and place) are localized in distinct cortical locations, as well as testing phonological repetition suppression effects that cross syllable onset and coda positions.
Reprint requests should be sent to Gregory Hickok, Department of Cognitive Sciences, University of California, Irvine, Irvine, CA 92697-3800, or via e-mail: greg.hickok@uci.edu.
REFERENCES
Author notes
These authors contributed equally to this work.