Abstract
In two fMRI experiments, participants named pictures with superimposed distractors that were high or low in frequency or varied in terms of age of acquisition. Pictures superimposed with low-frequency words were named more slowly than those superimposed with high-frequency words, and late-acquired words interfered with picture naming to a greater extent than early-acquired words. The distractor frequency effect (Experiment 1) was associated with increased activity in left premotor and posterior superior temporal cortices, consistent with the operation of an articulatory response buffer and verbal self-monitoring system. Conversely, the distractor age-of-acquisition effect (Experiment 2) was associated with increased activity in the left middle and posterior middle temporal cortex, consistent with the operation of lexical level processes such as lemma and phonological word form retrieval. The spatially dissociated patterns of activity across the two experiments indicate that distractor effects in picture–word interference may occur at lexical or postlexical levels of processing in speech production.
INTRODUCTION
Forty years of psycholinguistic research have demonstrated that saying a word, the most fundamental task in speaking, requires selecting from among a set of activated word candidates (see Goldrick, 2007; Dell & Sullivan, 2004). Thus, if a speaker wants to say dog, other words are activated in addition to the target word dog. To the extent that multiple lexical candidates are activated, theories of word production need to identify the nature of these candidates as well as the degree to which they interfere with target word production.
To meet this challenge, researchers have developed paradigms that permit introducing word competitors while varying their characteristics (e.g., semantic, syntactic, or phonological). A widely used paradigm has been picture–word interference (PWI; Rosinski, Golinkoff, & Kukish, 1975), in which participants name a picture of a target object in the context of a superimposed distractor word that they are instructed to ignore. An important empirical observation in PWI is semantic interference (SI), the relative slowing of naming latencies to target pictures in the context of categorically related compared with unrelated distractors, for example, picture–word pairs like dog–fox.
Informative results about the locus of distractor interference were obtained by manipulating distractor frequency. Results from several experiments converged in demonstrating greater interference for low-frequency (LF) distractors than high-frequency (HF) distractors (Catling, Dent, Johnston, & Balding, 2010; Dhooge & Hartsuiker, 2010; Miozzo & Caramazza, 2003). The distractor frequency effect was obtained even with semantically unrelated picture–distractor pairs (e.g., superimposing the HF distractor book or the LF distractor stool on the picture dog). Furthermore, when effects of distractor frequency and SI were induced in the same experiments (Miozzo & Caramazza, 2003), they did not interact and exhibited different time courses, indicating that the two do not share the same processing locus.
A general framework to explain the greater interference of LF distractors presupposes limited processing capacity, whereby word distractors are processed slightly ahead of target pictures, causing a delay in target naming proportional to the time needed to process the distractors (Miozzo & Caramazza, 2003). Because processing time is longer for LF than HF words, LF distractors would generate greater interference, as indeed observed in the distractor frequency effect. However, the locus of the processing delay resulting in the distractor frequency effect has proven more problematic to define. Two accounts have been examined, hereafter referred to as “input” and “output” accounts, respectively (Mahon, Costa, Peterson, Vargas, & Caramazza, 2007; Miozzo & Caramazza, 2003).
The “input” account locates the distractor frequency effect at the level of the (spoken or orthographic) recognition mechanisms leading to accessing word meaning. Because LF words have lower resting activation levels than HF words as is typically assumed in recognition models (e.g., McClelland & Rumelhart, 1981; Morton, 1969; but see Norris, 2006, for a critical discussion of this assumption), then LF words should be recognized more slowly and interfere comparatively more. The assumption that HF words might be processed more quickly also receives support from network modeling studies. For example, in Steyvers and Tenenbaum's (2005) model of developing semantic networks, word frequency influences the probability of connecting new nodes to existing nodes, resulting in HF words having more central, highly connected nodes that are more likely to be accessed first.
In contrast, the “output” account locates the distractor frequency effect at the level of postlexical mechanisms leading to the assembly and execution of articulatory programs (e.g., Mahon et al., 2007; Miozzo & Caramazza, 2003). Word distractors are assumed to have a privileged relationship with the articulators and enter an output buffer as phonologically well-formed responses. Furthermore, the speed at which a response enters the output buffer is assumed to be related to frequency and to influence the speed at which the distractor can be excluded according to task-relevant criteria. By entering the buffer faster, HF distractors are excluded earlier, thus incurring shorter delays in picture naming relative to LF distractors.
Discriminating between these alternative accounts has implications for explanations of PWI and word production theories alike. For example, if the “output” account were correct, distractors would potentially interact with multiple levels of processing of word production, including the level where phonological/articulatory information is computed. For example, the SI effect could result from categorically related distractors entering the buffer faster because of semantic priming, with a decision mechanism taking longer to exclude the distractor as it also satisfies some response relevant criteria (Mahon et al., 2007; Miozzo & Caramazza, 2003). Hence, according to this account, distractors potentially influence prelexical semantic processes as well as postlexical ones. On the other hand, the distractor frequency effect appears problematic for the lexical selection by competition (LSC) account that proposes the time taken to select the target word depends on the activation levels of the competing lexical nodes. Accordingly, the higher the activation level of competing words, the longer name selection takes (e.g., Levelt, Roelofs, & Meyer, 1999; Starreveld & La Heij, 1996; Harley, 1993; Roelofs, 1992; La Heij, 1988). Because HF words should be more strongly activated compared with LF words, this view incorrectly anticipates larger interference for the former than the latter, unless additional assumptions are made (e.g., distractor blocking; Roelofs, 2005).
There are results seemingly favoring the “output” account. For example, the distractor frequency effect varied as a function of distractor phonology, suggesting that this is an effect occurring at the level where phonological/articulatory information is processed in naming (Miozzo & Caramazza, 2003). A further result was obtained by Dhooge and Hartsuiker (2010; Experiment 2) by masking written word distractors. They reasoned that if the output account were correct, masking the distractor should preclude the formation of a response in the articulatory output buffer, thus eliminating the distractor frequency effect. Masking did eliminate the effect. Moreover, a direct test of the “input” account conducted by Miozzo and Caramazza (2003; Experiments 2 and 3) did not yield the results expected within this account. Although distractor interference should reduce when word recognition is facilitated (e.g., by repetition), it should increase if word recognition is made more difficult (e.g., by cAsE aLtErNaTiOn). Neither prediction was confirmed.
Each of the tests employed to examine the “input” and “output” hypotheses relies on a complex series of assumptions. Let us exemplify this point referring to case alternation. This manipulation represents a valid test of the “input” account if one assumes that (a) case alternation slows ignored/unattended distractor processing and modulates the interference effect and (b) frequency effects in word recognition and picture naming have the same locus and should therefore interact. With respect to the first assumption, case-alternated primes have no differential impact on masked repetition priming (Forster, 1998) or on conventional priming at short SOAs (Lee, Honig, & Lee, 2002), indicating a case alternation effect may only appear for attended stimuli. With respect to the second assumption, there is evidence suggesting that frequency effects in word comprehension (reading) and spoken word production are different, with frequency effects in the former modality being relatively independent of semantic processing while depending on the presence or absence of semantic constraint in the latter (Gollan et al., 2011). In brief, the complexity of assumptions underlying the tests of “input” and “output” accounts would make it desirable to acquire additional evidence to adjudicate between the alternative proposals. We addressed this issue from a novel perspective in the present investigation by characterizing the neural correlates of the distractor frequency effect.
A number of studies have adopted neuroimaging techniques to provide converging evidence for the level(s) at which distractor effects occur in speech production (e.g., Righi, Blumstein, Mertus, & Worden, 2010; Bles & Jansma, 2008; Heim, Friederici, Schiller, Rüschemeyer, & Amunts, 2008; de Zubicaray, McMahon, Eastburn, & Pringle, 2006; de Zubicaray, Wilson, McMahon, & Muthiah, 2001). This is because of the increasing realization that brain imaging data represent an additional dependent variable of relevance to spreading activation models (e.g., Goldrick, 2007; Dell & Sullivan, 2004). There is now a large literature relating brain activation data to stages of processing in models of spoken word production (e.g., Acheson, Hamidi, Binder, & Postle, 2011; Peeva et al., 2010; Schuhmann, Schiller, Goebel, & Sack, 2009; Alario, Chainay, Lehericy, & Cohen, 2006; Indefrey & Levelt, 2004). These studies have identified roles for the midsection of the left middle temporal gyrus in lexical semantic processing and the posterior section of the middle and superior temporal gyri (Wernicke's area) in phonological word form retrieval, respectively, within a predominantly left hemisphere cerebral network. During naming of depicted objects, the time course of activation in these two regions typically occurs between 150 and 300 msec following initial visual object recognition, with postlexical processes of syllabification and phonetic encoding followed by articulation occurring between 300 and 600 msec attributed to the posterior left inferior pFC (Broca's area) and premotor cortical areas, respectively (e.g., Acheson et al., 2011; Schuhmann et al., 2009; Indefrey & Levelt, 2004). Consequently, these studies provide candidate brain regions for testing the “input” and “output” accounts. Crucially, the two accounts make contrasting predictions concerning the brain regions associated with the distractor frequency effect. Although the “output” account anticipates activation related to distractor frequency in brain regions supporting phonological and articulatory processing, no such activation is anticipated by the “input” account.
A second goal of the present investigation relates to the nature of the distractor frequency effect. In previous PWI studies, distractor frequency was confounded with age of acquisition (AoA; see Brysbaert & New, 2009; Brysbaert & Ghyselinck, 2006). Frequency and AoA are highly intercorrelated, although they have also been demonstrated to exert independent effects in reading, speaking, and speech comprehension (Brysbaert & Cortese, 2011; Cortese & Khanna, 2007; Brysbaert & Ghyselinck, 2006). In addition, a number of studies have provided evidence indicating that frequency and AoA effects might have different loci across a range of tasks (e.g., Catling & Johnston, 2009; Dent, Johnston, & Humphreys, 2008). Further arguments for the distinctiveness of AoA come from accounts attributing a processing advantage for early AoA words to network plasticity: As the network develops, plasticity is reduced, resulting in less accessible representations for later acquired words (e.g., Menenti & Burani, 2007; Lambon Ralph & Ehsan, 2006; Ellis & Lambon Ralph, 2000). Consistent with the growing body of evidence suggesting the autonomy of AoA from frequency, Catling et al. (2010; Experiment 2) were able to demonstrate a frequency-independent distractor AoA effect in PWI, with late-acquired words producing greater interference.
Parallel to differential effects demonstrated behaviorally, partially distinct patterns of brain activation would possibly emerge when investigating the fMRI correlates of frequency and AoA. Behavioral data could thus help us to identify possible candidate brain regions sensitive to AoA. Belke, Brysbaert, Meyer, and Ghyselinck (2005) found an interaction between the SI effect and AoA in blocked cyclic naming, in which pictures are presented in categorically homogeneous versus mixed contexts. They interpreted this result as indicating that the AoA effect occurs at a lexical semantic (lemma) level of processing (see also Brysbaert & Ghyselinck, 2006). Of note, no interaction between SI and frequency was found in the same task (Santesteban, Costa, Pontin, & Navarrete, 2006), a contrasting pattern further suggesting the distinctiveness of frequency and AoA effects. A number of studies have identified roles for the middle and posterior portions of the left middle temporal gyrus (pMTG) in lexical semantic processing (e.g., Acheson et al., 2011; Peeva et al., 2010; Indefrey & Levelt, 2004). These regions have demonstrated increased activity in categorically related compared with unrelated distractor conditions in fMRI studies of picture naming (e.g., de Zubicaray et al., 2001, 2006). In line with the hypothesis that AoA is linked to lexical semantic processing (Belke et al., 2005), it seems reasonable to anticipate responsiveness of these regions to AoA.
The effects of distractor frequency and AoA in PWI were tested in Experiments 1 and 2, respectively. We employed sparse temporal sampling designs within two functional MRI experiments, permitting overt naming responses to be accurately recorded during scanning (e.g., Heim et al., 2008; de Zubicaray et al., 2001, 2006).
EXPERIMENT 1
The goal of the first experiment was to determine the locus of the distractor frequency effect, contrasting predictions from input and output accounts. If the latter account is correct, then we would expect to observe increased activity for LF words in brain regions associated with articulation and control mechanisms. Reliable cerebral correlates of postlexical stages of processing (syllabification, phonetic encoding, and motor articulation) encompass both left inferior frontal gyrus (IFG) and premotor cortices (e.g., Peeva et al., 2010; Eickhoff, Heim, Zilles, & Amunts, 2009; Schuhmann et al., 2009; Tremblay & Gracco, 2009; Alario et al., 2006; Indefrey & Levelt, 2004). Within the premotor cortex, Alario et al. (2006) identified a rostro-caudal gradient in the SMA corresponding to postlexical selection, phonetic encoding, and articulatory processes during spoken word production (see also Peeva et al., 2010). Tremblay and Gracco (2009) likewise recently identified pre-SMA as playing a central role in response selection of spoken words. However, as Dhooge and Hartsuiker (2010) noted, the output account does not specify the nature of the control mechanism hypothesized to operate on the articulatory buffer. They tentatively proposed that the verbal self-monitoring system might be responsible for this function, documenting a range of experimental findings in support. In their meta-analysis, Indefrey and Levelt (2004) ascribed the monitoring of both internal and external speech to bilateral posterior superior temporal gyri (pSTG), a finding supported by more recent studies (see Price, 2010; e.g., Zheng, Munhall, & Johnsrude, 2010). Thus, it seems reasonable to assume that the premotor cortex (especially SMA), IFG, and pSTG would show increased activity if the output account were correct.
As noted in the Introduction, the input account predicts increased activity in the middle and posterior portions of the left middle temporal gyrus and left posterior middle and superior temporal gyri (pMTG/pSTG) as they have been implicated in lexical processing generally (lemma selection and phonological word form retrieval, respectively; e.g., Acheson et al., 2011; Peeva et al., 2010; Indefrey & Levelt, 2004) and during SI in PWI specifically (e.g., de Zubicaray et al., 2001, 2006).
Methods
Participants
Seventeen healthy volunteers (10 women) with a mean age of 22 years (SD = 3.5 years) performed the experiment. All were undergraduate students of the University of Queensland. All were right-handed and native English speakers, with no history of neurological or psychiatric disorder, substance dependence, or known hearing deficits. All had normal or corrected-to-normal vision and gave informed consent in accordance with the protocol approved by the Medical Research Ethics Committee of the University of Queensland. They were reimbursed AUD$30 for participating.
Materials
The materials were identical to those used by Catling et al. (2010; Experiment 1). Forty-eight black-and-white line drawings were selected from Snodgrass and Vanderwart (1980). These were split evenly into early and late-acquired picture sets that were matched on a range of linguistic variables following Barry, Hirsh, Johnston, and Williams (2001). HF and LF distractors were matched on a range of linguistic variables including AoA (for information about the matching variables, see the appendix in Catling et al., 2010). Each target picture was paired with an HF and an LF word that did not share a semantic or phonological relationship with it. Target pictures were also presented without distractor words in a neutral condition to examine a potentially independent effect of target picture AoA and to determine the direction of distractor related activity in the fMRI experiment.
A laptop PC running Microsoft VisualBasic and ExacTicks (Ryle Design, Mt. Pleasant, Michigan) software was used to show the picture and word stimuli and record vocal responses on digital audio files (sampling rate, 11 kHz). Line drawings were presented in black on a luminous white background, and the visual distractor words were shown in black lowercase Times New Roman 18-point bold font in the center of each picture. Stimuli were back-projected using a BenQ SL705X projector onto a screen that participants viewed through a mirror mounted on the head coil, and subtended approximately 10° of visual angle when each participant was positioned for imaging. A 30 dB attenuating headset was used to reduce gradient noise. Naming responses were recorded on digital audio files using a custom-positioned fiber-optic dual-channel noise-cancelling microphone attached to the head coil (FOMRI-III, Optoacoustics Ltd., Or-Yehuda, Israel; www.optoacoustics.com). Naming latencies were determined automatically with voice key software custom written in Microsoft VisualBasic and verified manually using Audacity software (audacity.sourceforge.net) in case nonvocal noise triggered the voice key.
Procedure
A PWI paradigm was employed. Participants were first familiarized with the set of experimental pictures with the appropriate label printed below. The size of the pictures, including background, was approximately 10 cm wide by 10 cm high. Over two consecutive practice blocks they were instructed to name the pictures as fast and as accurately as possible. Erroneous naming responses were corrected. In a final block, they viewed the pictures without labels and were instructed to name the pictures per the instructions above.
Two experimental blocks, each comprising 72 trials presented in pseudorandom order, were then conducted (48 target pictures presented in three conditions: neutral/no distractor, HF distractor, and LF distractor). A short break was permitted between the two blocks while a structural image was acquired (see fMRI Acquisition below). Trial presentation was pseudorandomized across participants using Mix software (van Casteren & Davis, 2006), such that two presentations of the same picture were always separated by at least five different pictures, and trials from a given condition were presented no more than twice in succession. Participants were instructed to name the pictures as quickly and accurately as possible while ignoring the distractor word. They were also instructed not to speak or move during image acquisition and, in the event of a naming error, not to correct their response. Trial presentation involved the following sequence: A fixation point (+) was shown for 500 msec, followed by the presentation of the superimposed target and distractor for 750 msec. Intertrial interval was 15 sec.
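The two ordering constraints can be made concrete with a short sketch. The code below is an illustration only and is not the algorithm implemented in Mix: it assumes a simple greedy construction with restarts, and the function names (`satisfies_constraints`, `build_order`), the integer picture identifiers, and the seed are hypothetical.

```python
import random

CONDITIONS = ("neutral", "HF", "LF")


def satisfies_constraints(order, min_gap=5, max_run=2):
    """Return True if `order` (a list of (picture, condition) pairs) meets both constraints."""
    for i, (picture, condition) in enumerate(order):
        # A picture may not reappear among the next `min_gap` trials.
        if any(p == picture for p, _ in order[i + 1:i + 1 + min_gap]):
            return False
        # No run of more than `max_run` consecutive trials from one condition.
        run = order[i:i + max_run + 1]
        if len(run) == max_run + 1 and all(c == condition for _, c in run):
            return False
    return True


def build_order(n_pictures=48, min_gap=5, max_run=2, seed=0, max_attempts=1000):
    """Greedy construction with restarts (an assumption; NOT the Mix algorithm)."""
    rng = random.Random(seed)
    all_trials = [(p, c) for p in range(n_pictures) for c in CONDITIONS]
    for _ in range(max_attempts):
        remaining, order = list(all_trials), []
        while remaining:
            recent_pictures = {p for p, _ in order[-min_gap:]}
            last_conditions = {c for _, c in order[-max_run:]}
            # A condition is blocked once it has filled the last `max_run` slots.
            blocked = (next(iter(last_conditions))
                       if len(order) >= max_run and len(last_conditions) == 1
                       else None)
            candidates = [t for t in remaining
                          if t[0] not in recent_pictures and t[1] != blocked]
            if not candidates:
                break  # dead end: discard this attempt and restart
            choice = rng.choice(candidates)
            remaining.remove(choice)
            order.append(choice)
        if not remaining:
            return order
    raise RuntimeError("no valid order found; raise max_attempts or relax the constraints")


order = build_order(seed=1)
block_1, block_2 = order[:72], order[72:]  # two blocks of 72 trials
```

Here the constraints are enforced over the full 144-trial sequence before it is split into blocks; whether the original randomization was applied within or across blocks is not stated, so this choice is also an assumption.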
fMRI Acquisition
Scanning was performed using a Bruker Medspec (Erlangen, Germany) 4T system equipped with a transverse electromagnetic head coil for radiofrequency transmission and reception (Vaughan et al., 2002). A point-spread function mapping sequence was first acquired to correct geometric distortions in the functional images (Zaitsev, Hennig, & Speck, 2003). Functional images depicting BOLD contrast were then acquired with a gradient-echo EPI sequence optimized for both image quality and noise reduction (64 × 64 matrix, 36 axial slices, 3.5 mm in-plane resolution, slice thickness = 3.5 mm, effective repetition time [TR] = 15 sec; echo time = 30 msec; flip angle = 90°; McMahon, Pringle, Eastburn, & Maillet, 2004). Two blocks of 73 image volumes were acquired using a sparse temporal sampling sequence to capture the estimated peak BOLD signal response to task-related neural activity (Eden, Joseph, Brown, Brown, & Zeffiro, 1999; Elliott, Bowtell, & Morris, 1999). For each trial, no field gradients were applied for a 4-sec period of relative silence, allowing for stimulus presentation and the participant's overt verbal response. A single image volume was then acquired within 3 sec, approximately coincident with the trial's estimated peak BOLD response. No field gradients were applied for an additional 8-sec period to allow the BOLD response to the gradient noise to return to baseline (for a diagram of the imaging protocol, see Figure 1 in de Zubicaray et al., 2001). Head movement was limited by foam padding within the head coil. A 3-D T1-weighted structural image was acquired between the two functional imaging runs using a magnetization prepared rapid acquisition gradient-echo sequence (MP-RAGE; 256³ matrix; 0.9 mm³ voxels). Total imaging time was approximately 50 min.
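The arithmetic of the sparse sampling protocol can be checked with a few lines of code. The durations are those reported above; the one added assumption, flagged in the comments, is that the fixation point appears at the start of the silent period, which places the acquisition window relative to picture onset.

```python
SILENT_PRE_S = 4.0     # silent period: stimulus presentation and overt response
ACQUISITION_S = 3.0    # acquisition of a single image volume
SILENT_POST_S = 8.0    # silent period: recovery from gradient noise
VOLUMES_PER_BLOCK = 73
N_BLOCKS = 2
FIXATION_S = 0.5       # fixation point duration

# One trial spans the three periods and equals the effective TR.
trial_s = SILENT_PRE_S + ACQUISITION_S + SILENT_POST_S

# Functional imaging time across both blocks, in minutes.
functional_min = N_BLOCKS * VOLUMES_PER_BLOCK * trial_s / 60

# ASSUMPTION: fixation starts at the beginning of the silent period, so the
# picture appears 0.5 s into the trial and the volume is acquired from
# acq_start_s to acq_end_s after picture onset.
acq_start_s = SILENT_PRE_S - FIXATION_S
acq_end_s = acq_start_s + ACQUISITION_S

print(trial_s)                  # 15.0
print(functional_min)           # 36.5
print(acq_start_s, acq_end_s)   # 3.5 6.5
```

The two functional runs therefore account for 36.5 min of the approximately 50 min session, with the remainder available for the point-spread function mapping and structural acquisitions.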
fMRI Data Preprocessing and Analysis
The fMRI data were preprocessed and analyzed using statistical parametric mapping software (SPM8; Wellcome Department of Imaging Neuroscience, Queen Square, London, U.K.). The first volume in each fMRI block was discarded, and the remaining images were realigned to the first image of the first block using the INRIAlign toolbox (Freire, Roche, & Mangin, 2002). A mean image was generated from the realigned series and coregistered to the T1-weighted image. The T1-weighted image was next segmented using the “New Segment” procedure. The “DARTEL” toolbox (Ashburner, 2007) was then employed to create a custom group template from the segmented gray and white matter images, and individual flow fields were used to normalize the realigned fMRI volumes to the Montreal Neurological Institute (MNI) atlas T1 template. The images were resampled to 3 mm³ voxels and smoothed with a 9-mm FWHM isotropic Gaussian kernel. Global signal effects were then estimated and removed using a voxel level linear model (Macey, Macey, Kumar, & Harper, 2004).
We conducted a two-stage, mixed effects model statistical analysis. Event types corresponding to distractor and error (see Behavioral Results below) conditions were modeled as effects of interest with delta functions representing each onset and convolved with a basis function consisting of a single finite impulse response with a window length corresponding to the TR. As the sparse image sequence does not acquire BOLD time course information, trials were not convolved with a conventional hemodynamic response function (see Gracco, Tremblay, & Pike, 2005; Eden et al., 1999; Elliott et al., 1999). High- and low-pass filtering were not applied, because of the long TRs involved and the use of detrending (Macey et al., 2004). Linear contrasts were applied to each participant's parameter estimates at the fixed effects level, then entered in a group level repeated measures ANOVA, in which covariance components were estimated using a restricted maximum likelihood procedure to correct for nonsphericity (Friston et al., 2002). Regions with significant main effects and/or interactions were investigated with planned t contrasts.
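To illustrate why no hemodynamic response function is needed here, consider a minimal sketch of the first-level model under simplifying assumptions: one volume per trial, one indicator regressor per condition (a single finite impulse response bin spanning the TR), and no error regressor or nuisance terms. In that case each parameter estimate reduces to the mean signal of the volumes belonging to that condition. This is not the SPM8 implementation; the function names and voxel values are hypothetical.

```python
def fir_design(trial_conditions, condition_names):
    """One indicator column per condition; one row per acquired volume (trial)."""
    return [[1.0 if cond == name else 0.0 for name in condition_names]
            for cond in trial_conditions]


def fit_indicator_model(design, signal):
    """Least-squares estimates for non-overlapping indicator columns.

    Because no two columns are ever 1 on the same row, X'X is diagonal and
    each estimate is the mean signal of that condition's volumes.
    """
    betas = []
    for j in range(len(design[0])):
        xty = sum(row[j] * y for row, y in zip(design, signal))
        xtx = sum(row[j] ** 2 for row in design)
        betas.append(xty / xtx)
    return betas


def contrast(betas, weights):
    """Linear contrast of parameter estimates."""
    return sum(b * w for b, w in zip(betas, weights))


names = ["LF", "HF", "neutral"]
trial_conditions = ["LF", "HF", "neutral", "LF", "HF", "neutral"]
signal = [102.0, 100.0, 99.0, 104.0, 101.0, 100.0]  # made-up values for one voxel

betas = fit_indicator_model(fir_design(trial_conditions, names), signal)
lf_minus_hf = contrast(betas, [1, -1, 0])
print(betas)        # [103.0, 100.5, 99.5]
print(lf_minus_hf)  # 2.5
```

The contrast weights [1, -1, 0] correspond to an LF > HF comparison of the kind carried forward to the group level.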
A priori ROIs (Figure 1) were defined using labeled maximum likelihood gray matter maps from 3-D probabilistic atlases (Eickhoff et al., 2005; Hammers et al., 2003). These were left mid-MTG and posterior temporal cortex (Hammers et al., 2003) and left BA 44/45 (Broca's area) and premotor cortex (BA 6; Eickhoff et al., 2005). The latter cytoarchitectonically defined region encompassed the stereotactic MNI coordinates reported by both Tremblay and Gracco (2009) and Alario et al. (2006) for their SMA regions involved in postlexical selection, as well as the ventrolateral premotor area implicated in articulation (Peeva et al., 2010; Indefrey & Levelt, 2004). A height threshold of p < .005 was adopted in conjunction with a cluster threshold of p < .05 estimated for the whole brain (54 contiguous voxels) and for each ROI volume using a Monte Carlo estimation procedure with 10,000 simulations (alphasim, implemented in the Analysis of Functional NeuroImages toolkit, AFNI; National Institute of Mental Health, Bethesda, MD). The height threshold of p < .005 was adopted instead of the more commonly used p < .001 to aid identification of possible overlap in activated regions across experiments.
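The logic of a Monte Carlo cluster-extent threshold can be sketched as follows. This is a toy illustration and not AFNI's implementation: it assumes a small cubic grid rather than a brain mask, crude nearest-neighbour averaging rather than a 9-mm Gaussian kernel, face connectivity, and a one-sided height threshold. Its output is therefore not comparable with the 54-voxel threshold reported above, and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

OFFSETS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def neighbours(voxel, n):
    """Yield the in-bounds face-connected neighbours of `voxel` in an n x n x n grid."""
    x, y, z = voxel
    for dx, dy, dz in OFFSETS:
        u = (x + dx, y + dy, z + dz)
        if all(0 <= c < n for c in u):
            yield u


def max_cluster_size(n, z_thresh, rng):
    """Largest suprathreshold cluster in one volume of smoothed Gaussian noise."""
    noise = {(x, y, z): rng.gauss(0.0, 1.0)
             for x in range(n) for y in range(n) for z in range(n)}
    # Crude smoothing (assumption): average each voxel with its face neighbours.
    smoothed = {}
    for v, value in noise.items():
        values = [value] + [noise[u] for u in neighbours(v, n)]
        smoothed[v] = sum(values) / len(values)
    # Re-standardize so the height threshold can be applied as a z value.
    mu = sum(smoothed.values()) / len(smoothed)
    sd = math.sqrt(sum((s - mu) ** 2 for s in smoothed.values()) / len(smoothed))
    supra = {v for v, s in smoothed.items() if (s - mu) / sd > z_thresh}
    largest = 0
    while supra:  # flood-fill one connected cluster at a time
        stack, size = [supra.pop()], 0
        while stack:
            v = stack.pop()
            size += 1
            for u in neighbours(v, n):
                if u in supra:
                    supra.remove(u)
                    stack.append(u)
        largest = max(largest, size)
    return largest


def cluster_threshold(n=10, height_p=0.005, alpha=0.05, n_sims=200, seed=0):
    """Smallest cluster size reached by chance in fewer than `alpha` of simulations."""
    rng = random.Random(seed)
    z_thresh = NormalDist().inv_cdf(1.0 - height_p)  # one-sided height threshold
    maxima = [max_cluster_size(n, z_thresh, rng) for _ in range(n_sims)]
    k = 1
    while sum(m >= k for m in maxima) / n_sims >= alpha:
        k += 1
    return k
```

Calling `cluster_threshold()` returns the cluster size, in voxels, that noise alone reaches or exceeds in fewer than 5% of the simulated volumes; clusters of at least that size in the real data would be declared significant at the corrected level.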
Results
Behavioral Data
Trials scored as errors included incorrect or omitted naming responses and dysfluencies (e.g., stuttering). These were excluded from analysis (5.2%). Additional trials were excluded in which naming onset RTs were <300 or >3000 msec (1.2%). Mean naming RTs as a function of distractor condition are given in Table 1. Repeated measures ANOVAs were conducted for RTs with distractor frequency and target picture AoA conditions with F1 treating participants as a random factor and F2 treating items as a random factor. Because of the low error rates, these were not subjected to analysis.
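The exclusion criteria and the aggregation behind the F1 and F2 analyses can be sketched as below. The trial records, field names, and function names are hypothetical; only the 300 and 3000 msec cutoffs and the by-participant versus by-item distinction come from the text.

```python
from collections import defaultdict


def trim_trials(trials, rt_min=300, rt_max=3000):
    """Drop error trials and trials with RTs below rt_min or above rt_max (msec)."""
    return [t for t in trials
            if not t["error"] and rt_min <= t["rt"] <= rt_max]


def cell_means(trials, unit):
    """Mean RT per (unit, condition); unit is 'participant' for F1, 'item' for F2."""
    sums = defaultdict(lambda: [0.0, 0])
    for t in trials:
        cell = sums[(t[unit], t["condition"])]
        cell[0] += t["rt"]
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Made-up trial records for illustration only.
trials = [
    {"participant": 1, "item": "dog", "condition": "LF", "rt": 1200, "error": False},
    {"participant": 1, "item": "dog", "condition": "HF", "rt": 250, "error": False},   # too fast
    {"participant": 2, "item": "dog", "condition": "LF", "rt": 1000, "error": False},
    {"participant": 2, "item": "cat", "condition": "LF", "rt": 3500, "error": False},  # too slow
    {"participant": 2, "item": "cat", "condition": "HF", "rt": 1100, "error": True},   # naming error
    {"participant": 1, "item": "cat", "condition": "LF", "rt": 1300, "error": False},
]

kept = trim_trials(trials)
by_participant = cell_means(kept, "participant")  # input to the F1 analysis
by_item = cell_means(kept, "item")                # input to the F2 analysis
print(by_participant)  # {(1, 'LF'): 1250.0, (2, 'LF'): 1000.0}
print(by_item)         # {('dog', 'LF'): 1100.0, ('cat', 'LF'): 1300.0}
```

The same trimmed trials feed both analyses; only the unit over which RTs are averaged changes.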
Pictures | Low-Frequency Distractor | High-Frequency Distractor | Neutral (No Word Distractor) |
---|---|---|---|
Early | 1202 (356) | 1175 (333) | 1087 (323) |
Late | 1215 (336) | 1181 (357) | 1142 (336) |
Mean | 1209 (345) | 1178 (343) | 1114 (328) |
Standard deviations are in parentheses.
There were significant main effects of target picture AoA [F1(1, 16) = 6.92, MSE = 2283.15, p < .05, η2 = 0.30, and F2(1, 23) = 10.80, MSE = 5612.87, p < .005, η2 = 0.32] and distractor frequency [F1(2, 32) = 28.87, MSE = 2727.18, p < .001, η2 = 0.64, and F2(2, 46) = 37.04, MSE = 3215.37, p < .001, η2 = 0.62] and no significant interaction (both Fs < 3). Paired samples t tests showed a significant effect of distractor frequency [t1(16) = 2.8, p < .05 and t2(23) = 2.4, p < .05]. In summary, the results replicated those of Catling et al. (2010; Experiment 1) with comparable RTs. Pictures with late-acquired names were named more slowly than those with early-acquired names, and the distractor frequency effect was confirmed; pictures with LF distractors were named more slowly than those with HF words.
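As a consistency check on the reported statistics, the effect sizes can be recovered from the F values and degrees of freedom. The sketch below assumes that the values labeled η2 are partial eta squared, which is an inference from the numbers rather than something stated in the text; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) as reported in the text.
reported = [
    ("picture AoA, F1", 6.92, 1, 16),
    ("picture AoA, F2", 10.80, 1, 23),
    ("distractor frequency, F1", 28.87, 2, 32),
    ("distractor frequency, F2", 37.04, 2, 46),
]

effect_sizes = [round(partial_eta_squared(f, df1, df2), 2)
                for _, f, df1, df2 in reported]
print(effect_sizes)  # [0.3, 0.32, 0.64, 0.62]
```

All four recovered values match the reported effect sizes to two decimal places, which supports the reading of η2 as partial eta squared.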
fMRI Data
Data from a single participant were excluded from the fMRI analyses because of excessive head movement during image acquisition, defined as exceeding one voxel (3 mm) within an imaging run. Group-averaged motion and rotation parameters from the remaining 16 participants were less than 1 mm and 1°, respectively, consistent with data reported for sparse fMRI acquisitions in the literature (e.g., Gracco et al., 2005). A repeated measures ANOVA failed to reveal any brain region demonstrating a significant main effect of Target picture AoA or significant interaction between Target picture AoA and Distractor frequency, either at the whole-brain level or in any of the a priori defined ROIs, although it did reveal a significant main effect of Distractor frequency. Therefore, the data were collapsed across Target picture AoA for subsequent analyses. A repeated measures ANOVA on the collapsed data (HF, LF, and neutral conditions) revealed clusters showing a significant main effect. These were subjected to post hoc planned t contrasts.
A planned contrast of LF > HF distractors revealed significant activity at the whole-brain level in three large clusters encompassing medial and lateral premotor, primary sensorimotor, and caudal anterior cingulate cortices bilaterally; left pSTG/pMTG; and right supramarginal gyrus/pSTG. ROI analyses revealed significant activity in the left medial and lateral premotor cortex and the left pSTG/pMTG (Table 2 and Figure 2). No other ROIs showed significant activity. In addition, no significant activity was observed for the reverse contrast (HF > LF distractors) at the whole-brain level or in the ROIs, nor did contrasts of LF and HF distractor conditions versus the neutral condition reveal any significant activity.
Region | Peak MNI (x y z) | Z Score | Cluster Size (Voxels) |
---|---|---|---|
LF > HF Words (Experiment 1) | | | |
Bilateral premotor, primary sensorimotor and caudal anterior cingulate corticesᵃ | 3 12 51 | 4.64 | 1909 |
Left pSTG/pMTGᵃ | −48 −42 3 | 3.13 | 88 |
Right supramarginal gyrus/pSTGᵃ | 57 −42 24 | 4.0 | 182 |
Left medial preSMA/SMAᵇ | −3 9 51 | 4.6 | 317 |
Left lateral premotor cortexᵇ | −45 −12 51 | 3.12 | 28 |
Left posterior temporal cortexᵇ | 48 −42 3 | 3.13 | 55 |
Late AoA > Early AoA Words (Experiment 2) | | | |
Left mid-MTG and pSTGᵃ | −51 −24 3 | 4.4 | 107 |
Left inferior parietal lobule/angular gyrusᵃ | −42 −57 21 | 3.61 | 150 |
Bilateral anterior cingulateᵃ | 3 33 30 | 3.68 | 74 |
Left posterior temporal cortexᵇ | −42 −33 12 | 3.29 | 33 |
Left middle temporal cortexᵇ | −42 −57 18 | 3.27 | 20 |
 | −54 −27 −6 | 3.27 | 16 |
Late AoA Words > Neutral (Experiment 2) | | | |
Right mid-STG | 42 −9 −3 | 4.0 | 56 |
Inferior parietal lobule and pSTG | 60 −18 18 | 3.42 | 61 |
Left mid-MTG and pMTG/STG | −51 −30 12 | 3.6 | 67 |
Left posterior temporal cortexᵇ | −51 −33 12 | 3.51 | 52 |
Neutral > Late AoA Words (Experiment 2) | | | |
Left parahippocampal gyrus | −12 −39 −6 | 3.6 | 86 |
Height threshold p < .005 and p < .05, cluster corrected.
ᵃ Whole-brain corrected.
ᵇ ROI corrected.
Discussion
The distractor frequency effect was replicated (Catling et al., 2010, Experiment 1; Dhooge & Hartsuiker, 2010, Experiment 1), confirming that the effect is robust. The fMRI results may be interpreted as being consistent with an output account (Mahon et al., 2007). Significantly increased activity was observed in the medial and lateral premotor cortex for LF versus HF distractor conditions, with a peak corresponding to the boundary between the pre-SMA and SMA. These regions have been associated with postlexical selection and articulation by various researchers (Peeva et al., 2010; Tremblay & Gracco, 2009; Alario et al., 2006; Indefrey & Levelt, 2004). The absence of activity in the IFG may be consistent with the notion that responses that are already syllabified and phonetically encoded enter the articulatory buffer (Mahon et al., 2007), as activity in this region precedes that in premotor and motor cortices during speech production (Eickhoff et al., 2009). The increase in activity in the left pSTG supports Dhooge and Hartsuiker's (2010) proposal that the verbal self-monitoring system may be responsible for checking the contents of the output buffer and initiating the removal of inappropriate responses. The absence of activity in the left mid-MTG indicates that, contrary to the predictions of the input account, the activation levels of nontarget lexical (lemma) nodes are unlikely to be responsible for the effect (e.g., Acheson et al., 2011; Indefrey & Levelt, 2004; see Miozzo & Caramazza, 2003).
EXPERIMENT 2
The second experiment aimed to investigate whether the effects of distractor frequency and AoA involve common or distinct brain regions. According to the output account, the activation levels of nontarget lexical nodes do not affect selection (Mahon et al., 2007; also Miozzo & Caramazza, 2003). As neither early nor late-acquired words unrelated to the target picture are relevant to the response required by the task, this means the speed at which they enter the buffer will be primarily responsible for the effect. As early-acquired words are accessed first, they enter the buffer first and are first to be excluded (e.g., Lambon Ralph & Ehsan, 2006; Steyvers & Tenenbaum, 2005). Hence, it seems reasonable to presume that the output account predicts increased activity for the late AoA distractor condition in the same articulatory motor and verbal self-monitoring regions (following Dhooge & Hartsuiker, 2010) predicted for the distractor frequency effect in Experiment 1, namely, premotor cortex (especially SMA), IFG, and pSTG. In addition, given the prior evidence afforded by Experiment 1, we would expect similar patterns of activity to be involved. The candidate regions for the input account are essentially the same as those proposed for the distractor frequency effect in Experiment 1, namely, left middle and posterior temporal cortex, if the effect has a lexical semantic locus as proposed by Belke et al. (2005).
Methods
Participants
Seventeen healthy volunteers (11 women) with a mean age of 22 years (SD = 4.75 years) performed the experiment. None had participated in Experiment 1, and all met the criteria described in the Participants section of Experiment 1.
Materials
Materials were identical to those in Catling et al. (2010; Experiment 2). The same pictures as in Experiment 1 were employed, but with early and late AoA distractors (again matched on a range of linguistic variables including frequency estimates from various corpora; see Catling et al., 2010; Appendix). Each target picture was paired with an early AoA and a late AoA word, neither of which shared a semantic or phonological relationship with it. Target pictures were again presented without distractor words in a neutral condition to examine any effect of target picture AoA and to determine the direction of distractor-related activity in the fMRI experiment.
Procedure
The procedure was identical to that in Experiment 1.
fMRI Acquisition, Data Preprocessing, and Analysis
These were identical to those in Experiment 1.
Results
Behavioral Data
Scoring criteria were identical to Experiment 1. Error trials were excluded from analysis (1.8%), as were RT outliers (0.1%). Mean naming RTs as a function of distractor condition are given in Table 3. Repeated measures ANOVAs were conducted on RTs with Distractor AoA and Target picture AoA as factors, with F1 treating participants as a random factor and F2 treating items as a random factor. Because error rates were low, errors were not analyzed.
Pictures | Early AoA Distractor | Late AoA Distractor | Neutral (No Word Distractors) |
---|---|---|---|
Early | 1166 (261) | 1225 (279) | 1093 (248) |
Late | 1196 (313) | 1239 (238) | 1138 (268) |
Mean | 1180 (251) | 1231 (258) | 1116 (256) |
Standard deviations in parentheses.
These revealed significant main effects of Target picture AoA [F1(1, 16) = 11.22, MSE = 1937.54, p < .005, η² = 0.41, and F2(1, 23) = 9.65, MSE = 9837.25, p < .01, η² = 0.29] and Distractor AoA [F1(2, 32) = 58.64, MSE = 1964.64, p < .001, η² = 0.86, and F2(2, 46) = 47.09, MSE = 2853.46, p < .001, η² = 0.67] and no significant interaction (both Fs < 1). Paired samples t tests comparing late and early AoA distractors showed a significant effect of Distractor AoA [t1(16) = 5.0, p < .001 and t2(23) = 3.0, p < .01]. In summary, the results replicated those of Catling et al. (2010; Experiment 2) with comparable RTs. Pictures with late-acquired names were named more slowly than those with early-acquired names, and the independent distractor AoA effect was confirmed: pictures with late-acquired distractors were named more slowly than those with early-acquired distractors.
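The by-participant (F1, t1) and by-item (F2, t2) analyses differ only in the unit over which trial RTs are averaged before testing. The following is a minimal sketch of that aggregation step, not the authors' analysis code: the trial records, participant labels, and item names are invented for illustration, and the helper functions `condition_means` and `late_minus_early` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trial records: (participant, item, distractor condition, RT in ms).
# These values are invented for illustration; they are not the study's data.
trials = [
    ("p1", "dog", "early", 1150), ("p1", "dog", "late", 1210),
    ("p1", "cup", "early", 1190), ("p1", "cup", "late", 1230),
    ("p2", "dog", "early", 1100), ("p2", "dog", "late", 1180),
    ("p2", "cup", "early", 1160), ("p2", "cup", "late", 1200),
]

def condition_means(trials, unit_index):
    """Average RTs per (unit, condition); unit_index 0 = participant, 1 = item."""
    cells = defaultdict(list)
    for trial in trials:
        cells[(trial[unit_index], trial[2])].append(trial[3])
    return {key: mean(rts) for key, rts in cells.items()}

def late_minus_early(means):
    """Late-minus-early RT difference for every unit present in the means."""
    units = sorted({unit for unit, _ in means})
    return {u: means[(u, "late")] - means[(u, "early")] for u in units}

by_participant = late_minus_early(condition_means(trials, 0))  # input to F1/t1
by_item = late_minus_early(condition_means(trials, 1))         # input to F2/t2
print(by_participant, by_item)
```

Applied to the marginal means reported in Table 3, the same late-minus-early difference is 1231 − 1180 = 51 ms.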
fMRI Data
Data from two participants were excluded from the fMRI analyses because of excessive head movement based on identical criteria to Experiment 1. Group-averaged motion and rotation parameters from the remaining 15 participants were again less than 1 mm and 1°, respectively. A repeated measures ANOVA failed to reveal any brain region demonstrating a significant main effect of Target picture AoA or significant interaction between Target picture AoA and Distractor AoA either at the whole-brain level or in any of the a priori defined ROIs, although it did reveal a significant main effect of Distractor AoA. Therefore, the data were collapsed across target picture AoA for subsequent analyses. A repeated measures ANOVA on the collapsed data revealed clusters showing a significant main effect of Distractor (early, late, and neutral). These were subjected to post hoc planned t contrasts.
A planned contrast of Late > Early AoA distractors revealed significant activity at the whole-brain level in three clusters encompassing the left mid-MTG and pSTG, left inferior parietal lobule/angular gyrus and bilateral anterior cingulate (Figure 3). ROI analyses revealed significant activity in two regions of left pSTG/MTG and left mid-MTG (Table 2). No other ROIs showed significant activity. No significant activity was observed for the reverse contrast (Early > Late AoA distractors) at the whole-brain level or in the ROIs.
In addition, significant activity was observed at the whole-brain level for the contrast of the Late AoA distractor > Neutral conditions in the right mid-STG, inferior parietal lobule and pSTG and left mid-MTG and pMTG/STG (Figure 2). ROI analyses also revealed significant activity in the left pMTG/pSTG. No significant activity was detected in the remaining ROIs. The reverse contrast (Neutral > Late AoA) revealed a single cluster in the left parahippocampal gyrus significant at the whole-brain level (Table 2). No significant activity was observed in any of the ROI analyses nor was any significant activity observed for the early AoA distractor versus neutral contrasts either at the whole-brain level or in ROI analyses.
Discussion
The independent distractor AoA effect reported by Catling et al. (2010; Experiment 2) was replicated. Significantly increased activity was observed in the left mid-MTG and pMTG/STG regions consistent with a role for these regions in lexical selection and phonological word form retrieval (e.g., Acheson et al., 2011; Peeva et al., 2010; Indefrey & Levelt, 2004). The activity in the left pSTG might also be considered consistent with the operation of the verbal self-monitoring system, as Indefrey and Levelt (2004) proposed the region serves as a common store of lexical word form representations for both production and perception. We therefore calculated the overlap in activated voxels between the distractor frequency (Experiment 1) and AoA effects by inclusively masking the former with the latter result. Only 18 of 107 voxels (17%) overlapped, all in the posterior portion of the left pSTG. Thus, although the two effects share activity in posterior pSTG that might be sensitive to the self-monitoring demands of the task, the distractor AoA effect is associated with additional activity in both middle and posterior MTG, consistent with the operation of lexical level processes (Figure 3). More importantly, the absence of significant activity in the premotor cortex or IFG precludes an interpretation of a postlexical mechanism operating at the level of an articulatory response buffer (Mahon et al., 2007). This contrasts with the extensive activity observed in the premotor cortex in Experiment 1, indicating that the independent distractor frequency and AoA effects involve different mechanisms.
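The overlap figure can be read as a set intersection between two thresholded voxel sets, expressed as a share of the AoA cluster. The sketch below illustrates that arithmetic only. The voxel indices are hypothetical and were chosen to reproduce the reported counts; pairing the 107-voxel AoA cluster with the 88-voxel left pSTG/pMTG frequency cluster from Table 2 is an assumption made for illustration, and `overlap_fraction` is a hypothetical helper, not part of any imaging package.

```python
def overlap_fraction(reference_cluster, masking_cluster):
    """Return (shared voxel count, share of reference_cluster) for two voxel sets."""
    shared = reference_cluster & masking_cluster
    return len(shared), len(shared) / len(reference_cluster)

# Hypothetical voxel indices, chosen only to reproduce the reported counts:
# a 107-voxel AoA cluster and an 88-voxel frequency cluster sharing 18 voxels.
aoa_cluster = {(x, 0, 0) for x in range(107)}
frequency_cluster = {(x, 0, 0) for x in range(89, 177)}

count, fraction = overlap_fraction(aoa_cluster, frequency_cluster)
print(f"{count} of {len(aoa_cluster)} voxels ({fraction:.0%})")  # 18 of 107 voxels (17%)
```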
GENERAL DISCUSSION
This article is the first to report fMRI data for the distractor frequency and AoA effects in PWI. The results indicate quite clearly that the two effects do not involve identical processing mechanisms based on the observed spatial dissociations in cortical activity across the two experiments. Although hypotheses from two rival accounts of distractor interference in PWI were supported in each experiment, neither account is capable of providing a complete explanation. Rather, our data indicate that word distractors may affect each level of processing proposed by the two accounts.
The Distractor Frequency Effect
Miozzo and Caramazza (2003) questioned the conventional LSC account of word production based on a series of investigations of the distractor frequency effect. Subsequent studies confirmed their basic findings by controlling for the potentially confounding factor of AoA (Catling et al., 2010, Experiment 1; Dhooge & Hartsuiker, 2010, Experiment 1). The behavioral data from Experiment 1 replicated Catling et al.'s results, confirming the robustness of the effect. In Experiment 1, the independent distractor frequency effect in PWI was associated with significant increases in activity in a series of cortical regions associated previously with articulatory motor and verbal self-monitoring mechanisms in spoken word production, including the medial (SMA/pre-SMA) and lateral premotor cortex and left pSTG (e.g., Peeva et al., 2010; Tremblay & Gracco, 2009; Alario et al., 2006; Indefrey & Levelt, 2004). The premotor cortex activity, in particular, may be considered consistent with the output hypothesis, as it attributes the locus of the effect to a postlexical articulatory response buffer (Mahon et al., 2007). The activation observed in primary motor cortex is also consistent with Peeva et al.'s (2010) recent proposal that projections from medial and lateral premotor cortices to primary motor cortex transform sublexical representations into a set of motor commands to the speech articulators.
It is difficult to reconcile the premotor cortex result with an input account. Indefrey and Levelt's (2004) meta-analysis of the time course of spoken word production indicated premotor cortex activity occurs subsequent to lexical (lemma and word form) processing in middle and posterior temporal cortical areas. Nor do the results appear to support Roelofs' (2005) modified LSC mechanism implemented in WEAVER++. In Roelofs' (2005) WEAVER++ account, a selective attention mechanism reactively blocks distractor processing, favoring the production of the target name. The speed of blocking depends on the speed with which the distractor information becomes available. As HF words are accessed first (e.g., Steyvers & Tenenbaum, 2005), they are first to be blocked. Roelofs, Piai, and Schriefers (2011) were able to demonstrate that WEAVER++ accounts for the distractor frequency effect and the results of Dhooge and Hartsuiker's (2010) masking and SOA manipulations. However, the WEAVER++ model still entails the assumption that the activation levels of the lexical nodes of LF words are higher once processing of HF words is reactively blocked, leading to a prediction of increased activity in middle and posterior MTG/STG. Roelofs (2008) associated the attentional control mechanism in WEAVER++ with the anterior cingulate and IFG. The former region showed increased activity as part of a larger cluster involving premotor and sensorimotor regions, although this activity was located in the cingulate motor area, indicating a more likely motor role. The left IFG/Broca's area did not show significant activity, either at the whole-brain level or in an ROI analysis.
As Dhooge and Hartsuiker (2010) noted, the output account does not specify the nature of the control mechanisms operating on the articulatory response buffer. They proposed that this role is performed by the verbal self-monitoring system. The finding of increased activity in the pSTG is consistent with this proposal, given evidence linking verbal self-monitoring to this region (Price, 2010; Zheng et al., 2010; Indefrey & Levelt, 2004). Precisely how the self-monitoring system might exclude responses from the articulatory buffer remains to be specified (see Roelofs et al., 2011). Another issue requiring specification relates to the role in PWI played by SMA/pre-SMA, regions sensitive to distractor frequency.
The Distractor AoA Effect
Experiment 2 replicated the independent distractor AoA effect reported by Catling et al. (2010; Experiment 2), indicating it is a robust effect. The direction of the AoA effect (larger interference/activation for late vs. early words) is opposite to that predicted by an LSC account operating under a typical assumption of lower resting activation for late-acquired words (e.g., Meschyan & Hernandez, 2002). However, network modeling studies indicate that early-acquired words are accessed relatively faster because they have more central, highly connected nodes (Steyvers & Tenenbaum, 2005), possibly as a result of network structuring through development (e.g., Menenti & Burani, 2007; Lambon Ralph & Ehsan, 2006; Ellis & Lambon Ralph, 2000). The AoA effect was associated in our experiment with increases in activity in left mid-MTG/STG and pMTG/STG, in addition to the anterior cingulate. The former regions are related to lexical level processes, including lemma and phonological word form retrieval (e.g., Acheson et al., 2011; Peeva et al., 2010; Indefrey & Levelt, 2004), supporting an input account in which representations of early-acquired words are accessed first (e.g., Lambon Ralph & Ehsan, 2006; Steyvers & Tenenbaum, 2005). Of note, these regions have demonstrated increased activity in categorically related compared with unrelated distractor conditions in fMRI studies of SI effects (e.g., de Zubicaray et al., 2001, 2006). Thus, these findings converge with the interaction between AoA and SI observed by Belke et al. (2005), a result interpreted by them as indicating independent AoA effects occur at a lexical (lemma) level of processing.
The ACC activity for the independent distractor AoA effect is consistent with that observed previously for the SI effect in PWI (e.g., de Zubicaray et al., 2001, 2006). This activity was rostral to the cingulate motor area activity observed for the distractor frequency effect. A prevailing account of ACC activity during interference tasks is that it reflects a general competition or conflict monitoring mechanism (e.g., Botvinick, Cohen, & Carter, 2004). On the basis of an assumption of competition among rival lexical nodes for production, a “competition detector” could monitor the total amount of activation of items at a given level of processing and respond with an error signal if a particular threshold is exceeded (Hartsuiker, 2006). This type of monitor was proposed by de Zubicaray and colleagues (2001, 2006) based upon results from their fMRI studies of SI effects, and appended to Harley's (1993) connectionist computational model in a successful simulation of the SI effect (Hockey, Wiles, & de Zubicaray, 2005). Other neuroimaging studies have reached similar conclusions concerning the monitoring role of ACC in speech production (e.g., Riès, Janssen, Dufau, Alario, & Burle, 2011; Christoffels, Formisano, & Schiller, 2007). Alternatively, the activity might also be considered consistent with Roelofs' (2008) LSC account in which ACC performs a role in attentional control during speech production. The additional activity observed in the left inferior parietal lobule appears consistent with some role in distractor processing, given the extensive research linking this region with attention and reading (see Shaywitz & Shaywitz, 2008) and its increased response in the distractor versus neutral condition.
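The "competition detector" idea described above, summing activation at one processing level and signaling when a threshold is exceeded, can be illustrated with a toy sketch. This is not the implementation used in any of the cited models: the function name `competition_detected`, the activation values, and the threshold are all invented for illustration.

```python
def competition_detected(activations, threshold):
    """Signal conflict when summed activation at one processing level exceeds threshold."""
    return sum(activations.values()) > threshold

# Hypothetical activation values and threshold, invented for illustration.
related = {"dog": 0.9, "fox": 0.6}    # target plus categorically related distractor
unrelated = {"dog": 0.9, "cup": 0.2}  # target plus unrelated distractor
THRESHOLD = 1.3

print(competition_detected(related, THRESHOLD))    # True
print(competition_detected(unrelated, THRESHOLD))  # False
```

Under these assumed values, only the related pairing pushes total activation past the threshold, which is the pattern a conflict-monitoring reading of the ACC activity would require.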
Implications for Input and Output Accounts
The contrasting patterns of activity observed for the independent distractor frequency and AoA effects indicate that neither input nor output accounts provide complete explanations of these PWI effects. A number of recent studies have challenged the reliability and generality of results cited in support of the latter account (e.g., Janssen, Schirm, Mahon, & Caramazza, 2008; cf. Mädebach, Oppermann, Hantsch, Curda, & Jescheniak, 2011; Piai, Roelofs, & Schriefers, 2011; also Mahon et al., 2007; cf. Lee & de Zubicaray, 2010). However, an output account clearly provides a better explanation of the premotor activity associated with the independent distractor frequency effect than an input account, while the latter provides a better account of the middle to posterior MTG/STG activity observed in conjunction with the distractor AoA and SI effects. A way to reconcile our data to these accounts is to propose that word distractors may affect the processes implicated by each account.
The results of Experiment 1 converge with other findings from neuroimaging studies indicating that distractor effects may occur at multiple levels of information processing in word production. The locus of the SI effect in the postcue naming task has been variously attributed to prelexical, lexical, and postlexical mechanisms (Mahon et al., 2007; Dean, Bub, & Masson, 2001; Humphreys, Lloyd-Jones, & Fias, 1995). In that task, participants view two differently colored superimposed pictures, naming the target according to a subsequently presented color cue. When the distractor and target pictures are categorically related, naming latencies are slower than when they are unrelated. Dean et al. (2001) proposed that this interference occurs when demands on the integration of perceptual attributes (color) and conceptual representations are high because of feature overlap among category exemplars. Recent fMRI evidence supports this proposal (Hocking, McMahon, & de Zubicaray, 2010).
AoA Effects in Target Picture Names
Although main effects of target picture AoA were observed in the naming latencies in both experiments, target picture AoA did not interact with the independent distractor frequency or AoA effects, replicating the results of Catling et al. (2010). Nor did target picture AoA produce significant changes in fMRI activity. The absence of an interaction is consistent with research indicating different processing mechanisms for AoA effects according to task (Catling & Johnston, 2009; Brysbaert & Ghyselinck, 2006). However, in an fMRI study of covert picture naming, Ellis, Burani, Izura, Bromiley, and Venneri (2006) reported increased activation for pictures with late-acquired names in the left middle occipital and fusiform gyri, a result they interpreted as reflecting mapping of visual onto semantic representations.
To investigate target picture AoA effects more directly in our data, we performed a post hoc analysis involving a direct t contrast of early and late target pictures in the neutral (i.e., no distractor) condition, combining data from participants across both experiments (n = 31). This analysis likewise failed to reveal any significant activity. As Ellis et al. (2006) employed a blocked experimental design, it is possible that their findings represent the strategic operation of a cognitive search mechanism like that envisaged by Steyvers and Tenenbaum (2005). Alternatively, an absence of differential activity might also be considered consistent with the architecture of neural network models (e.g., Lambon Ralph & Ehsan, 2006; Zevin & Seidenberg, 2002). According to these models, AoA effects in picture naming arise from the arbitrary mappings between objects and their names (i.e., there is no systematic relationship between semantic input and phonological output representations). Although these effects emerge in networks with distributed random representations, they do not emerge in networks with localist ones (Zevin & Seidenberg, 2002). Hence, one would not necessarily expect to observe localized activity attributable to a specific level of processing in an fMRI experiment. This differs from explanations that attribute AoA effects with word distractors to a lexical semantic level of processing (e.g., Belke et al., 2005; see also Brysbaert & Ghyselinck, 2006).
Summary and Conclusions
The two fMRI experiments reported here each provide some support for output and input accounts of the distractor frequency and AoA effects, respectively. However, given the different patterns of activity observed, neither account is able to provide a complete explanation. The fMRI data show the former effect engages regions involved in postlexical processes, including articulation and verbal self-monitoring, while the latter effect engages regions linked with lexical level processes. These interpretations should, nevertheless, be considered within the limitations of inference afforded by neuroimaging, as fMRI data cannot provide direct evidence of relationships between processes and processors in speech production.
Acknowledgments
We thank Jonathan Catling for providing the distractor stimuli for Experiments 1 and 2. This research was supported by Australian Research Council (ARC) Discovery Project grant DP1092619. Greig de Zubicaray was supported by ARC Future Fellowship FT0991634. Niels O. Schiller was supported by the Netherlands Institute for Advanced Study in the Humanities and Social Sciences as a Fellow-in-Residence 2010/2011. Michele Miozzo was supported by NIH grant DC006242.
Reprint requests should be sent to Greig I. de Zubicaray, School of Psychology, University of Queensland, Brisbane, QLD 4072, Australia, or via e-mail: [email protected].