There is a range of variability in the speed with which a single speaker will produce the same word from one instance to another. Individual differences studies have shown that the speed of production and the ability to maintain attention are related. This study investigated whether fluctuations in production latencies can be explained by spontaneous fluctuations in speakers' attention just prior to initiating speech planning. A relationship between individuals' incidental attentional state and response performance is well attested in visual perception, with lower prestimulus alpha power associated with faster manual responses. Alpha is thought to have an inhibitory function: Low alpha power suggests less inhibition of a specific brain region, whereas high alpha power suggests more inhibition. Does the same relationship hold for cognitively demanding tasks such as word production? In this study, participants named pictures while EEG was recorded, with alpha power taken to index an individual's momentary attentional state. Participants' level of alpha power just prior to picture presentation and just prior to speech onset predicted subsequent naming latencies. Specifically, higher alpha power in the motor system resulted in faster speech initiation. Our results suggest that one index of a lapse of attention during speaking is reduced inhibition of motor-cortical regions: Decreased motor-cortical alpha power indicates reduced inhibition of this area while early stages of production planning unfold, which leads to increased interference from motor-cortical signals and longer naming latencies. This study shows that the language production system is not impermeable to the influence of attention.
Producing words does not happen at one constant speed. From decades of research on language production, we know that several linguistic factors can influence the speed of production. For instance, highly frequent words are produced more quickly than less frequent words (Jescheniak & Levelt, 1994). Similarly, it takes longer to initiate speech for a disyllabic word compared with a monosyllabic word (Meyer, Roelofs, & Levelt, 2003). However, even producing the same word will sometimes be slower than at other times. Here, we test whether slow production that cannot be explained by linguistic factors alone may in part be explained by the neural attentional state of the speaker just prior to initiating the processing required for production. There is ample evidence from visual or auditory discrimination and judgment tasks that the attentional state of an individual can influence the speed of the required manual response (e.g., Mazaheri et al., 2014). Relatedly, the brain state before a slow response is distinctly different from the brain state before a fast response (Weissman, Roberts, Visscher, & Woldorff, 2006). In this study, we test whether the same may be true when the requisite response is preceded by a cascade of linguistic processing.
Models of word production agree that this cascade of linguistic processing consists of at least a meaning stage and a form stage that lead to articulation (Levelt, Roelofs, & Meyer, 1999; Dell, 1986). Performing a meta-analysis of chronometric and neuroimaging studies of word production, Indefrey and Levelt (2004) linked specific time windows and brain areas to different stages of word production. They followed the word production model of Levelt et al., which consists of the following production stages: conceptual preparation; lemma retrieval; form encoding; and finally, articulation. For picture naming, first a preverbal conceptual message is formulated; this process is thought to take on average ∼200 msec and is linked to activation in occipital and ventral temporal cortical regions. Next, the corresponding lemmas or lexical entries are activated and selected. A lemma represents syntactic information such as word class or grammatical gender. The retrieval of a lemma is proposed to take ∼75 msec, and the model links this stage to activation in the left middle temporal gyrus. For selected lemmas, corresponding sound properties are retrieved during the form encoding stage. This stage is further separated into phonological code retrieval (at ∼275 msec and linked to activation in superior temporal gyrus), syllabification (at ∼355 msec and linked to activation in inferior frontal gyrus), and phonetic encoding (at ∼455 msec and linked to activation in precentral gyrus and SMA). Articulation is then initiated at ∼600 msec and is linked to activation in sensorimotor brain regions.
Increased complexity for each of these stages can cause prolonged processing and lead to later speech onset (Meyer et al., 2003; Jescheniak & Levelt, 1994). Delayed speech can, however, also result from difficulties in nonlinguistic processes. For instance, several studies have shown that word production is susceptible to domain-general processes like attention. Attention is an umbrella term that covers several different abilities. According to an influential theoretical proposal by Posner (2012), attention consists of executive control, orienting, and alerting. Executive control is the ability to remain goal-directed in the face of distraction, orienting is the ability to shift the locus of processing towards a particular spatial position, and alerting is the ability to achieve and maintain alertness, either briefly (e.g., in response to a warning signal or stimulus) or prolonged over extended periods of time (called sustained attention; see Sarter, Givens, & Bruno, 2001). In this study, we examined how word production depends on the ability to maintain attention. Attention waxes and wanes during continuous and repetitive task performance. As James (1890) stated, “There is no such thing as voluntary attention sustained for more than a few seconds at a time” (p. 420). Consequently, task performance is characterized by “intermittent failures in efficiency interspersed with normal performance” (Broadbent, 1971, p. 128). We examined the possibility that a slow picture naming trial is (at least in part) due to reduced sustained attention of the participant just prior to the onset of the to-be-named picture.
There is accumulating evidence that language production is influenced by attention (e.g., Shao, Meyer, & Roelofs, 2013). Importantly for our study, individual differences studies have shown that the speed of production and the ability to maintain attention in the task at hand are related: Individuals with a worse alerting (i.e., reduced sustained attention) ability exhibit a larger number of very slow picture naming response trials than individuals with better alerting ability (Jongman, Meyer, & Roelofs, 2015; Jongman, Roelofs, & Meyer, 2015). These very slow responses were interpreted as trials reflecting lapses of sustained attention, as previously proposed by Unsworth, Redick, Lakey, and Young (2010). For the purposes of our study, we operationalize lapses of sustained attention in precisely this way: relatively slower responses on task.
In this study, the goal was to test for neural evidence that lapses of sustained attention affect word production. We investigated participants' brain states during both slow and fast speech trials under the assumption that slower responses result from momentary lapses of attention. A similar approach has been used previously for simple discrimination or judgment tasks with electrophysiological measures such as EEG and magnetoencephalography. Prestimulus electrophysiological signals can be compared for slow versus fast trials. In particular, research has focused on rhythmic neuronal activity, also known as oscillations. Several studies have found that as RTs decrease, prestimulus oscillatory power in the alpha frequency band (8–12 Hz) decreases (Kelly, Gomez-Ramirez, & Foxe, 2009; Thut, Nietzel, Brandt, & Pascual-Leone, 2006). In other words, a phasic decrease in alpha power corresponds to improved performance in speeded RT tasks. This relationship between phasic alpha power and RTs appears to be specific to the brain region typically responsible for processing the type of information relevant for the task at hand: Mazaheri et al. (2014) found the relevant relationship between RTs and alpha power over visual cortical regions when participants performed a visual orientation discrimination task, whereas the same relationship was present over superior temporal gyrus during an auditory discrimination task. Besides the relationship between alpha power and RTs, prestimulus alpha power has been shown to predict perception performance (Hanslmayr et al., 2007), perception errors (Mazaheri, Nieuwenhuis, van Dijk, & Jensen, 2009), and self-reported attentional state (Macdonald, Mathan, & Yeung, 2011). 
The precise link between prestimulus alpha and perceptual performance is, however, still a hotly debated issue, with some studies suggesting that higher alpha power may reflect a more conservative response criterion (bias), possibly to guard against false positives (Iemi, Chaumon, Crouzet, & Busch, 2017; Limbach & Corballis, 2016). On the other hand, a recent study (Iemi & Busch, 2018) convincingly demonstrated that the level of prestimulus alpha power reflects the degree of perceptual rather than response bias by comparing this relationship for a visual detection and a visual discrimination task (see Iemi & Busch, 2018, for details).
Romei, Gross, and Thut (2010) provided evidence that the relationship between prestimulus alpha power and subsequent stimulus processing is not merely correlative. Using TMS, they stimulated visual areas via short trains of rhythmic TMS. When the visual areas were stimulated at a frequency of 10 Hz, a frequency in the alpha band, visual target detection was impaired in the hemifield opposite the stimulated hemisphere and enhanced ipsilaterally. These effects were not observed for stimulations at 5 Hz (theta band) and 20 Hz (beta band). This suggests that there may be a causal link between alpha power and target detection performance. It has been proposed that oscillations in the alpha range play a role in selective attention by regulating information flow through inhibition of task-irrelevant brain areas (Jensen & Mazaheri, 2010; Klimesch, Sauseng, & Hanslmayr, 2007). This is supported by studies showing not only a decrease in alpha power in task-relevant brain areas but also an accompanying increase in alpha power in task-irrelevant regions, for instance, in a tactile discrimination task (Haegens, Luther, & Jensen, 2012). Furthermore, Kelly, Lalor, Reilly, and Foxe (2006) showed that the hemisphere contralateral to a visual stimulus that needed to be ignored (i.e., the hemisphere that would have processed that stimulus were it attended) exhibited increased alpha power, suggesting that alpha may function as an attentional suppression mechanism.
This inhibitory role of alpha oscillations (along with its facilitatory role in upregulating task-relevant brain regions discussed earlier) is one of our primary interests in this study. It is certainly true that alpha power may fulfill different functional roles in different paradigms, but in the context of this study, the role of alpha oscillations in the inhibition of premature motor cortical activity is of particular interest. This is because our picture naming task requires the appropriately timed use of an effector to produce motor output in response to the sensory input. There is converging evidence that under such conditions, perceptual or cognitive processes can modulate activity in the motor system, such that motor cortical activity can be used to “read out” the accumulation of perceptual evidence in favor of making a particular response (Song & Nakayama, 2009). In a study by de Lange, Rahnev, Donner, and Lau (2013), for instance, participants indicated visual motion direction (left/right) by button press using the hand congruent to the detected motion direction. Expectation for a particular motion direction was cued on some proportion of trials, and prestimulus alpha (and beta) power in sensorimotor cortices (mu power) decreased contralateral to the anticipated motion direction (and hence also contralateral to the hand with which a response was expected to be made). The level of prestimulus alpha power was related to participants' bias in perceptual judgments brought about by the expectation cue. In occipital regions, prestimulus alpha power was higher for trials that did compared with trials that did not have an expectation cue, suggesting that when the direction of motion was unpredictable (no cue), the brain allocated additional resources to the visual system for processing motion direction. 
Importantly, even on neutral trials, sensorimotor prestimulus alpha power lateralization was predictive of motion direction response (although the relationship was not as strong as on cued trials), suggesting that the state of the motor system prior to stimulus onset has a direct influence on the eventual motor response. These findings show that prestimulus alpha power in the motor system behaves similarly to alpha in the sensory cortices, with higher or lower power respectively signaling more or less inhibition of the motor system and a lesser or greater degree of “readiness” (likelihood of release from inhibition) to execute a manual response. In the context of picture naming, the relevant response is articulation, and we may expect that the level of prestimulus alpha power in the motor system provides a useful index of the degree to which the motor system is inhibited (or disinhibited) while stages of speech planning prior to articulation unfold and is thus predictive of subsequent naming times.
Unlike for perceptual discrimination tasks, for picture naming a cascade of cognitively more complex linguistic processing takes place between the prestimulus period and eventual response. This intervening processing likely requires the motor system to remain in a state of “readiness” without actually executing a response until the necessary linguistic processing has reached the appropriate production stage (phonetic encoding or articulation). We may therefore expect that higher prepicture alpha power in sensorimotor regions should result in a greater ability to suppress an immediate motor response while the necessary linguistic processing stages for production unfold. This would suggest a negative relationship between the level of prepicture alpha power and subsequent naming times (i.e., higher alpha in motor cortical regions leads to shorter naming latencies).
To summarize, in this study, we measured EEG and used prestimulus alpha power as an index of attention while participants named pictures. We hypothesized that if naming latencies are long (at least in part) due to a lapse of sustained attention, the level of alpha power should differentiate between fast and slow trials. Which direction this relationship between alpha power and naming speed will take should depend on the functional locus of the lapses in attention. In picture naming, even before all the linguistic processing stages unfold, the first thing that needs to take place is visual processing of the picture. A participant must recognize the object on the screen to be able to prepare the verbal message. If lapses of attention exert their influence on picture naming by affecting visual processing of the picture, then this is expected to be characterized by higher prepicture alpha power over occipital brain regions in relation to longer naming latencies, a positive relationship. This would be in line with many previous studies on visual perception that have shown alpha power decreases over the parietal and occipital cortices (important task-relevant brain regions for visual perception) to relate to better performance (e.g., Kelly et al., 2009; Thut et al., 2006).
We hypothesize that the reverse relation between prepicture alpha power and naming latencies may be found over the motor cortex, such that higher alpha power would lead to faster naming, a negative relationship. As discussed above, previous research has shown that when prestimulus alpha power is high over task-irrelevant brain regions, performance improves (Haegens et al., 2012; Kelly et al., 2006). We argue that, in the case of picture naming, the motor cortex is initially a task-irrelevant brain region: While visual and early linguistic processing stages unfold, task performance will be improved if any premature speech response (or corresponding motor cortical activity) can be successfully withheld.
The main goal of this study was to examine the extent to which prestimulus alpha power is predictive of picture naming latencies. The relationship between prestimulus alpha and response performance has been well attested and is therefore likely to provide a useful index of whether serendipitous periodic fluctuations in the attentional state of the brain influence word production. As a secondary aim, we explored the influence of alpha power after picture onset and immediately prior to speech onset on naming latencies. For prespeech alpha, we expected to find the well-established alpha power desynchronization prior to movement (articulation) onset (e.g., Pfurtscheller & Lopes da Silva, 1999) and tested whether this movement-related alpha was related to naming latencies, on the one hand, and to prestimulus alpha power on the other hand.
Thirty-seven students from Radboud University, Nijmegen, or the Hogeschool van Arnhem en Nijmegen participated. All participants were Dutch native speakers, had normal/corrected-to-normal vision, and no language impairment. The average age was 21.6 years (range: 19–29 years), with 26 female participants. Participants were paid for participation. Ethical approval was granted by the Ethics Board of the Faculty of Social Sciences of the Radboud University, Nijmegen. Participants provided written informed consent before the start of the experiment.
Participants were presented 60 black-and-white drawings, with each picture shown 10 times. The objects were selected from a database of normed pictures (Severens, Van Lommel, Ratinckx, & Hartsuiker, 2005). The object names varied in frequency (mean lemma frequency: 47 tokens per million, range: 1–247; values taken from CELEX; Baayen, Piepenbrock, & Gulikers, 1995) and in length, with words consisting of one to three syllables (mean length: 1.6 syllables). Pictures were selected for high name agreement (mean = 96.4%, range = 79–100%).
Before the start of the experiment, participants were given a booklet with the 60 pictures and corresponding names and were asked to go through the booklet twice. After EEG setup, participants were tested in a dimly lit room, seated in front of a 19-in. (Samsung Syncmaster 961BF) screen, at a distance of ∼100 cm. Stimuli were presented using Presentation software (Neurobehavioral Systems). Participants' vocal responses were recorded (Sennheiser ME64), and speech onsets were determined manually using Praat (Boersma & Weenink, 2012). A short practice block of 10 trials was presented to familiarize participants with trial timing. These trials allowed for adjustment of the voice key to ensure it was reliably triggered by speech onset.
A trial started with a fixation cross in the center of the screen, presented for a random duration ranging between 1500 and 2000 msec. Then, a picture was presented in the center, fit to a virtual square of 300 × 300 pixels. The object disappeared from the screen 500 msec after the voice key was triggered or after 3000 msec. Finally, five hashtags were presented for 1000 msec, during which participants could blink. All stimuli were presented in white on a black background.
Ten runs of the 60 pictures were presented. Presentation was pseudorandomized such that a particular starting phoneme of a name never occurred twice in a row. Moreover, objects from the same semantic category never followed one another. In total, 600 trials were presented, with self-timed breaks after 200 and 400 trials. The experimental session lasted ∼50 min.
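The two ordering constraints described above can be illustrated with a minimal Python sketch. It is not the authors' randomization script: the item tuples, the English example names, and the greedy draw-and-restart strategy are all assumptions made for illustration, and the sketch handles a single run rather than the boundaries between runs.

```python
import random


def pseudorandomize(items, rng, max_restarts=1000):
    """Return a random order in which neighbouring items never share
    a starting phoneme or a semantic category.

    Each item is a (name, onset, category) tuple. This layout is
    hypothetical and chosen only for the example.
    """
    for _ in range(max_restarts):
        remaining = list(items)
        order = []
        while remaining:
            prev = order[-1] if order else None
            # Keep only candidates that do not clash with the previous item.
            allowed = [it for it in remaining
                       if prev is None
                       or (it[1] != prev[1] and it[2] != prev[2])]
            if not allowed:
                break  # dead end: start again with a fresh draw
            choice = rng.choice(allowed)
            order.append(choice)
            remaining.remove(choice)
        if not remaining:
            return order
    raise RuntimeError("no valid order found")


def is_valid(order):
    """Check both adjacency constraints on a finished order."""
    return all(a[1] != b[1] and a[2] != b[2]
               for a, b in zip(order, order[1:]))


# Hypothetical mini stimulus set (not the experimental items).
items = [("cat", "k", "animal"), ("dog", "d", "animal"),
         ("cup", "k", "kitchen"), ("pan", "p", "kitchen"),
         ("bus", "b", "vehicle"), ("tram", "t", "vehicle")]
run = pseudorandomize(items, random.Random(1))
print([name for name, _, _ in run])
```

Restarting after a dead end keeps the procedure simple at the cost of occasionally discarding a partial order; for a 60-item set with many distinct onsets and categories, dead ends should be rare.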
EEG Data Acquisition
The EEG was continuously recorded using an EEG cap containing 59 active electrodes (see Figure 1). In addition to the electrodes in the cap, one electrode was attached below the left eye to monitor for blinks, and two electrodes were directly placed on the left and right mastoids. All electrodes were online referenced to the electrode placed on the left mastoid. The impedance was kept below 10 kΩ for all electrodes. The EEG was digitized at a rate of 500 Hz and recorded with a low cutoff filter of 0.01 Hz and a high cutoff filter of 200 Hz.
Initial preprocessing was performed using Brain Vision Analyser (Brain Products, Version 2.0.2). First, the data were re-referenced using the average of the left and right mastoid electrodes, and then a high-pass filter of 0.1 Hz and a low-pass filter of 30 Hz were applied. The EEG signal was demeaned and corrected for ocular artifacts, using the Gratton and Coles (1989) correction: bipolar vertical electrooculography (EOGV) and horizontal electrooculography (EOGH) channels were derived from the difference between electrode 50 in the cap and the electrode on the suborbital ridge for the EOGV and the difference between electrodes 51 and 59 in the cap, approximating the positions of the outer canthi, for the EOGH. For stimulus-locked analyses, epochs of −1500 to 1000 msec were created relative to the onset of picture presentation; for response-locked analyses, the data were segmented from −1100 to 3500 msec relative to picture onset. Trials with a voltage step over 50 μV and trials with an absolute difference greater than 200 μV were rejected.
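The two amplitude criteria can be sketched in Python as follows. The text does not define them further, so this sketch assumes that the "voltage step" is the difference between consecutive samples and that the "absolute difference" is the peak-to-peak range within an epoch; the function names are hypothetical and this is not the Brain Vision Analyser implementation.

```python
def reject_channel(samples_uv, max_step=50.0, max_range=200.0):
    """Return True if one channel of one epoch (in microvolts) violates
    either criterion. Both definitions are assumptions, see the text above."""
    steps = [abs(b - a) for a, b in zip(samples_uv, samples_uv[1:])]
    largest_step = max(steps, default=0.0)
    peak_to_peak = max(samples_uv) - min(samples_uv)
    return largest_step > max_step or peak_to_peak > max_range


def reject_epoch(channels):
    """Reject the whole epoch if any channel violates a criterion.
    `channels` maps a channel label to its list of samples."""
    return any(reject_channel(samples) for samples in channels.values())


clean = [0.0, 5.0, -3.0, 10.0, 2.0]            # small steps, small range
jump = [0.0, 5.0, 60.0, 58.0]                  # one 55 microvolt step
drift = [i * 10.0 for i in range(25)]          # slow drift, 240 microvolt range
print(reject_channel(clean), reject_channel(jump), reject_channel(drift))
```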
Remaining preprocessing and other analyses were carried out using the FieldTrip toolbox (Oostenveld, Fries, Maris, & Schoffelen, 2011) running in a MATLAB environment (R2014b; The MathWorks, Inc.). Preprocessing was performed separately for stimulus-locked and response-locked data (see below). For every participant and trial, data were segmented between −1500 and 500 msec relative to the onset of the picture for stimulus-locked data and between −1500 and 1500 msec relative to the onset of speech for response-locked data. The EEG signal at bad electrodes was reconstructed from a combination of surrounding electrodes (two electrodes in one participant; one electrode in another). Any remaining artifact-containing trials were removed from the data by visual inspection. Finally, trials for which the RTs in the picture naming task were below 400 msec or above 2000 msec were deemed irregular responses and removed from further analyses. This resulted in between 386 and 525 (M = 465.5; SD = 35.86) usable trials per participant for the stimulus-locked data and between 303 and 485 (M = 415.93; SD = 53.78) usable trials per participant for the response-locked data after preprocessing and data cleaning.
Relationship between Alpha Power and Naming Times
To address our primary hypotheses, EEG data were analyzed time-locked to picture onset (stimulus-locked data). These were followed up by more exploratory analyses time-locked to speech onset (response-locked data).
Time–Frequency Analysis of Power
For each participant, single-trial time–frequency representations of power were computed using a sliding window of 300 msec tapered with a single Hanning taper. This results in temporal precision of 300 msec and intrinsic frequency precision of 3.33 Hz. Power estimates were obtained separately for each electrode in frequency steps of 1 Hz from 2 to 30 Hz and in time steps of 20 msec from −1300 to 300 msec relative to picture onset for stimulus-locked data and from −1500 to 1500 msec relative to speech onset for response-locked data. Single-trial power values were baseline normalized by expressing them as a relative change from mean power in a baseline period between −1300 and −1000 msec relative to picture onset (same baseline period used for both stimulus- and response-locked data) and log10 transforming the resultant values to express power as a decibel change from baseline.
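As a worked illustration of the baseline normalization, the Python sketch below converts a power time course at one electrode and frequency into decibel change from the −1300 to −1000 msec baseline. It assumes the conventional decibel definition, 10 · log10(power / mean baseline power), which the text implies but does not spell out; the function names and the toy power values are hypothetical, and the actual analysis was run in FieldTrip.

```python
import math


def frequency_resolution_hz(window_s):
    """Intrinsic frequency resolution of a window of the given length."""
    return 1.0 / window_s


def to_decibel(power, times_ms, base_start=-1300, base_end=-1000):
    """Express each power value as decibel change from mean baseline power."""
    baseline = [p for p, t in zip(power, times_ms)
                if base_start <= t <= base_end]
    base_mean = sum(baseline) / len(baseline)
    return [10.0 * math.log10(p / base_mean) for p in power]


# Time axis as described: 20-msec steps from -1300 to 300 msec.
times_ms = list(range(-1300, 301, 20))
# Toy single-trial power: 2.0 during the baseline, doubled afterwards.
power = [2.0 if t <= -1000 else 4.0 for t in times_ms]
power_db = to_decibel(power, times_ms)

print(round(frequency_resolution_hz(0.3), 2))  # 300-msec window
print(round(power_db[-1], 2))                  # doubling of power in dB
```

A doubling of power relative to baseline corresponds to roughly 3 dB, and baseline samples equal to the baseline mean map to 0 dB.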
To test for a relationship between alpha power and picture naming RTs, a cross-trial regression analysis was performed for each participant (cf. Cohen & Donner, 2013). For every electrode–frequency–time triplet in the stimulus-locked data (all scalp electrodes; 2–30 Hz, −1000 to 300 msec), an independent samples t statistic was computed based on the output regression coefficient and error term when regressing the baseline normalized power value at that data point against the log-transformed, z-scored RT associated with that particular trial. Spearman correlations were employed to minimize potential effects of nonnormally distributed data. The same approach was used to test for a relationship between source-level (see below) alpha power and RTs in the response-locked data (2–30 Hz, −1300 to 1300 msec).
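The per-participant statistic can be sketched as follows. How the regression and the Spearman correlation were combined is not fully specified in the text, so this Python sketch takes one plausible reading: it computes a Spearman coefficient between single-trial power and log-transformed, z-scored RTs at one electrode–frequency–time point and converts it to a t value with n − 2 degrees of freedom. All function names and the toy data are hypothetical.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def zscore(values):
    """Standardize using the sample standard deviation."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]


def power_rt_statistic(power_db, rt_ms):
    """Spearman rho between power and log, z-scored RT, plus its t value.
    Undefined when |rho| equals 1 (division by zero)."""
    rt = zscore([math.log(r) for r in rt_ms])
    rho = pearson(ranks(power_db), ranks(rt))
    n = len(rt)
    t = rho * math.sqrt((n - 2) / (1.0 - rho ** 2))
    return rho, t


# Toy data for six trials at one electrode-frequency-time point.
power_db = [1.2, 0.4, 0.9, -0.3, 0.1, -0.8]
rt_ms = [520, 610, 700, 650, 780, 900]
rho, t = power_rt_statistic(power_db, rt_ms)
print(round(rho, 3), round(t, 3))
```

Because the Spearman coefficient depends only on ranks, the log transform and z-scoring of RTs do not change it; they are kept here only to mirror the description above. A negative t value indicates that higher power goes with faster responses.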
Cluster-based permutation statistics (Maris & Oostenveld, 2007) were subsequently used to test for relationships between power and RTs that were consistent in time, space, and frequency across participants. For stimulus-locked data, our main hypothesis focused on a relationship between prepicture alpha power and subsequent RTs. For completeness, we also tested for a relationship between postpicture power and RTs. Mean prepicture alpha (8–12 Hz) range correlation t values were compared with zero across participants at every electrode–time couplet between −1000 and 0 msec relative to picture onset using dependent-samples t tests. A cluster threshold of p = .05 was chosen, and data points where t values did not correspond to p values below this threshold were discarded. Remaining t values were then grouped into clusters based on adjacency in space (neighboring electrodes) and time, and t values for each cluster were summed to produce a cluster t statistic. This procedure was then repeated 5000 times, each time with condition labels (correlation t values or zeros) randomly interchanged across participants to construct a Monte Carlo distribution. Cluster t statistic values falling in the upper or lower 2.5th percentile of this Monte Carlo distribution were considered statistically significant. The same procedure was repeated for mean postpicture alpha (0–300 msec relative to picture onset) to test for a relationship between postpicture power and RTs.
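The logic of this procedure can be illustrated with a simplified Python sketch. It departs from the analysis described above in several labeled ways: clusters are formed over time only (no electrode adjacency), the two-sided test is implemented on the largest absolute cluster sum rather than on separate upper and lower tails, interchanging "correlation t values or zeros" is implemented as a random sign flip per participant, and the critical t value is supplied by hand. The data are simulated and all names are hypothetical; the actual analysis used FieldTrip.

```python
import math
import random


def t_against_zero(columns):
    """One-sample t value per time point; columns[t] has one value per participant."""
    out = []
    for col in columns:
        n = len(col)
        m = sum(col) / n
        var = sum((v - m) ** 2 for v in col) / (n - 1)
        out.append(m / math.sqrt(var / n))
    return out


def cluster_sums(tvals, t_crit):
    """Sum t values over runs of adjacent supra-threshold points of equal sign."""
    sums, current, sign = [], 0.0, 0
    for t in tvals:
        s = (t > t_crit) - (t < -t_crit)  # +1, -1, or 0 (below threshold)
        if s != 0 and s == sign:
            current += t
        else:
            if sign != 0:
                sums.append(current)
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        sums.append(current)
    return sums


def cluster_permutation_p(data, t_crit, n_perm=5000, seed=0):
    """data[p][t] is the value for participant p at time point t.
    Returns the largest absolute cluster sum and its Monte Carlo p value."""
    rng = random.Random(seed)
    n_time = len(data[0])

    def max_abs_cluster(rows):
        cols = [[row[t] for row in rows] for t in range(n_time)]
        sums = cluster_sums(t_against_zero(cols), t_crit)
        return max((abs(s) for s in sums), default=0.0)

    observed = max_abs_cluster(data)
    exceed = 0
    for _ in range(n_perm):
        flipped = []
        for row in data:
            sign = rng.choice((-1.0, 1.0))  # swap "data" and "zero" labels
            flipped.append([sign * v for v in row])
        if max_abs_cluster(flipped) >= observed:
            exceed += 1
    return observed, exceed / n_perm


# Simulated data: 12 participants, 20 time points, a negative effect
# at time points 8 to 14 (values stand in for per-participant t values).
sim = random.Random(42)
data = [[sim.gauss(-2.0 if 8 <= t < 15 else 0.0, 0.5) for t in range(20)]
        for _ in range(12)]
# 2.201 is the two-sided .05 critical t for 11 degrees of freedom.
observed, p_value = cluster_permutation_p(data, t_crit=2.201, n_perm=1000, seed=1)
print(observed > 0, p_value < 0.05)
```

Testing the summed cluster statistic against a permutation distribution of the maximum controls the family-wise error rate over all time points without a separate correction per point.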
For response-locked data, the same cluster-based randomization approach (forming clusters only in time at each source ROI; see below) was used to test for a consistent relationship across participants between mean prespeech (−1300 to 0 msec relative to speech onset) alpha and RTs. Because four source ROIs were tested separately, a Bonferroni corrected alpha level of p = .0125 was considered statistically significant for these response-locked analyses.
For source analyses, preprocessed artifact-attenuated data were re-referenced to the average of all scalp electrodes (common average reference), and the DC offset was removed from each trial. Estimates of source power in time–frequency regions exhibiting statistically reliable sensor-level relationships with picture naming RTs (and corresponding baseline periods) in the stimulus-locked data were computed using a frequency-domain adaptive spatial filtering algorithm (dynamic imaging of coherent sources [DICS]; Gross et al., 2001). With this approach, an optimized spatial filter is constructed for each specified grid location (“voxel”) based on the cross-spectral density (CSD) matrix obtained from the data. CSD matrices were obtained for the alpha (10 Hz center frequency) frequency range using a multitaper (Mitra & Pesaran, 1999) fast Fourier transform and 3-Hz frequency smoothing. CSD matrices were obtained for each of baseline picture (−1300 to −1000 msec relative to picture onset; BASE_ALPHA), prepicture (−560 to 0 msec relative to picture onset; PRE_ALPHA), and postpicture (0–300 msec relative to picture onset; POST_ALPHA) alpha individually, as well as for combined baseline picture, prepicture, and postpicture alpha (for the computation of common spatial filters).
For a single participant, an anatomical T1-weighted MRI of their brain was acquired with a magnetization-prepared, rapid-acquisition echo sequence on a 1.5-T Siemens Magnetom Sonata system. For the same participant, corresponding electrode positions relative to the scalp were recorded (Polhemus Patriot). A realistic three compartment (brain, skull, and scalp) volume conduction head model was constructed using the boundary element method (Fuchs, Kastner, Wagner, Hawes, & Ebersole, 2002) based on this participant's anatomical MRI (segmented using SPM 8). This volume conduction model and set of electrode positions were used to compute leadfields for all participants. The MRI used to construct the volume conduction model was warped to a template MRI (Montreal Neurological Institute [MNI]), and an 8-mm resolution 3-D dipole grid in MNI space was constructed. Leadfields were computed separately for every participant for each grid point in the source model.
For each participant and at each grid point corresponding to a position within the brain, the CSD matrices based on combined data from all time periods of interest in the stimulus-locked data were used in combination with the constructed leadfields to compute common spatial filters for the alpha effects identified in the sensor-level analysis. Source-level spectral power estimates were then obtained by applying these common spatial filters separately to the CSD Fourier output for each alpha (BASE_ALPHA, PRE_ALPHA, POST_ALPHA) data segment of interest. Single-trial source-level spectral power estimates were then averaged for each data segment of interest. Next, decibel power change from baseline was computed by dividing mean spectral power from the pre- and postpicture data segments by mean spectral power from the corresponding baseline data and performing a log10 transform.
To identify anatomical ROIs for further analyses, grand-averaged source-level spectral power values across participants were computed at each grid point, and positive and negative spectral peak MNI coordinates were identified in each data segment of interest (PRE_ALPHA, POST_ALPHA). Region labels were assigned using the Automated Anatomical Labeling Atlas (Tzourio-Mazoyer et al., 2002); MNI coordinates of the homologue region in the opposite hemisphere were also identified for comparison. Identified anatomical ROIs are listed in Table 1. Leadfield matrices were then computed for each participant at grid positions corresponding to these identified ROIs. These ROIs were used to construct source-level time series for both stimulus- and response-locked data.
|Anatomical Region Label|MNI Coordinates [x y z]|Peak Power Polarity|
|---|---|---|
|Left paracentral lobule|[−12 −40 72]|Positive|
|Right paracentral lobule|[12 −40 72]|Positive|
|Left superior orbitofrontal|[−18 68 −4]|Negative|
|Right superior orbitofrontal|[18 68 −4]|Negative|
|Left supplementary motor area|[−6 −24 62]|Positive|
|Right supplementary motor area|[6 −24 62]|Positive|
|Left cuneus|[−20 −88 26]|Negative|
|Right cuneus|[20 −88 26]|Negative|
To obtain source-level time series data for each participant at each identified anatomical ROI, a time-domain spatial filtering algorithm was employed (linearly constrained minimum variance [LCMV]; Van Veen, Van Drongelen, Yuchtman, & Suzuki, 1997). This algorithm used the covariance matrix obtained based on the full length (−1500 to 500 msec relative to picture onset for stimulus-locked data; −1500 to 1500 msec relative to speech onset for response-locked data) of the data from the processing stage immediately following artifact rejection, to construct separate spatial filters at each grid location of interest based on the computed leadfield matrices at those grid points. For every participant, these spatial filters were then applied to the EEG data to obtain a time series (only time series for the dominant dipole orientation at each location were used) at each anatomical ROI for each trial. This procedure was carried out separately for the stimulus- and the response-locked data. Time–frequency analyses of power were then conducted on resultant time series, employing identical parameters to those used for sensor-level analyses described above. For each source ROI, baseline normalized mean power values in time–frequency alpha regions identified as exhibiting a relationship with RTs from sensor-level analyses in the stimulus-locked data were extracted for every trial for each participant. These values were then entered into linear mixed-effects modeling analyses (Baayen, Davidson, & Bates, 2008) to investigate whether variability in the data related to item-specific characteristics (e.g., word length or frequency) may be (partially) driving the observed power–RT relationships for each source ROI.
For both prestimulus and poststimulus alpha, a linear mixed-effects model was run using R (R Core Team, 2012) and the R packages lme4 (Bates, Maechler, & Bolker, 2013) and languageR (Baayen, 2011). The mixed-effects models included alpha power for the two relevant regions (pre-stim: paracentral lobule and superior orbitofrontal; post-stim: SMA and cuneus), hemisphere, and trial number, as well as all interactions, as fixed effects. Trial number was included, as it has been shown that alpha power systematically increases over time (Benwell et al., 2019). If naming latencies also change systematically over time (e.g., a learning-related speeding up due to repetition of items), any correlation between alpha and RTs could be mediated by these (potentially) independent relationships with time on task. Including trial number as a fixed effect should account for any such epiphenomenal relationship. The models also included participant and item as random effects to account for RT variability across participants and item-specific variability (e.g., word length, frequency). Fixed effects that did not reliably contribute to model fit were dropped; models were compared using a likelihood ratio test. We present the best-fitting model only, which provides estimates, standard errors, and t values for each coefficient. Factors with an absolute t value greater than 2 were considered to significantly contribute to explaining the dependent variable (Baayen, 2008). To interpret observed interactions with trial number (time on task), we determined the slope for the influence of alpha power in a particular region on RTs while holding the value of trial number constant from 0 to 600 in steps of 100 trials (similar to binning the data for selected time periods of interest). Slopes for which the 95% confidence interval did not cross zero were considered statistically significant.
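The slope analysis used to unpack interactions with trial number can be illustrated with a short sketch. For a model containing an alpha power term and an Alpha Power × Trial Number interaction, the slope of alpha power at a fixed trial number t is b_power + b_interaction × t, and its variance is Var(b_power) + t² Var(b_interaction) + 2t Cov(b_power, b_interaction). All coefficient values below are hypothetical and chosen for illustration; they are not estimates from this study, the function name `simple_slope` is an assumption, and the normal-approximation interval (z = 1.96) is a simplification of whatever interval method the fitted model would supply.

```python
import math


def simple_slope(b_power, b_inter, var_power, var_inter, cov, trial, z=1.96):
    """Slope of alpha power on RT at a fixed trial number, with an approximate 95% CI."""
    slope = b_power + b_inter * trial
    se = math.sqrt(var_power + trial ** 2 * var_inter + 2 * trial * cov)
    return slope, slope - z * se, slope + z * se


# Hypothetical fixed-effect estimates and (co)variances, for illustration only.
B_POWER, B_INTER = -0.001, -0.003
VAR_POWER, VAR_INTER, COV = 0.002 ** 2, 0.001 ** 2, 0.0

results = []
for trial in range(0, 601, 100):  # trial number held constant at 0, 100, ..., 600
    slope, lower, upper = simple_slope(
        B_POWER, B_INTER, VAR_POWER, VAR_INTER, COV, trial
    )
    significant = lower > 0 or upper < 0  # CI does not cross zero
    results.append((trial, slope, lower, upper, significant))
    print(f"trial {trial:3d}: slope {slope:.3f} [{lower:.3f}, {upper:.3f}] "
          f"{'significant' if significant else 'n.s.'}")
```

With these illustrative values, the slope at trial 0 has an interval that crosses zero, whereas the slopes at later trial numbers do not, which is the qualitative pattern one would look for when an effect emerges with time on task.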
Prepicture and Postpicture Alpha Power Predict Picture Naming Times
The cluster-based permutation output for the stimulus-locked data revealed a statistically reliable (p = .00080) negative relationship between prepicture alpha power and subsequent RTs (Figure 2A and B). The higher alpha power was between −560 and 0 msec relative to picture onset, the faster the subsequent picture naming response. A statistically reliable (p = .00040) negative relationship between postpicture alpha power and subsequent RTs was also present (Figure 2C and D). Higher alpha power between 0 and 300 msec relative to picture onset (the entire postpicture interval analyzed) resulted in faster picture naming responses.
Using a DICS beamforming approach to reconstruct cortical sources of prepicture alpha power in the time interval exhibiting a statistically reliable relationship with RTs resulted in the selection of dipoles exhibiting peak alpha power synchronization in left and right paracentral lobule ROIs (Figure 2A) and dipoles exhibiting peak alpha power desynchronization in left and right superior orbitofrontal ROIs (Figure 2B; see Table 1). Mean alpha power between −560 and 0 msec relative to picture onset was extracted from time–frequency data based on reconstructed time courses at each of these dipole locations (LCMV beamformer) to test whether prepicture alpha power synchronization or desynchronization (or both) drives the negative power–RT correlations and to control for item variability, which was not accounted for in the cluster-based analyses. The best-fitting linear mixed-effects model showed a significant effect of Alpha Power in the paracentral lobule (β = −0.006, SE = 0.001, t = −4.74, 95% CI [−0.0079, −0.0033]), of Trial Number (β = −0.020, SE = 0.001, t = −15.60, 95% CI [−0.0230, −0.0178]), and of the interaction between Superior Orbitofrontal Alpha and Trial Number (β = −0.001, SE = 0.000, t = −2.55, 95% CI [−0.0014, −0.0002]) on RTs. Exploring the interaction, the slope of the relationship between superior orbitofrontal alpha and RTs became more negative for trials appearing later in the experiment (−0.00 [−0.00, 0.00], −0.30 [−0.07, −0.53], −0.60 [−0.13, −1.06], −0.89 [−0.20, −1.59], −1.19 [−0.27, −2.11], −1.49 [−0.33, −2.64], −1.78 [−0.40, −3.17]). All slopes were statistically significant apart from the first. Note that there was no main effect of Superior Orbitofrontal Alpha, suggesting that the observed adaptation effect (negative relationship between alpha and RTs that increases in size with time on task) in this region is likely to be epiphenomenal (finding a relationship between alpha and RTs depends on the inclusion of time on task).
To summarize, after accounting for time on task, more pronounced alpha power synchronization in paracentral lobule between 560 and 0 msec prior to picture onset resulted in faster picture naming responses. In addition, RTs became shorter over the course of the experiment, and a negative relationship between alpha in the superior orbitofrontal ROI and RTs was not present from the beginning but appeared and grew larger for trials presented later in the experiment. We thus cannot rule out that this latter relationship between superior orbitofrontal alpha and RTs may be driven by time on task (i.e., time on task increases alpha and time on task decreases RTs, perhaps leading to an epiphenomenal relationship between alpha and RTs).
The same DICS beamforming approach in the postpicture interval where alpha power exhibited a statistically reliable relationship with RTs resulted in the selection of dipoles exhibiting peak alpha power synchronization in left and right SMA ROIs (Figure 2C) and dipoles exhibiting peak alpha power desynchronization in left and right cuneus ROIs (Figure 2D; see Table 1). Mean alpha power between 0 and 300 msec relative to picture onset was extracted from time–frequency data based on reconstructed time courses at each of these dipole locations (LCMV beamformer) as for prepicture alpha. The best-fitting mixed-effects model included mean postpicture alpha power values for the SMA (β = −0.007, SE = 0.001, t = −5.89, 95% CI [−0.0093, −0.0046]), but not for the cuneus ROI. Furthermore, Trial Number was a significant predictor (β = −0.019, SE = 0.001, t = −16.22, 95% CI [−0.0213, −0.0167]). More pronounced alpha power synchronization in SMA between 0 and 300 msec after picture onset resulted in faster picture naming responses. An adaptation effect was also observed, with responses becoming faster as time on task increased.
Prespeech Alpha Power Predicts Picture Naming Times
Having identified a robust relationship between both pre- and postpicture onset alpha power and RTs for picture naming, we next investigated whether the well-established alpha desynchronization prior to and during movement execution (Pfurtscheller & Lopes da Silva, 1999; in our case, the movement in question is articulation) was predictive of RTs, and whether this desynchronization is related in any way to prepicture alpha power in the time region (−560 to 0 msec relative to picture onset) already identified as related to RTs in the stimulus-locked data. This would provide support for the interpretation that the level of prepicture alpha power in the motor system determines the degree of preparedness of the motor system. To probe this relationship, we computed response-locked alpha power (8–12 Hz) between −1500 and 1500 msec relative to speech onset for source-level time course data reconstructed in the paracentral lobule and SMA ROIs from our stimulus-locked analyses. All preprocessing, source reconstruction, and time–frequency analysis details are the same as for the stimulus-locked source analyses, and any relevant differences are noted in the respective parts of the Methods section above. To identify time regions exhibiting a consistent relationship between prespeech alpha power and RTs, the same cross-trial regression approach used for sensor-level data in the stimulus-locked analysis was employed with the response-locked source data.
The cluster-based permutation output for these response-locked data revealed a statistically reliable negative relationship between prespeech alpha power and RTs in left paracentral lobule (p = .00040) between −740 and −160 msec relative to speech onset, right paracentral lobule (p = .00040) between −880 and −180 msec relative to speech onset, left SMA (p = .0012) between −740 and −200 msec relative to speech onset, and right SMA (p = .0024) between −640 and −200 msec relative to speech onset (Figure 3). To investigate whether observed effects were robust to item variability, mean alpha power in identified time intervals (−880 to −160 msec relative to speech onset for paracentral lobule; −740 to −200 msec for SMA) was extracted from each cortical ROI and entered into linear mixed-effects modeling analyses. The model details are the same as for the stimulus-locked analyses, except that the relationship between power and RTs was only tested for the paracentral lobule and SMA ROIs, in one model.
The best-fitting model showed a statistically significant effect of Alpha Power in the paracentral lobule (β = −0.023, SE = 0.001, t = −17.56, 95% CI [−0.0255, −0.0204]), Trial Number (β = −0.018, SE = 0.001, t = −12.63, 95% CI [−0.0203, −0.0148]), and a three-way interaction between Trial Number, SMA, and Paracentral Lobule (β = 0.003, SE = 0.001, t = 2.90, 95% CI [0.0009, 0.0046]). To explore the three-way interaction, we tested the same model, replacing the three-way interaction with all the involved two-way interactions. None of these approached significance (SMA × Trial Number: β = −0.002, SE = 0.002, t = −0.917, 95% CI [−0.0050, 0.0018]; Paracentral Lobule × Trial Number: β = 0.001, SE = 0.002, t = 0.77, 95% CI [−0.0021, 0.0047]; SMA × Paracentral Lobule: β = 0.000, SE = 0.001, t = 0.28, 95% CI [−0.0017, 0.0022]). Thus, there was no clear pattern in the relationship between trial number and alpha power in the two ROIs.
To summarize, naming became faster for trials appearing later in the experiment. Importantly, the main effect of alpha power was present even after accounting for time on task: Higher alpha power in paracentral lobule between 880 and 160 msec prior to speech onset resulted in faster picture naming responses.
Onset versus Amplitude of Prespeech Alpha Desynchronization
Next, we hypothesized that the identified relationship between prespeech alpha power and RTs may be a result of the prespeech alpha desynchronization in the time interval between 880 and 160 msec prior to speech onset beginning closer in time (i.e., later within that time window) to speech onset. On this account, it is the timing of the onset of alpha desynchronization, rather than the level of alpha power per se, that determines the speed of response. When this desynchronization occurs closer in time to speech onset or, put another way, when it occurs later in the identified time window between 880 and 160 msec prior to speech onset, the result is higher power within this time window, precisely because alpha power stays higher for longer within that window. To test this hypothesis, we performed a median split of the time–frequency data from left and right paracentral lobule ROIs based on RTs and directly compared the level of alpha power for the fastest to the slowest responses using a t test (Bonferroni-corrected alpha level = .025). As predicted, shorter RTs exhibited significantly higher alpha power between 880 and 160 msec prior to speech onset than longer RTs in both the left (p = .0000058) and right (p = .0000078) paracentral lobule ROIs. In combination with Figure 3, which clearly demonstrates the differences in onset of prespeech alpha desynchronization for short versus long RTs in this time interval, this provides suggestive evidence that the later the onset of this desynchronization (the closer it occurs in time to speech onset), the faster the picture naming responses.
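The reasoning that a later desynchronization onset produces higher mean power in a fixed window can be made concrete with a toy example. The sketch below assumes an idealized step-shaped alpha trace (power stays at a high level until desynchronization onset, then drops to a low level); the power levels, the 20-msec sampling step, the two onset times, and the function name `mean_window_power` are all hypothetical choices for illustration, not values or methods taken from the data.

```python
def mean_window_power(onset_ms, start_ms=-880, stop_ms=-160, step_ms=20,
                      high=1.0, low=0.0):
    """Mean power in a fixed prespeech window for a step-shaped alpha trace.

    Power equals `high` before the desynchronization onset and `low` from the
    onset onward. Times are in msec relative to speech onset.
    """
    times = range(start_ms, stop_ms + 1, step_ms)
    trace = [high if t < onset_ms else low for t in times]
    return sum(trace) / len(trace)


# Hypothetical onsets: desynchronization begins 700 vs. 300 msec before speech.
early_onset_power = mean_window_power(-700)
late_onset_power = mean_window_power(-300)

print(f"early onset: {early_onset_power:.3f}, late onset: {late_onset_power:.3f}")
```

Even though both toy traces reach the same low level, the trace with the later onset has the higher window average, so a difference in mean power between fast and slow trials is compatible with a difference in onset timing alone.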
Motor Cortical “Readiness” Predicts Prespeech Alpha Desynchronization
Finally, we tested the hypothesis that resting motor cortical alpha power prior to picture onset is related to subsequent movement-related motor cortical alpha desynchronization prior to speech onset. This was only tested for the cortical ROIs exhibiting a relationship between prespeech alpha power and RTs, namely, left and right paracentral lobule. As in all previously described models, the random structure included participant and item intercepts. Because previous models indicated that hemisphere does not play a role, it was not included as a predictor in this analysis. Therefore, only Paracentral Lobule, Trial Number, and their interaction were included as fixed effects. All fixed effects were present in the best-fitting model: Alpha Power (β = 0.710, SE = 0.018, t = 38.71, 95% CI [0.6740, 0.7459]), Trial Number (β = 0.443, SE = 0.018, t = 24.18, 95% CI [0.4075, 0.4794]), and their interaction (β = 0.072, SE = 0.018, t = 3.91, 95% CI [0.0359, 0.1082]). The slopes of the relationship between prepicture and prespeech paracentral lobule alpha became more positive for trials appearing later in the experiment (0.71 [0.67, 0.75], 7.92 [4.30, 11.53], 15.12 [7.89, 22.35], 22.32 [11.48, 33.18], 29.54 [15.07, 44.00], 36.74 [18.67, 54.82], 43.95 [22.25, 65.64]). All slopes were statistically significant. The main effect of Power suggests that higher prepicture alpha power in left and right paracentral lobule was related to higher prespeech alpha power in these cortical ROIs, which we argued in the previous section is the result of a shorter delay between the onset of prespeech alpha desynchronization and speech onset itself. The main effect was enhanced with more time on task, suggesting that the relationship between prepicture alpha power and the temporal precision of prespeech alpha desynchronization in anticipation of articulation increased over time, possibly reflecting the adaptation effects reported previously.
We tested whether the speed of picture naming can be (partially) explained by our attentional state just prior to and during the preparation of speech. Participants named 60 familiar pictures 10 times while their rhythmic neural activity was measured prior to picture onset and prior to speech onset. The focus was on oscillatory power in the alpha frequency band (8–12 Hz) as prestimulus alpha power has been shown to relate to response speed in visual and auditory judgment tasks (Mazaheri et al., 2014; Kelly et al., 2009; Thut et al., 2006). Importantly, for the interpretation of our findings, it has been proposed that alpha plays a role in attentional gating by regulating information flow, through the inhibition of task-irrelevant brain regions (Jensen & Mazaheri, 2010; Klimesch et al., 2007). In this study, we tested whether people's momentary level of attention as indexed by alpha power in task-relevant and task-irrelevant brain regions prior to picture onset predicts subsequent naming latencies. Furthermore, we investigated how both poststimulus and prespeech alpha power were related to subsequent naming latencies in a more exploratory fashion. In the following, it is important to bear in mind the limited spatial precision of source reconstruction with EEG. Claims about localization of power changes should be interpreted with this in mind, and throughout the discussion, we try to limit claims to brain systems (e.g., motor system vs. visual system) rather than precise locations.
Alpha Power in Visual Regions Is Not Predictive of Naming Latencies
A negative relationship between naming times and alpha desynchronization was present, both before (in superior orbitofrontal regions) and after (in the cuneus) picture onset (Figure 2B and D). This suggests that lower alpha power in superior orbitofrontal regions before picture onset and in the cuneus after picture onset resulted in slower naming responses. We hypothesized precisely the reverse relationship, a positive relationship, between prepicture alpha power and naming latencies, specifically for the brain regions responsible for visual processing of the picture. These posterior brain regions are task-relevant areas during the initial stage of picture naming: First and foremost, the picture needs to be identified through visual processing and then linguistic processes such as finding the right concept and lemma can commence. Prestimulus alpha power is typically decreased for faster responding (Kelly et al., 2009; Thut et al., 2006), yet we found the opposite pattern for the superior orbitofrontal regions and cuneus.
It is possible that this unexpected relationship reflects faster naming when superior orbitofrontal regions, not directly related to the task of picture naming, are more inhibited (higher alpha). It is important to note that the relationship between superior orbitofrontal alpha and RTs exhibited adaptation, because the effect was not present as a main effect but only as an interaction with trial number. In other words, both alpha and RTs could be independently changing with time, likely reflecting an epiphenomenal relationship between the two. The relationship between postpicture alpha in the cuneus (visual system) and RTs is more difficult to explain. Both these surprising findings were, however, no longer statistically significant when accounting for item variability (e.g., word length, word frequency). This suggests that both desynchronization results were sensitive to the systematic relationship between alpha power and variability in the characteristics of individual picture-word combinations. Our data thus suggest that reduced preparatory activation of cortical regions responsible for processing the picture or for speech planning does not appear to be the locus of slower naming responses (a proxy for lapses in attention in the context of our study).
Alpha Synchrony in Motor Areas Is Predictive of Naming Latencies
We observed a negative relationship between alpha synchronization in the motor system and subsequent naming latencies, both before (in the paracentral lobule) and after (in the SMA) picture onset (Figure 2A and C). This suggests that higher alpha power in the motor system directly before and directly after picture onset resulted in faster naming responses. The final stages of word production (phonetic encoding and articulation) activate the motor and premotor cortices, specifically the precentral gyrus, SMA, and sensorimotor regions involved in mouth movement (Indefrey & Levelt, 2004). This suggests that our observed alpha synchronization in these regions, just prior to and just after picture onset, serves to inhibit the motor cortex and, as such, to inhibit an immediate motor response, allowing visual identification and early word planning processes to be carried out without interference from signals related to motor preparation or execution. Support for this interpretation comes from studies showing an alpha increase in brain regions that are not relevant for the particular task at hand (Haegens et al., 2012; Kelly et al., 2006). For a cognitively demanding task such as word production, motor cortical regions are initially task-irrelevant, and they typically only become task-relevant approximately 450 msec after picture onset. In other words, just before and just after picture onset, the motor cortex is a task-irrelevant brain region and should therefore initially be inhibited for improved task performance. This interpretation fits well with our observed relationship between prepicture motor cortical alpha and naming latencies.
Although postpicture alpha synchronization may be argued to index more deliberate inhibition of the motor system during early stages of speech production, in our study, the observed prepicture alpha synchronization is necessarily incidental. People's attentional states are known to spontaneously fluctuate over time (e.g., Kucyi, Hove, Esterman, Hutchison, & Valera, 2017) and our primary measure of interest—prestimulus alpha power—was chosen to capture precisely this phenomenon. It is certainly possible to influence the allocation of attention before a stimulus is presented (and by extension people's level of prestimulus alpha power) by changing the expectedness of the stimulus onset for instance (e.g., de Lange et al., 2013), but our study specifically aimed to investigate how participants' incidental level of attention prior to picture onset affected subsequent naming latencies. Stimulus onset in our study was therefore not predictable, and neither was the content of the stimulus (picture identity). Nevertheless, we observed a robust relationship between prestimulus alpha power in the motor system and subsequent naming latencies. We therefore speculate that the incidental level of attention determines the incidental state of the motor system prior to picture onset. This is reflected in the level of motor-cortical alpha power, which in turn influences how amenable the motor system is to initiating a motor program for articulation. Interference may occur when the motor cortex is less inhibited during early stages of speech production, before motor program execution is actually required. The precise details of this proposed relationship between sustained attention and motor-cortical readiness in the context of word production will require further testing in future studies.
Prespeech Alpha Is Also Predictive of Naming Latencies
We also analyzed alpha power in the period directly preceding speech onset and showed that the level of prespeech alpha power exhibited a negative relationship with subsequent naming latencies (Figure 3). A well-established finding is mu (alpha over sensorimotor regions) desynchronization during movement preparation and execution, with the onset of desynchronization beginning just prior to movement onset (e.g., Pfurtscheller & Lopes da Silva, 1999). The onset of desynchronization is thought to reflect the onset of movement preparation, and the timing of this onset is known to be influenced by participants' level of attention (Pineda, 2005). We thus hypothesized that the negative relationship between prespeech alpha power and naming latencies may be driven by the timing of the onset of this desynchronization relative to speech onset, which is reflected in the level of power in this period when averaging over the entire prespeech time window of interest. If this desynchronization occurs closer in time to speech onset for instance, alpha power would be higher in the prespeech time window because it stays higher for longer, whereas an earlier onset of desynchronization (a longer time prior to speech onset) would lead to lower mean alpha power in this same prespeech time window. Indeed, when short and long RTs (categorized by a median split) were compared, the onset of alpha desynchronization occurred closer in time to speech onset in fast than in slow trials (Figure 3), and this was reflected in higher alpha power in the prespeech time window.
Not only did higher prespeech movement-related alpha power lead to faster naming, but the level of prespeech alpha was also positively related to the level of prepicture inhibitory alpha power. This suggests that the level of attention prior to the onset of the picture—indexed by the level of pre-picture inhibitory alpha power—may also influence the precision of the timing of the onset of prespeech mu desynchronization. We have argued that this prespeech mu desynchronization reflects the onset of preparation for articulation. One may wonder why an earlier onset of movement preparation relative to speech onset would lead to slower (and not faster) naming. We speculate that this may be related to interference from motor cortical signals with later stages of speech production that may not yet be complete if preparation for articulation begins prematurely. On this account, when the timing of prespeech mu desynchronization is more precise (the onset is closer in time to the onset of articulation), the cascade of processing that is necessary for speech production is already closer to completion, and with minimal interference from motor-cortical signals, articulation can occur without much delay. In combination with the known relationship between attentional state and premovement mu desynchronization noted above, this relationship between the onset of prespeech mu desynchronization and the level of prepicture inhibitory alpha power in the motor system naturally suggests this processing stage as an additional candidate for how fluctuations in participants' attention prior to picture onset may influence subsequent naming latencies.
Implications and Future Directions
Our results suggest that premature speech preparation and/or execution should be minimized while the planning processes of speech production are unfolding to quickly produce a word. In this context, participants' incidental level of motor cortical inhibition as measured by alpha power over motor cortical regions prior to picture onset plays an important role. It will be important for future studies to try to identify which planning stages are most affected by a lack of motor inhibition. We have shown that alpha power over motor cortices both prior to and during planning is predictive of naming latencies, suggesting a general effect of alpha inhibition on word production. It is possible, however, that attentional lapses indexed by modulations of alpha power specifically impact certain stages of speech production more than others. This could be tested by increasing the processing load for a specific stage of speech production and investigating how this interacts with people's levels of alpha power prior to and after the onset of production planning.
Negative effects of momentary lapses of attention may be minimal in healthy speakers: Slow speech every now and then does not necessarily hamper communication. Lapses may, however, be more disruptive for individuals with attention deficit disorder or specific language impairment (SLI). Children with SLI have been shown to exhibit deficits in sustained attention compared with typically developing children. These children also tend to make more speech errors. Furthermore, children's sustained attention abilities are correlated with the speed of their word production (Jongman, Roelofs, Scheper, & Meyer, 2017). A potentially important deficit in children with SLI may therefore be a lack of consistent inhibition over their motor-cortical regions during speech production.
In this study, we have demonstrated that the speed with which people are able to overtly produce the names of pictures presented to them depends on the level of alpha power in their motor cortices both immediately before picture onset and immediately before speech onset. We have argued that this reflects the influence of people's momentary state of attention on their inhibition of motor-cortical signals related to the preparation and execution of a motor response (articulation) while the planning stages required for speech production unfold. Reduced motor-cortical alpha power is thought to lead to a reduction in this motor-cortical inhibition and result in slower picture naming responses. The precise stage of speech production planning at which this motor-cortical interference plays a role will be an important avenue for future research and has the potential to inform work on speech production deficits in clinical populations who exhibit comorbid deficits in attentional processing.
A. G. L. was supported by Gravitation Grant 024.001.006 of the Language in Interaction Consortium from Netherlands Organization for Scientific Research. A. G. L. was partly supported by grant HD 073288 from the NIH National Institute of Child Health and Human Development to Julie Van Dyke (PI). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Reprint requests should be sent to Suzanne Jongman, Max Planck Institute for Psycholinguistics, PO Box 310, 6500 AH Nijmegen, The Netherlands, or via e-mail: firstname.lastname@example.org.