Language switching in bilingual speakers requires attentional control to select the appropriate language, for example, in picture naming. Previous language-switch studies used the color of pictures to indicate the required language thereby confounding endogenous and exogenous control. To investigate endogenous language control, our language cues preceded picture stimuli by 750 msec. Cue-locked event-related potentials (ERPs) were measured while Dutch–English bilingual speakers overtly named pictures. The response language on consecutive trials could be the same (repeat trials) or different (switch trials). Naming latencies were longer on switch than on repeat trials, independent of the response language. Cue-locked ERPs showed an early posterior negativity for switch compared to repeat trials for L2 but not for L1, and a late anterior negativity for switch compared to repeat trials for both languages. The early switch–repeat effect might reflect disengaging from the nontarget native language, whereas the late switch–repeat effect reflects engaging in the target language. Implications for models of bilingual word production are discussed.
Picture yourself during rush hour at one of the main transportation hubs of Tokyo, which handles more than 2 million passengers a day. You have an appointment with your friend at Exit 12. To structure the huge amount of information that enters through our senses, we need to orient our attention toward specific attributes of events. If you know that your friend is wearing a green jacket, this will help a lot to detect him in the continuous stream of passengers. Selective attention can help us by selectively biasing the information that is necessary to guide our actions. Traditionally, researchers studied attentional orienting toward spatial locations in the visual domain (e.g., Posner & Raichle, 1994; Posner, 1980). It is widely agreed that orienting of visuospatial attention depends on a fronto-parietal cortical network (see Wright & Ward, 2008, for a review). Attentional orienting has often been studied by means of the Posner cueing paradigm in which visuospatial attention needs to be shifted (or switched) from one spatial location to the next (Posner, 1980). Typically, precues instruct participants to orient their attention to a certain spatial location in which a target stimulus will subsequently appear. However, on some trials the cue is invalid, cueing the spatial location opposite to that in which the stimulus appears. In such cases, attention needs to be disengaged from the invalidly cued spatial location and moved to the correct spatial location in which attention needs to be re-engaged. Parietal cortex has been proposed to be important for redirecting of attention (Posner, Walker, Friedrich, & Rafal, 1984, 1987). Besides orienting attention to visual spatial locations, it is also possible to orient attention toward other attributes such as color (as in the example above), time (Coull & Nobre, 1998), actions (Rushworth, Johansen-Berg, Göbel, & Devlin, 2003; Rushworth, Krams, & Passingham, 2001), and mental representations in working memory (Lepsien & Nobre, 2006).
Bilingual individuals rely on selective attention for controlling their languages. They have been described to show superior attentional performance compared to monolinguals even on nonverbal tasks of selective attention (Craik & Bialystok, 2005). In bilinguals, attention needs to be oriented to the native language (L1) or a typically later acquired second language (L2). Bilinguals need to dynamically adjust their attention while they switch from one language to the next. A common way to study attentional control in bilingual individuals is by means of the language-switching task in which participants have to alternate between their two languages (L1 and L2), which are often of unequal familiarity. In task-switch studies, reaction times (RTs) are longer for trials on which the task changes (switch) than for trials on which the task is repeated (repeat). The difference in RT between switch and repeat trials is called the switch cost and has been used to probe mechanisms of attentional control (e.g., Yeung & Monsell, 2003, for switching between tasks of unequal familiarity). Task-switch studies often make a distinction between two types of control: endogenous and exogenous control (e.g., Rogers & Monsell, 1995). The endogenous process is a top–down, intentional, voluntary process that is driven by internal goals, intentions, or expectancies. The exogenous process is a bottom–up, nonintentional, involuntary process that is triggered by an external stimulus. In an attempt to separate endogenous and exogenous control processes, task-switch studies used the cue–stimulus paradigm. The cue, which indicates the task to be performed, allows endogenous preparation to take place before the stimulus appears. The stimulus itself elicits exogenous control.
In task-switch studies, endogenous preparation is typically complete after 600 msec (Rogers & Monsell, 1995). Residual switch costs in RTs, which remain after sufficient preparation time, have been taken as a reflection of exogenous control (Rogers & Monsell, 1995), although other theoretical interpretations are possible (see Monsell, 2003, for a review). In behavioral studies, endogenous control is indexed by the difference in RT switch costs between trials that allowed for task preparation and those that did not. An advantage of cue-locked event-related potentials (ERPs) compared with RT measurement is that they provide a direct and on-line measure of endogenous control.
Recently, Logan and Bundesen (2003) and Mayr and Kliegl (2003) questioned the existence of an endogenous component of task switching. According to these authors, endogenous switch costs reflect nothing more than cue processing instead of an active process of task preparation. Logan and Bundesen manipulated cue masking in some experiments and used a double cueing procedure (i.e., two cues per task) in other studies to examine the contribution of cue processing to endogenous switch costs. In the cue-masking experiments, they observed interactions between cue repetition and cue masking. In the double cueing experiments, they observed that RTs on task-repetition trials with cue changes were similar to RTs on task-switch trials, and both were much slower than cue-repetition trials. The authors conclude that cue-encoding processes contribute much to endogenous switch costs. Mayr and Kliegl also used two different cue categories for two different tasks (henceforth, the 2:1 cue-to-task mapping procedure) and found that a considerable switch cost emerged when task sets were repeated while task cues changed. Hence, they conclude that a large part of the switch costs can be attributed to memory retrieval processes associated with the cue instead of the actual endogenous task switching. The results of both studies suggest that cue change alone could account for (almost) all of the switch costs. Monsell and Mizon (2006), however, in favor of an endogenous component of task switching, demonstrated that task-switch costs are not always reduced when cue repetition is controlled for.
To dissociate the relative contribution of cue switching and task switching to endogenous effects, Nicholson, Karayanidis, Bumak, Poboka, and Michie (2006) measured cue-locked ERPs in a dual-cue task-switching paradigm. Participants had to randomly switch between a parity and magnitude task that were associated with one dimension of two cue categories (i.e., parity task: blue and circle; magnitude task: orange and diamond). A 600-msec cue–stimulus interval allowed for optimal preparation, which should maximize effects of endogenous control. Cue category change was manipulated as an experimental factor, which resulted in a full-factorial design for cue and task switching. Thus, unlike most task-switch studies, task switches were not always associated with cue switches, and task repetitions not always with cue repetitions, but both cue switches and repetitions were associated with task switches as well as task repetitions. The results showed a significant RT task-switch cost; switch trials were slower than repeat trials, reflecting exogenous control. Nicholson, Karayanidis, Bumak, et al. (2006) observed an early (180–240 msec) ERP effect with larger N2 amplitudes for repeat compared to switch cues. They interpreted this effect to reflect cue processing within the first 300 msec after cue onset. They also replicated previous findings of task-switch ERP studies. In particular, similar to Karayanidis, Coltheart, Michie, and Murphy (2003) and Barceló, Muñoz-Céspedes, Pozo, and Rubia (2000), they observed an increase in a parietally distributed positivity in the 450–500 msec time window after cue onset for task-switch trials relative to task-repeat trials. Importantly, Nicholson, Karayanidis, Bumak, et al. (2006) did observe this effect even when cue change was controlled for. The authors conclude that the endogenous component of task switching does reflect active task-switch preparation independent of cue processing. 
Thus, contrary to what Logan and Bundesen (2003) and Mayr and Kliegl (2003) have claimed, the ERP results of Nicholson, Karayanidis, Bumak, et al., as well as the RT data of Monsell and Mizon (2006), reveal that cue change cannot fully explain endogenous switch costs.
An important question is whether an endogenous component of attentional language control also exists for bilinguals when they switch between languages. Bilingual language switching is an instance of very powerful attentional control in a naturalistic situation. Better understanding of endogenous language switching will be informative not only regarding bilingual language performance but most importantly regarding endogenous attentional control in general. Previous RT language-switch studies provide only indirect evidence for endogenous language switching. For example, Costa and Santesteban (2004) presented their participants with a precue indicating the language in which participants had to name a picture that appeared after a 500- or 800-msec cue–stimulus interval. They observed that the magnitude of the switching costs decreased as the preparation interval increased, pointing at a contribution of an endogenous control process. However, because Costa and Santesteban used RTs which reflect a composite of different processes that take place before the behavioral response (including decision-related and motor processes), no information about the time course of the endogenous control process is available. A disadvantage of this procedure is that this endogenous process could only be demonstrated indirectly.
In language-switch ERP studies carried out so far, the color of pictures usually indicated the required language. This results in a confound of endogenous and exogenous control (e.g., Christoffels, Firk, & Schiller, 2007; Jackson, Swainson, Cunnington, & Jackson, 2001). Little is known about endogenous control in language switching. In a recent study, we provided evidence from ERPs that endogenous control processes affect exogenous control processes in language switching (Verhoef, Roelofs, & Chwilla, 2009). Specifically, stimulus-locked ERPs showed an N2 effect that was modulated by preparation interval, suggesting that endogenous language control influences exogenous language control.
The goal of the present study was to test for possible ERP correlates associated with endogenous control in language switching that are unconfounded by exogenous control. Bilinguals, unlike monolinguals, encounter the problem that a concept is typically associated with at least two response alternatives, one in L1 and another in L2. If evidence could be provided for an endogenous component of language switching, this would suggest that bilinguals can bias the target language in advance. In the current study, a cue–stimulus paradigm with an interval of 750 msec was used to measure cue-locked ERPs as an index of endogenous control. To avoid confounding between cue switching and language switching, we adopted the dual-cue procedure from Nicholson, Karayanidis, Bumak, et al. (2006) with the exception that we did not manipulate cue change as an experimental factor. However, we did control for cue change in a different way, namely, by using two dimensions for each language within a single cue category (color). Cue dimension always changed within the color category independent of language switches (e.g., L1: red or yellow; L2: green or blue).
Support for endogenous control in language switching would consist of any effect of language sequence (difference between switch and repeat trials) in the cue-locked ERP waveforms. The hypothesized language sequence ERP effect might be different for L1 than for L2, indicating that endogenous control depends on language dominance. Given the novelty of this approach (this is the first study to address endogenous language control), it is unclear which ERP effects to predict. But based on the endogenous task-switch ERP literature and one language-switch ERP study, candidate ERP components that we might observe are P300 effects sensitive to task-set updating (Karayanidis et al., 2003; Barceló et al., 2000) or N2 effects related to language inhibition (Jackson et al., 2001). Based on language dominance, we would expect the P300 switch–repeat effect to be larger for L2 than for L1 because we hypothesize that language-set updating is more effortful for L2 than for L1. In this same line of reasoning, a potential N2 switch–repeat effect would also be larger for L2 than for L1. That is, from a language inhibition (task-set inertia) point of view, the dominant language needs to be suppressed more than the weaker language.
Fifteen right-handed college students (14 women, mean age = 20.9 years) participated for course credit or cash. All participants were native Dutch speakers with normal or corrected-to-normal vision, who had learned English as a second language from about the age of 10 (see Appendix A). Participants had no previous exposure to language-switching paradigms and provided written informed consent. None of the participants had any neurological or psychological impairment or had used psychoactive medication.
Materials, Procedure, and Design
Stimuli were presented at the center of a black, 15-in. computer screen set to 1024 × 768 pixel resolution, viewed at a distance of approximately 80 cm. A trial started with the 250-msec presentation of a cue, followed by a blank screen for a duration of 500 msec. Then a picture stimulus appeared on the screen for 250 msec after which the screen turned blank again for the response latency (triggering of a voice key), plus a latency jitter of 1500 to 2300 msec. If the voice key was not triggered or the participant did not respond within 3 sec, the screen was blank for the 3-sec timeout period plus the intertrial latency jitter. Then the next trial began, with the presentation of the next cue (see Figure 1). Thus, the cue–stimulus interval was 750 msec, with the intertrial interval being variable.
The cue was a red, yellow, green, or blue color patch, 150 mm wide × 85 mm high and subtending a horizontal visual angle of 10.6° and vertical angle of 6.1°. Stimuli consisted of 48 black-and-white line drawings, taken from the International Picture Naming Project database (Bates et al., 2003). Picture stimuli did not exceed an invisible square of 80 mm wide × 80 mm high and subtended a maximal visual angle of 5.7° horizontally and vertically. All picture names were Dutch–English noncognates (e.g., Dutch wortel, English carrot; see Appendix B for further description).
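The reported visual angles follow from stimulus size and viewing distance. As a rough check, the sketch below applies the standard formula, angle = 2 · arctan(size / (2 · distance)), assuming the stimulus is centred on the line of sight and the viewing distance is exactly 800 mm (the text gives it only as approximately 80 cm). The function name is illustrative and not part of the original study.

```python
import math

def visual_angle_deg(size_mm: float, distance_mm: float) -> float:
    """Full visual angle (degrees) of an object centred on the line of sight."""
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm)))

# Assumption: "approximately 80 cm" is treated as exactly 800 mm.
VIEWING_DISTANCE_MM = 800.0

# Stimulus sizes (mm) as given in the text.
for label, size_mm in [("cue width", 150.0), ("cue height", 85.0), ("picture side", 80.0)]:
    angle = visual_angle_deg(size_mm, VIEWING_DISTANCE_MM)
    print(f"{label}: {angle:.2f} deg")
```

Under these assumptions the picture side and the cue height come out at about 5.7° and 6.1°, matching the reported values. The cue width comes out near 10.7°, marginally above the reported 10.6°, a difference that a viewing distance slightly longer than 80 cm would account for.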
The colors of the cue (red, yellow, green, and blue) indicated the response language to be prepared. The assignment of the colors to the response language was counterbalanced across participants. There were two types of trial sequences: switch and repeat. In switch trials, the response language of the current trial was different from the response language of the previous trial (L1, L2 or L2, L1), whereas in repeat trials, the response language of (at least) two subsequent trials was the same (L1, L1 or L2, L2). The trials were randomized such that the proportion of L1 and L2 switch trials was equal to the proportion of L1 and L2 repeat trials. To avoid confounding cue switching with language switching, the color of the cue on two subsequent trials was always different, even on language repeat trials. Thus, the cue changed on every trial of the ongoing sequence. After signing the consent form, participants completed a bilingual proficiency questionnaire and were familiarized with the Dutch and English picture names while placement of the electrode cap took place. The rationale for this familiarization phase was to make sure that all participants correctly identified the lexical items belonging to the pictures.
Thereafter, participants were seated in front of the computer screen and were instructed to name the pictures as quickly and accurately as possible in the language indicated by the cue. They were further asked to minimize blinking until after picture naming. Naming latencies were registered with a 1-msec accurate voice key (1000 Hz). Cue–stimulus pairs were randomly presented in 20 blocks of 96 trials, with a total of 1920 trials. Each block lasted about 8 min. For every participant, a new list of pseudorandom cue–stimulus pairs was generated with the constraint that pictures were never repeated within three trials and that language repetitions occurred no more than five times in a row. In each stimulus list every picture occurred 20 times in all four conditions: Language (L1 vs. L2) by Language sequence (switch vs. repeat). Participants were offered refreshments between blocks and could decide when they were ready to go on. The total testing session for each participant, including questionnaire, instructions, picture familiarization, cap application, and breaks took approximately 2 hr.
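The sequencing constraints described above can be sketched as a simple sequential sampler. This is a minimal illustration, not the generator used in the study: the function and variable names are hypothetical, the color-to-language assignment is one arbitrary counterbalancing version, "no more than five times in a row" is read as at most five consecutive repeat trials, and "never repeated within three trials" is read as a picture not matching any of the three preceding trials. The sketch does not balance the four conditions or the number of presentations per picture, which the actual lists did.

```python
import random

# One possible colour-to-language assignment (counterbalanced across participants).
CUES = {"L1": ["red", "yellow"], "L2": ["green", "blue"]}
MAX_REPEATS_IN_A_ROW = 5  # assumed reading of the language-repetition constraint
PICTURE_LAG = 3           # assumed reading of the picture-repetition constraint

def make_block(n_trials, pictures, rng):
    """Build one block of cue-picture trials that respects the constraints."""
    trials = []
    repeats_in_a_row = 0
    for i in range(n_trials):
        if i == 0:
            # The first trial of a block is neither a switch nor a repeat.
            language, sequence = rng.choice(["L1", "L2"]), None
        else:
            prev_language = trials[-1]["language"]
            must_switch = repeats_in_a_row >= MAX_REPEATS_IN_A_ROW
            if must_switch or rng.random() < 0.5:
                language = "L2" if prev_language == "L1" else "L1"
                sequence, repeats_in_a_row = "switch", 0
            else:
                language, sequence = prev_language, "repeat"
                repeats_in_a_row += 1
        # The cue colour always differs from the previous trial, even on repeats.
        prev_cue = trials[-1]["cue"] if trials else None
        cue = rng.choice([c for c in CUES[language] if c != prev_cue])
        # Exclude pictures shown on any of the last PICTURE_LAG trials.
        recent = {t["picture"] for t in trials[-PICTURE_LAG:]}
        picture = rng.choice([p for p in pictures if p not in recent])
        trials.append({"language": language, "sequence": sequence,
                       "cue": cue, "picture": picture})
    return trials

# Hypothetical picture labels standing in for the 48 line drawings.
pictures = [f"picture_{k:02d}" for k in range(48)]
block = make_block(96, pictures, random.Random(1))
print(block[:3])
```

Because each language has two cue colors, a language repeat can always be signalled by the other color of the same language, which is what keeps cue change constant across switch and repeat trials.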
EEG was recorded from the scalp with 29 tin electrodes mounted in an elastic electrode cap. The electrodes were arranged according to the extended International 10–20 System (Jasper, 1958). All electrodes were initially referenced to the left mastoid and later off-line re-referenced to the average of the left and right mastoids. The electrooculogram (EOG) was recorded bipolarly; horizontal EOG was measured by placing electrodes on the outer canthus of each eye, vertical EOG by placing electrodes on the infraorbital and the supraorbital of the left eye. Electrode impedance was kept below 3 kΩ. Neuroscan amplifiers (Neuroscan SynAmps, Singen, Germany) were used to amplify the EEG and EOG signals. All signals were sampled at 250 Hz and filtered on-line using a 0.02 to 70 Hz band-pass filter with an 8-sec time constant.
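The 8-sec time constant and the 0.02 Hz lower band edge describe the same high-pass setting. As a quick check, the sketch below uses the relation for a first-order (RC) high-pass filter, cutoff = 1 / (2π · time constant). That the amplifier's high-pass stage behaves as a first-order filter is an assumption made here, and the function name is illustrative.

```python
import math

def highpass_cutoff_hz(time_constant_s: float) -> float:
    """-3 dB cutoff of a first-order high-pass filter with the given time constant."""
    return 1.0 / (2.0 * math.pi * time_constant_s)

cutoff_hz = highpass_cutoff_hz(8.0)  # 8-sec time constant from the text
nyquist_hz = 250.0 / 2.0             # half the 250 Hz sampling rate

print(f"high-pass cutoff: {cutoff_hz:.4f} Hz")
print(f"Nyquist frequency: {nyquist_hz:.0f} Hz (upper band edge is 70 Hz)")
```

The cutoff rounds to 0.02 Hz, and the 70 Hz upper band edge lies below the 125 Hz Nyquist frequency implied by the sampling rate.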
For each participant, naming latencies and mean EEG signals were calculated for the correct trials only. Trials that were discarded from the analyses could be classified into four categories: (A) errors in language selection (utterances that started with the inappropriate response language); (B) within-language errors (responses that differed from those designated by the experimenter in all but the response language); (C) trials that could not be classified as either switch or repeat (trials at the beginning of each block and trials following language-selection errors); and (D) recording failures and timeouts (naming latencies shorter than 300 msec or longer than 2000 msec). Only 2.4% of all trials were excluded based on the Criterion A or B, another 5.6% of all trials were excluded based on Criterion C or D. Thus, based on behavioral grounds and technical errors in recording voice-key triggers, 8% of the trials were excluded.
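The exclusion scheme can be expressed as a labelling pass over the trials of one block. The sketch below is illustrative only: the dictionary keys and the helper `trial` are hypothetical, and the order in which the criteria are checked when several apply to the same trial is an assumption, since the text does not specify it.

```python
MIN_RT_MS, MAX_RT_MS = 300, 2000  # latency limits from Criterion D

def label_trials(block):
    """Label each trial of one block as 'switch', 'repeat', or an exclusion category A-D."""
    labels = []
    previous = None
    for trial in block:
        rt = trial["rt_ms"]
        if rt is None or rt < MIN_RT_MS or rt > MAX_RT_MS:
            label = "D"  # recording failure or timeout
        elif trial["language_error"]:
            label = "A"  # response started in the wrong language
        elif trial["naming_error"]:
            label = "B"  # wrong word in the right language
        elif previous is None or previous["language_error"]:
            label = "C"  # first trial of a block, or trial after a language error
        elif trial["language"] != previous["language"]:
            label = "switch"
        else:
            label = "repeat"
        labels.append(label)
        previous = trial
    return labels

def trial(language, rt_ms, language_error=False, naming_error=False):
    """Hypothetical helper that builds one trial record."""
    return {"language": language, "rt_ms": rt_ms,
            "language_error": language_error, "naming_error": naming_error}

demo_block = [
    trial("L1", 900),                       # first trial of the block
    trial("L2", 950),                       # language changed
    trial("L2", 880, language_error=True),  # named in the wrong language
    trial("L1", 910),                       # follows a language-selection error
    trial("L1", None),                      # voice key not triggered
    trial("L1", 870),                       # same target language as before
]
demo_labels = label_trials(demo_block)
print(demo_labels)
```

Only trials labelled switch or repeat would enter the latency and ERP averages; Categories A and B feed the error analysis.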
Naming Latency and Error Analyses
The experimental design included two within-subject factors: language (L1 vs. L2) and language sequence (switch vs. repeat). Error rates (for Categories A and B) and naming latencies were submitted to repeated measures analyses of variance (ANOVAs) for subjects.
Specific analysis steps were as follows. After re-referencing to the mean of both mastoids, the EEG signal was filtered (low-pass 30 Hz) and segmented into cue-locked −200 to 750 msec epochs. The epochs were referenced to the 200 msec precue baseline. After artifact rejection, cue-locked epochs were averaged. Artifact rejection criteria were as follows: The gradient criterion was set such that voltage steps of maximally 30 μV were allowed per sampling point; the absolute difference voltage per segment should not exceed 100 μV; the lowest allowed activity was 0.5 μV (max–min) per 100 msec; and amplitudes had to be between −75 and 75 μV. Using these artifact rejection criteria, 11% of all trials had to be rejected from further analysis. Also considering exclusion of error trials, each condition included about 80% of the original number of trials for ERP analyses (switch Dutch: 79%; repeat Dutch: 81%; switch English: 80%; and repeat English: 84%).
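The four artifact criteria can be sketched as checks on a single-channel epoch. This is a simplified illustration rather than the analysis software's implementation: the function name is hypothetical, the low-activity check is applied to every sliding 100-msec stretch (an assumed reading of "per 100 msec"), and the amplitude limit is a required argument, with the demo call passing 75 μV purely for illustration. The test epochs are synthetic.

```python
import math

def passes_artifact_checks(epoch_uv, sfreq_hz, *, amp_limit_uv,
                           max_step_uv=30.0, max_range_uv=100.0,
                           min_activity_uv=0.5, activity_window_ms=100.0):
    """Return True if a single-channel epoch (microvolt samples) passes all four checks."""
    # 1. Gradient: no step between neighbouring samples larger than max_step_uv.
    if any(abs(b - a) > max_step_uv for a, b in zip(epoch_uv, epoch_uv[1:])):
        return False
    # 2. Absolute difference: max minus min over the whole epoch.
    if max(epoch_uv) - min(epoch_uv) > max_range_uv:
        return False
    # 3. Amplitude limits: every sample within +/- amp_limit_uv.
    if any(abs(v) > amp_limit_uv for v in epoch_uv):
        return False
    # 4. Low activity: each 100-msec stretch must span at least min_activity_uv.
    win = max(2, round(activity_window_ms * sfreq_hz / 1000.0))
    for start in range(len(epoch_uv) - win + 1):
        segment = epoch_uv[start:start + win]
        if max(segment) - min(segment) < min_activity_uv:
            return False
    return True

SFREQ_HZ = 250.0
N_SAMPLES = 238  # roughly -200 to 750 msec at 250 Hz

# Synthetic epochs: a clean 10 Hz oscillation, a flat line, and a spike.
clean = [10.0 * math.sin(2.0 * math.pi * 10.0 * i / SFREQ_HZ) for i in range(N_SAMPLES)]
flat = [0.0] * N_SAMPLES
spiky = list(clean)
spiky[120] += 60.0

for name, epoch in [("clean", clean), ("flat", flat), ("spiky", spiky)]:
    print(name, passes_artifact_checks(epoch, SFREQ_HZ, amp_limit_uv=75.0))
```

In a multichannel recording, an epoch would typically be rejected if any channel fails any check; that rule is likewise an assumption here.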
The window for quantifying ERP effects was based on visual inspection of the waveforms and corresponded to the time windows in which maximal differences between conditions occurred. This resulted in two time windows, an early window from 200 to 350 msec and a late time window from 350 to 500 msec. If an indication of a Language sequence × Language interaction was present in the overall omnibus analysis, the mean amplitudes for the different conditions of language sequence were entered into separate ANOVAs for L1 and L2. These ANOVAs were carried out with repeated measures on the experimental factor language sequence and electrode site (29 levels). The scalp distribution of possible ERP correlates of endogenous language control was explored in two separate analyses: midline analysis and quadrant analysis. The midline analysis included two levels of region of interest that each consisted of three electrode sites (ROI, anterior: Fpz, Fz, and FCz, and posterior: CPz, Pz, and Oz). Each quadrant (left anterior, left posterior, right anterior, and right posterior) consisted of five electrodes (see Figure 2, left). The quadrant analysis included two factors: the factor hemisphere (2 levels: left, right) and the factor ROI (2 levels: anterior–posterior). Only if the analyses yielded interactions between the experimental factor and the factors hemisphere and ROI, were supplementary ANOVAs performed for the different quadrants. When in the quadrant analysis there was an interaction between the experimental factor and only one of the factors hemisphere and ROI, in the follow-up analyses the data were collapsed over two quadrants (hemisphere collapsed over anterior–posterior or ROI collapsed over left–right). Due to the exploratory nature of this study, the topography of potential ERP effects was further investigated by follow-up single electrode sites analyses, when interactions with the factor electrodes allowed us to do so. 
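The dependent measure in these analyses is the mean amplitude within a time window of a baseline-corrected, cue-locked epoch. A minimal sketch of that computation follows, assuming a 250 Hz sampling rate and an epoch that starts 200 msec before the cue; the function names are illustrative and the epoch is synthetic.

```python
def window_mean(epoch_uv, sfreq_hz, epoch_start_ms, win_start_ms, win_end_ms):
    """Mean amplitude of the samples between win_start_ms and win_end_ms."""
    first = round((win_start_ms - epoch_start_ms) * sfreq_hz / 1000.0)
    last = round((win_end_ms - epoch_start_ms) * sfreq_hz / 1000.0)
    samples = epoch_uv[first:last]
    return sum(samples) / len(samples)

def baseline_correct(epoch_uv, sfreq_hz, epoch_start_ms):
    """Subtract the mean of the precue interval (epoch start to 0 msec)."""
    baseline = window_mean(epoch_uv, sfreq_hz, epoch_start_ms, epoch_start_ms, 0.0)
    return [v - baseline for v in epoch_uv]

SFREQ_HZ = 250.0
EPOCH_START_MS = -200.0

# Synthetic epoch: 5.0 uV during the 50 baseline samples, 4.6 uV afterwards.
raw_epoch = [5.0] * 50 + [4.6] * 188
epoch = baseline_correct(raw_epoch, SFREQ_HZ, EPOCH_START_MS)

early = window_mean(epoch, SFREQ_HZ, EPOCH_START_MS, 200.0, 350.0)
late = window_mean(epoch, SFREQ_HZ, EPOCH_START_MS, 350.0, 500.0)
print(f"early window: {early:.2f} uV, late window: {late:.2f} uV")
```

Applied to the averaged waveform of each participant, condition, and electrode, values like these would form the cells entered into the repeated measures ANOVAs.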
When appropriate, the estimated Greenhouse–Geisser coefficient ɛ was used to correct for violations of the sphericity assumption (Geisser & Greenhouse, 1958). All reported p values are based on corrected degrees of freedom, but to aid the reader in interpreting our statistical design, the stated degrees of freedom are uncorrected.
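For readers unfamiliar with the correction, the sketch below computes the Greenhouse–Geisser estimate from a subjects × conditions table using the standard formula based on the double-centered covariance matrix, ε = tr(S*)² / ((k − 1) · Σ S*ᵢⱼ²). It is an illustration in plain Python with hypothetical data, not the routine used for the reported analyses. For a factor with two levels ε is necessarily 1, so in this design the correction matters mainly for the 29-level electrode-site factor, where the corrected degrees of freedom are ε · 28 and ε · 392.

```python
def gg_epsilon(data):
    """Greenhouse-Geisser epsilon for a list of subjects, each a list of k condition values."""
    n, k = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]
    # Sample covariance matrix of the k conditions across subjects.
    cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
            for j in range(k)] for i in range(k)]
    # Double-centre the covariance matrix (remove row, column, and grand means).
    row_means = [sum(cov[i]) / k for i in range(k)]
    grand = sum(row_means) / k
    centred = [[cov[i][j] - row_means[i] - row_means[j] + grand for j in range(k)]
               for i in range(k)]
    trace = sum(centred[i][i] for i in range(k))
    sum_sq = sum(v * v for row in centred for v in row)
    return trace ** 2 / ((k - 1) * sum_sq)

# Hypothetical data: 5 subjects x 3 conditions.
demo = [[1.0, 2.0, 3.0], [2.0, 2.0, 5.0], [3.0, 5.0, 4.0], [4.0, 4.0, 8.0], [5.0, 7.0, 6.0]]
eps = gg_epsilon(demo)
k, n = 3, 5
print(f"epsilon = {eps:.3f}")
print(f"corrected df = ({eps * (k - 1):.2f}, {eps * (k - 1) * (n - 1):.2f})")
```

The estimate is bounded between 1 / (k − 1) and 1; values near 1 indicate that the sphericity assumption is approximately met.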
Naming Latency Results
Mean naming latencies and error rates are shown in Figure 3. Naming latencies were longer for L1 (965 msec) than L2 trials (906 msec) and for switch (962 msec) than repeat trials (909 msec). The main effects of language [F(1, 14) = 300.84, p < .001] and language sequence [F(1, 14) = 36.58, p < .001] were significant. No interaction was found between language and language sequence (F < 1). That is, the magnitude of the switch costs for L1 (49 msec) and L2 (57 msec) was similar.
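The switch cost is the naming latency on switch trials minus that on repeat trials. The short sketch below, using only the means reported above, shows how the overall switch cost relates to the language-specific costs; the function name is illustrative.

```python
def switch_cost(rt_switch_ms, rt_repeat_ms):
    """Switch cost in msec: latency on switch trials minus latency on repeat trials."""
    return rt_switch_ms - rt_repeat_ms

# Marginal means and language-specific switch costs reported in the text (msec).
overall_cost = switch_cost(962, 909)
cost_l1, cost_l2 = 49, 57

mean_of_costs = (cost_l1 + cost_l2) / 2
asymmetry = cost_l1 - cost_l2  # positive values would indicate a larger cost for L1

print(overall_cost, mean_of_costs, asymmetry)
```

The overall cost of 53 msec equals the average of the two language-specific costs, as expected in a balanced design, and the 8-msec difference between them is what the nonsignificant Language × Language sequence interaction tests.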
The overall error rate was low, 2.4% on average (see Figure 3). The number of real errors (within- and between-language errors) was greater for L1 (3.2%) than for L2 (1.7%) [F(1, 14) = 10.88, p = .005]. In addition, switch trials (3.6%) led to more errors than repeat trials (1.3%) [F(1, 14) = 33.24, p < .001]. Thus, there was no indication of a speed–accuracy tradeoff. For the error rates, the Language × Language sequence interaction was not significant [F(1, 14) = 2.49, p = .137].
In Figure 4, grand-average ERP waveforms for the two levels of language sequence (switch–repeat) are displayed separately for the first language (L1) and for the second language (L2). Waveforms are time-locked to cue onset and are presented for the five midline electrode sites: Fz, Cz, CPz, Pz, and Oz. All conditions elicited an early ERP response characteristic of visual stimuli. That is, an N1 followed by a P2, which at occipital sites was preceded by a P1 component.
Visual inspection of the data suggests that the amplitudes of the L2 waveforms are more negative for switch trials than for repeat trials early in time (Figure 4). An increase in amplitude for switch compared to repeat trials started at posterior sites at about 200 to 350 msec after cue onset and extended toward anterior sites between 350 and 500 msec. In contrast, L1 waveforms for switch and repeat trials did not seem to diverge until 350 msec after cue onset. Also for L1, overall mean amplitudes elicited by switch trials were more negative than those for repeat trials, with a maximum at anterior electrode sites.
Early ERP Effect (200–350 msec)
No main effects of language and language sequence were present in the omnibus [language: F(1, 14) = 1.53, p = .237; language sequence: F(1, 14) = 1.76, p = .206], the quadrant [language: F(1, 14) = 1.42, p = .254; language sequence: F(1, 14) = 1.33, p = .269], or midline analysis [language: F(1, 14) = 2.21, p = .160; language sequence: F(1, 14) = 1.69, p = .214]. However, the omnibus analysis yielded a three-way Language × Language sequence × Site interaction [F(28, 392) = 4.78, p = .010]. Moreover, both quadrant and midline analyses showed a Language × Language sequence × ROI (anterior–posterior) interaction [quadrant: F(1, 14) = 7.30, p = .017; midline: F(1, 14) = 5.86, p = .030]. These interactions suggest that at least for some electrode sites there are differences in language sequence effects between the languages. Therefore, language sequence effects were explored for the two languages separately.
L1 language sequence effect
A three-way Language sequence × ROI (anterior–posterior) × Hemisphere interaction was present in the quadrant analysis [F(1, 14) = 5.42, p = .035]. However, analysis for the different quadrants did not disclose a language sequence effect [left anterior: F(1, 14) = 0.45, p = .514; right anterior: F(1, 14) = 1.344, p = .266; left posterior: F(1, 14) = 0.01, p = .929; right posterior: F(1, 14) = 0.06, p = .813]. In conclusion, the results did not support the presence of a language sequence effect for L1 in the early (200–350 msec) time window (see also Figures 4, 5 and 6).
L2 language sequence effect
For L2 trials, the main effect of language sequence was not significant in the omnibus [F(1, 14) = 1.76, p = .206], the quadrant [F(1, 14) = 1.23, p = .286], or the midline analysis [F(1, 14) = 1.66, p = .219]. However, there was a significant Language sequence × Site interaction in the omnibus analysis [F(28, 392) = 5.87, p = .003]. Moreover, both the quadrant and the midline analyses showed a Language sequence × ROI (anterior vs. posterior) interaction [F(1, 14) = 7.23, p = .018 and F(1, 14) = 7.07, p = .019, respectively]. No interaction with the factor hemisphere in the quadrant analysis was obtained (Fs < 1).
Follow-up analyses were performed for the quadrant and midline analysis for the two levels of ROI (anterior–posterior). For the anterior ROI, an effect of language sequence was found neither for the quadrant nor for the midline analyses (Fs < 1). In contrast, for the posterior ROI, a significant effect of language sequence was present in the quadrant [F(1, 14) = 14.91, p = .002] as well as in the midline analyses [midline: F(1, 14) = 11.53, p = .004]. Amplitudes of switch trials are about 0.4 μV more negative than those of repeat trials. To further determine the scalp distribution of the language sequence effect for L2, analyses were carried out for all electrodes. These supplementary single-site analyses revealed that a reliable effect of language sequence was present at central sites (Cz, CPz, C4, CP4) and posterior sites (P3, O1, Pz, Oz, P4, O2, TP8, and P8).
Late ERP Effect (350–500 msec)
No main effect of language was observed (omnibus, quadrant, midline, all F values < 1). Switch trials were 0.4 μV more negative than repeat trials. The main effect of language sequence was significant in the omnibus and quadrant analyses [F(1, 14) = 5.61, p = .033 and F(1, 14) = 5.52, p = .034, respectively] and approached significance in the midline analysis [F(1, 14) = 4.50, p = .052]. None of the analyses gave rise to an interaction between language and language sequence (F < 1). The absence of the latter interaction indicates that the language sequence effect was similar for both languages. In contrast, in the omnibus analysis, the language sequence effect did interact with electrode site [F(28, 392) = 8.92, p < .001]. In the quadrant and midline analysis, the factor language sequence interacted with the factor ROI (anterior–posterior) [quadrant: F(1, 14) = 11.27, p = .005; midline: F(1, 14) = 21.22, p < .001]. This suggests that the effect was not broadly distributed over the scalp. No other interactions were observed (ps > .05).
Overall, these results indicate that in the late 350–500 msec ERP window there is an effect of language sequence for both languages. Based on the interaction of language sequence with ROI (in the quadrant and midline analyses) and with electrodes (in the omnibus analysis), follow-up analyses were performed to determine the topography of the language sequence ERP effect.
For the posterior electrodes, the effect of language sequence was significant neither in the quadrant analysis [F(1, 14) = 2.82, p = .115] nor in the midline analysis (F < 1). In contrast, for the anterior sites, the quadrant analysis and the midline analysis both revealed a main effect of language sequence [quadrant: F(1, 14) = 7.39, p = .017; midline: F(1, 14) = 8.74, p = .010]. The absence of an interaction between language and language sequence [F(1, 14) = 1.17, p = .298], as well as the absence of an interaction with the factor hemisphere [F(1, 14) = 2.57, p = .132], indicates that the language sequence effect is similar not only for both languages but also for the left and right anterior quadrants. To further determine the scalp distribution of the language sequence effect, analyses were carried out for all electrodes, collapsed over language. These supplementary single-site analyses revealed a reliable effect of language sequence for frontal (F3, FC3, FT7, Fpz, Fz, FCz, Fp2, F4, FC4, F8, FT8) and central (posterior) sites (C3, CP3, TP7, Cz, C4, CP4, TP8).
Early and Late L2 Effects of Language Sequence
The scalp distributions of the L2 language sequence effect seem to differ for the early and the late time windows (200–350 msec and 350–500 msec, respectively). As Figure 5 shows, the early language sequence effect for L2 was distributed over posterior scalp sites, whereas the late effect was maximal at anterior electrode sites. To check for reliable differences in scalp topography, the early and late L2 language sequence effects were analyzed together in one ANOVA, using time window (early–late) as an additional factor. Because of the absence of an early effect for L1, this analysis was only performed for L2. Because there is controversy in the literature over whether or not the data need to be z-transformed in order to qualify differences in scalp topography, we performed analyses both on normalized and nonnormalized data (Urbach & Kutas, 2002; McCarthy & Wood, 1985).
Omnibus analyses showed an interaction between language sequence, electrode site, and time window [normalized: F(28, 392) = 4.47, p = .008; nonnormalized: F(28, 392) = 9.24, p < .001]. Moreover, both quadrant and midline analyses yielded a significant Language sequence × ROI (anterior–posterior) × Time window interaction for the normalized data [quadrant: F(1, 14) = 8.88, p = .010; midline: F(1, 14) = 11.45, p = .004] as well as for the nonnormalized data [quadrant: F(1, 14) = 13.47, p = .003; midline: F(1, 14) = 20.94, p < .001]. In sum, the results for both the nonnormalized and the normalized data bolster the claim that the L2 language sequence effects indeed differ in scalp distribution, with a posterior distribution in the early window and an anterior distribution in the late window.
Our behavioral (RT) results are consistent with previous results in the language-switch literature (e.g., Christoffels et al., 2007; Kroll, Bobb, & Wodniecka, 2006; Costa & Santesteban, 2004; Jackson et al., 2001) in that we replicated the paradoxical language effect (i.e., naming latencies were longer for the first than for the second language). The paradoxical language effect is uniquely observed in a mixed-language context and has been described as the result of an experiment-wide bias for L2 (Kroll et al., 2006). Another typical finding when bilinguals switch between languages is that of asymmetrical switch costs (Meuter & Allport, 1999), that is, larger switch costs in RTs for L1 than for L2. Relevant for the present discussion is that switch costs have also been described as symmetrical (equal for both languages). Specifically, Costa and Santesteban (2004) reported symmetrical switch costs for balanced bilinguals. Recently, symmetrical switch costs have also been found for unbalanced bilinguals (Verhoef et al., 2009; Verhoef, 2008), namely, when unbalanced bilinguals have enough time to prepare for a language switch. In the current study, we used a long preparation interval of 750 msec to allow optimal endogenous control, and again, under these circumstances, symmetrical switch costs in RTs were observed for unbalanced bilinguals (for details, see Verhoef et al., 2009). The fact that switch costs in RTs remain after optimal endogenous preparation demonstrates that these residual switch costs originate from exogenous language processes (Verhoef et al., 2009) and can be independent of endogenous language control. The latter type of control was examined in the present study using ERPs.
The main goal of the present study was to test for the existence of an endogenous control component in language switching. To this end, we used the ERP method to track potential language control operations in real time. The demonstration of an endogenous component of language switching is important in that it would imply that bilinguals can bias their response language in advance of the stimulus. In an RT language-switch study, Costa and Santesteban (2004) showed that endogenous preparation resulted in smaller switch costs, providing indirect evidence for an endogenous component of language-switch costs. Almost all previous language-switch ERP studies confounded endogenous and exogenous language switching by using stimulus-integrated language cues (e.g., Christoffels et al., 2007; Jackson et al., 2001).
In the task-switch literature, two independent research groups have claimed that endogenous control does not exist or cannot be measured using the cue–stimulus paradigm (Logan & Bundesen, 2003; Mayr & Kliegl, 2003). Using a double cueing procedure, these research groups found that cue switches can (almost) completely account for endogenous switch costs. Thus, one has to control for cue switches when studying endogenous control to avoid a confound of cue switching. In the current study, this was accomplished by using a 2:1 cue-to-language mapping (e.g., L1: blue or green; L2: yellow or red) for the precues to indicate the response language in advance of stimulus presentation. Here we measured cue-locked ERPs to tap into the endogenous language control processes on-line, independent of a contamination with exogenous language switching.
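The logic of the 2:1 cue-to-language mapping can be sketched in a few lines. The example below builds a hypothetical trial list in which the cue color changes on every trial, so that a language repeat is never also a cue repeat. The color assignment follows the example given above; the function names, the random ordering, and the rule that the cue changes on every trial are assumptions made for illustration, not a description of the actual experimental software.

```python
import random

# Hypothetical 2:1 cue-to-language mapping (colors follow the example in the text)
CUES = {"L1": ["blue", "green"], "L2": ["yellow", "red"]}

def make_trials(n_trials, seed=0):
    """Return (language, cue) pairs in which the cue changes on every trial."""
    rng = random.Random(seed)
    trials = []
    prev_cue = None
    for _ in range(n_trials):
        language = rng.choice(["L1", "L2"])
        # Assumption: exclude the previous cue so that a language repeat
        # is never accompanied by a cue repeat.
        options = [cue for cue in CUES[language] if cue != prev_cue]
        cue = rng.choice(options)
        trials.append((language, cue))
        prev_cue = cue
    return trials

def label_sequence(trials):
    """Label every trial after the first as a language 'switch' or 'repeat'."""
    return ["switch" if cur[0] != prev[0] else "repeat"
            for prev, cur in zip(trials, trials[1:])]

trials = make_trials(200)
labels = label_sequence(trials)
```

Because the cue differs from its predecessor on switch and repeat trials alike, a switch versus repeat contrast in such a list is not confounded with cue repetition. Note that this sketch draws the language at random on each trial and therefore only approximates an equal proportion of switch and repeat trials.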
The main finding of this article is that we succeeded in identifying two distinct ERP effects related to endogenous language control. First, an early ERP effect consisted of an increase in a switch-related negativity over posterior sites between 200 and 350 msec. Specifically, mean amplitude was more negative for L2 switch trials than for L2 repeat trials. This effect occurred for L2, but interestingly not for L1. Second, a later anteriorly distributed negativity (mean amplitude was more negative for switch trials than for repeat trials) occurred in the 350–500 msec time window both for the native language and for the second language. Hence, the present ERP data support the existence of two distinct processes for endogenous control in language switching.
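For readers less familiar with mean-amplitude measures, the switch versus repeat comparison in the two time windows can be sketched as follows. The waveforms, the sampling interval, the half-open window boundaries, and the function names are all hypothetical and are chosen only to make the arithmetic easy to follow; they are not the recorded data or the analysis pipeline.

```python
def mean_amplitude(waveform, times_ms, start_ms, end_ms):
    """Average the samples whose latency lies in [start_ms, end_ms)."""
    samples = [v for v, t in zip(waveform, times_ms) if start_ms <= t < end_ms]
    return sum(samples) / len(samples)

def switch_effect(switch_wave, repeat_wave, times_ms, window):
    """Switch-minus-repeat mean amplitude; negative means switch is more negative."""
    start_ms, end_ms = window
    return (mean_amplitude(switch_wave, times_ms, start_ms, end_ms)
            - mean_amplitude(repeat_wave, times_ms, start_ms, end_ms))

EARLY = (200, 350)  # early cue-locked window in msec
LATE = (350, 500)   # late cue-locked window in msec

# Hypothetical cue-locked waveforms (microvolts), one sample every 50 msec
times_ms = list(range(0, 550, 50))
repeat_wave = [0.0] * len(times_ms)
switch_wave = [0.0, 0.0, 0.0, 0.0, -1.0, -1.0, -1.0, -2.0, -2.0, -2.0, 0.0]

early_effect = switch_effect(switch_wave, repeat_wave, times_ms, EARLY)
late_effect = switch_effect(switch_wave, repeat_wave, times_ms, LATE)
```

In an actual analysis, such difference scores would be computed per participant, condition, and electrode (or region of interest) before being entered into the ANOVAs.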
For task switching, an endogenous component of switching has previously been described by Nicholson, Karayanidis, Bumak, et al. (2006). A posterior positivity for switch compared to repeat trials was observed between 450 and 500 msec (P300), under similar conditions as in the present study, namely, when task switches were not confounded by cue switches. It is not clear why a P300 switch–repeat effect was not obtained in the present study. However, there are at least three differences between task-switch studies and language-switch studies that could account for the absence of a P300 effect in the present study.
First, in the present language-switch study, participants had to switch languages on 50% of the trials and repeat languages on the other 50% of the trials. In contrast, in task-switch studies, more repeat than switch trials are presented. P300 amplitude in oddball paradigms systematically varies with stimulus probability with larger P300 amplitudes for low probability events. Therefore, the presence of a P300 in task-switch studies versus absence of a P300 in the present language-switch study could be explained in terms of the sensitivity of P300 to stimulus probability (but see, for a different view, Barceló, Escera, Corral, & Periáñez, 2006; Kiefaber & Hetrick, 2005).
Second, endogenous attentional control in task and language switching differs in terms of stimulus–response mappings. That is, in task-switch studies, participants always need to use arbitrary responses in reaction to the stimuli (e.g., press left when a number is odd), whereas our naming study requires bilinguals to naturally name pictures (e.g., say “tree” when you see a picture of a tree).
Third, task-switch studies usually map more than one stimulus and even more than one task to the same response (e.g., press left when a number is odd or larger than 5). In bilingual picture naming, each response is uniquely assigned to just one stimulus and, most importantly, to just one language. Barceló et al. (2006) and Barceló, Periáñez, and Knight (2002) have reported larger P300 amplitudes for switch than repeat trials in task switching even when task novelty was controlled for. They claimed the functional significance of the P300 to be updating of task set information in working memory (including updating and maintaining competing stimulus–response mappings). If the P300 is related to updating of stimulus–response rules, it is not surprising that it is absent in language-switch studies in which stimulus–response rules are stored in the language system instead of being held active in working memory as is the case for arbitrary stimulus–response rules for task switching.
Attention-related anterior and posterior N2 effects have been observed in the visual modality (e.g., Folstein & van Petten, 2008). However, the anterior negativities observed in the present study do not resemble the visual anterior N2 effect (Folstein & van Petten, 2008) in terms of timing or scalp distribution. The standard N2 effect is a frontal peak, whereas the late anterior effect in the present study is a sustained negativity rather than a peak like the N2. In terms of distribution, the early posterior effect in the present study corresponds to the posterior N2 that has been reported to be sensitive to the target status of visual stimuli (Folstein & van Petten, 2008), suggesting that L2 switch cues are perceived as being more task-relevant for bilinguals. According to Folstein and van Petten (2008), the anterior N2 should be functionally dissociated from the posterior N2 to which our early posterior effect corresponds. Therefore, we conclude that the effects observed in the present study also do not resemble the anterior N2 effect as previously described by Jackson et al. (2001) for language switching. However, this language inhibition effect was observed by Jackson et al. in response to the stimulus, and according to Green (1998), language inhibition takes place at the word level. Therefore, it is not surprising that the cue-locked ERPs in our experiment did not show the N2 switch–repeat language inhibition effect. That our findings are more in agreement with endogenous attentional control than with language-set inhibition is discussed below.
Nicholson, Karayanidis, Davies, and Michie (2006) separated subcomponents of endogenous control over time. To differentiate effects of switching to the current task set and switching away from the previous task set in cue-locked ERPs, they used three tasks in combination with two cue types. “Switch-away” cues instructed participants to switch away from the previous task set and “switch-to” cues instructed participants to switch to the upcoming task set. Nicholson, Karayanidis, Davies, et al. (2006) observed a posterior positivity for switch compared to repeat cue-locked ERP waveforms. For “switch-to” cues, this effect was larger and more prolonged than for “switch-away” cues. From these results, the authors concluded that on “switch-to” cues, a new task set was implemented before stimulus onset. In conclusion, Nicholson, Karayanidis, Davies, et al. (2006) have shown that endogenous task switching is a dual process consisting of switching away from the previously relevant task set and switching to the upcoming relevant task set.
The ERP patterns observed in the present study, in particular, the early posterior effect for L2 in combination with the late anterior effect for both languages, support the existence of a dual process of language switching. When switching to L2, bilinguals need to disengage from their native language to be able to name in L2. In other words, bilinguals need to orient their attention away from their native language (disengaging attention) and orient their attention toward their second language (engaging attention). We propose that what happens in language switching is comparable to attentional orienting between spatial locations in the visual domain. Posner and Raichle (1994) described a dual process for shifting attention: A parietal network is involved in switching away or disengaging from a previously relevant spatial location, whereas a frontal brain network is involved in switching to or engaging (e.g., updating goals in working memory) in a currently relevant location. Although Posner mainly used tasks in which participants had to disengage and engage their attention spatially, there is recent evidence that the same brain areas are involved in disengaging and engaging attention toward other attributes as well, such as color, time (Coull & Nobre, 1998), actions (Rushworth et al., 2001, 2003), and mental representations in working memory (Lepsien & Nobre, 2006). The early parietal switch–repeat effect for L2, as reported in this article, might reflect disengaging from the nontarget native language, whereas the late frontal switch–repeat effect might reflect engaging in the target second language. Similarly, in switching to the native L1, L2 needs to be disengaged and L1 has to be engaged. Presumably, in unbalanced bilinguals, disengaging the weaker L2 is easier than disengaging the stronger L1. 
This may explain the difference in early posterior negativity between L1 and L2: An effect for L2 switch trials (requiring disengagement of the stronger L1) but not for L1 switch trials (requiring disengagement of the weaker L2). The different topographies of the early and late ERP correlates of endogenous control processes show that different neuronal assemblies contribute to the effects. Therefore, the present ERP data are consistent with a dual-process account of language control. Future studies using functional magnetic resonance imaging may provide important information about the neural sources that give rise to the early ERP effect and the late ERP effect reported in this article.
Summary and Conclusions
In this study, we investigated the question of whether bilinguals can orient their selective attention toward the target language in advance of a language switch, thereby biasing naming performance. Bilinguals participated in an overt picture naming task in which the designated language was indicated by a precue (750 msec before stimulus onset). A 2:1 cue-to-language mapping paradigm was used to avoid confounding of cue switching and language switching. Cue-locked ERPs were measured to tap into endogenous language switching on-line, independent of exogenous language switching. An early parietal switch-related negativity was found for L2, but not for L1. A later anterior negativity (switch more negative than repeat) was found for both languages. We take these results to indicate that language switching may rely on the same fronto-parietal attention network that has been proposed to be relevant for switching between spatial locations in the visual domain.
APPENDIX A: SELF-ASSESSED PROFICIENCY FOR PARTICIPANTS OF THIS STUDY
This appendix describes self-assessed proficiency scores, L2 use, and age of onset information for all participants. A self-rating questionnaire was used to obtain proficiency scores. Participants indicated how good their English (L2) skills (reading, writing, listening, and speaking) were compared to their Dutch (L1) skills. The scores are on a 5-point scale, in which 1 represents that L2 skills were just as good as L1 skills and 5 represents that L2 skills were much worse than L1 skills. On average, participants rated their proficiency for L2 compared to L1 as 2.57 (SD = 0.93). Scores for L2 use were also measured on a 5-point scale, where 1 represents less than 1 hour per week and 5 represents more than 10 hours per week. Participants’ average L2 use score was 1.68 (SD = 1.15). Age of onset refers to the age at which participants started learning the L2; the mean age of onset was 10.63 years (SD = 1.22).
APPENDIX B: MATERIALS
Belgian Dutch as well as American English picture naming norms (Severens, Van Lommel, Ratinckx, & Hartsuiker, 2005; Bates et al., 2003) were used to select pictures with high name agreement in both languages (total mean = 94.2%). Dutch and English picture names were matched as closely as possible on word length (means: L1 = 5.79, L2 = 5.42), number of syllables, number of phonemes, and lemma log frequencies (means: L1 = 1.4, L2 = 1.5) that were obtained from the written sources of the CELEX database.
|Dutch (L1) Name||English (L2) Name|
Reprint requests should be sent to Kim M. W. Verhoef, Department of Anatomy and Neuroscience, Vrije University Medical Centre, PO Box 7057, 1007 MB Amsterdam, The Netherlands, or via e-mail: firstname.lastname@example.org.
To examine possible differences in ERP pattern over time, we carried out supplementary analyses for the first and the second half of the experiment. These analyses indicated that there were no changes in ERP effects over time (all Fs < 1).