Using event-related potentials (ERPs), we investigated the neural response associated with preparing to switch from one task to another. We used a cued task-switching paradigm in which the interval between the cue and the imperative stimulus was varied. The difference in response time (RT) between trials on which the task switched and trials on which the task repeated (switch cost) decreased as the interval between cue and target (CTI) was increased, demonstrating that subjects used the CTI to prepare for the forthcoming task. However, RTs on repeated-task trials in blocks during which the task could switch (mixed-task blocks) were never as short as RTs during single-task blocks (mixing cost). This replicates previous research. The ERPs in response to the cue were compared across three conditions: single-task trials, switch trials, and repeat trials. ERP topographic differences were found between single-task trials and mixed-task (switch and repeat) trials at ∼160 and ∼310 msec after the cue, indicative of changes in the underlying neural generator configuration as a basis for the mixing cost. In contrast, there were no topographic differences evident between switch and repeat trials during the CTI. Rather, the response of statistically indistinguishable generator configurations was stronger at ∼310 msec on switch than on repeat trials. By separating differences in ERP topography from differences in response strength, these results suggest that a reappraisal of previous research is appropriate.
Imagine that I have asked you to quickly read over this article, before I send it out for review, and you come across a misspelled wrod. You would probably mark it, so that I could fix the error, and in so doing, you would have switched from reading to writing, based on a cue in the environment (the misspelled word). The ability of the cognitive system to switch from one action sequence to another, based on environmental contingencies, has been the source of considerable research, and the mechanisms that underlie this ability have been the source of substantial controversy. This issue has interested researchers because understanding the ability to instantiate new action sequences (based on the current context) is one way to investigate how the brain is able to control itself.
The ability of the brain to regulate (control) itself is important given that our environment is often unpredictable, and survival depends on flexibly altering behavior based on changing environmental contingencies. However, the environment is not uniformly unpredictable. This point is clarified by extending the above example. If I asked you to proofread a paper, you would probably read it more slowly than if I asked you to simply “read over” the same paper. In both cases, you would mark any misspellings you found, but in the former case, you would proceed more “carefully” because when proofreading, one expects errors to be frequent, and in this case, errors require a change in behavior. That is, when unpredictable changes in behavior are expected (frequent), there is a qualitative change in behavior: We are more “careful” in the way that we respond to all stimuli, whether or not they require a change of action.
There is a large literature on the difference in performance between single-task (“pure”) blocks, and blocks in which the tasks are intermingled, or mixed (Eppinger, Kray, Mecklinger, & John, 2007; Goffaux, Phillips, Sinai, & Pushkar, 2006; Ruge, Stoet, & Naumann, 2006; Braver, Reynolds, & Donaldson, 2003; Meiran, Chorev, & Sapir, 2000; Wylie & Allport, 2000; Los, 1996; Poulton, 1973). Subjects' performance is worse in mixed blocks than in pure blocks: the so-called mixing cost.1 Similarly, if one analyzes the behavior from only mixed blocks, separating trials on which the task switched (relative to the task on the previous trial) from trials on which the task repeated (relative to the previous trial), one finds that performance is worse on switch trials than repeated-task (repeat) trials (e.g., Hsieh & Cheng, 2006; Kieffaber & Hetrick, 2005; Gehring, Bryck, Jonides, Albin, & Badre, 2003; Karayanidis, Coltheart, Michie, & Murphy, 2003; Logan & Bundesen, 2003; Swainson et al., 2003; Waszak, Hommel, & Allport, 2003; Barcelo, Periáñez, & Knight, 2002; Nieuwenhuis & Monsell, 2002; Rushworth, Passingham, & Nobre, 2002; DeJong, 2001; Mayr & Keele, 2000; Wylie & Allport, 2000; Meiran, 1996; Rogers & Monsell, 1995; Allport, Styles, & Hsieh, 1994; Spector & Biederman, 1976). Despite some recent work in this area (see below), the neural mechanisms that underlie the mixing cost and the switch cost are not well understood—neither in terms of their timing nor the specific generators engaged. One of the purposes of the current study was to investigate this issue using electrical neuroimaging analyses of scalp-recorded event-related potentials (ERPs).
A related issue is that of task preparation. Within a mixed block of trials, it has been found that switch costs decrease as subjects are given more time to prepare for a switch of task (e.g., Wylie, Javitt, & Foxe, 2004; Meiran, 1996; Rogers & Monsell, 1995). For instance, in some experiments, the task has randomly changed across trials, and a cue informing the subject of the task to perform on the forthcoming trial has been presented at varying times before the imperative stimulus. As the cue–target interval (CTI) has increased, thereby increasing the amount of time subjects have to prepare, the cost of switching has been found to decrease. This has led some to speculate that there is an “endogenous” mechanism that activates brain networks required for completing the new task in response to a cue stimulus (e.g., Meiran, 1996; Rogers & Monsell, 1995). Behaviorally, if subjects are provided sufficient time for this mechanism to operate prior to stimulus delivery, then reaction times (RTs) are relatively short because the brain network mediating the new task is active. When subjects are afforded insufficient time for the operation of this mechanism, RTs increase because the mechanism must continue to complete its operation(s) during stimulus processing. This line of reasoning produces the hypothesis that the neurophysiological responses to cue stimuli should reflect the operation of such a mechanism, particularly when sufficient temporal intervals are provided between the cue and the imperative stimulus for subjects to adequately prepare for the upcoming task. Indeed, several researchers have used this type of paradigm to investigate these issues using functional magnetic resonance imaging (e.g., Slagter et al., 2006; Brass & von Cramon, 2004). 
However, in the paradigm used by Brass and von Cramon (2004), the activity associated with the cue could not be distinguished from that associated with the imperative stimulus, making it difficult to ascertain the effects of preparation. The study by Slagter et al. (2006) was specifically designed to look at cueing effects and the effects of preparation. However, this study did not manipulate the CTI, and therefore, there is no way to know whether their subjects used the interval between the cue and the imperative stimulus to prepare for the forthcoming trial.
In the electrophysiology (ERP) literature, this paradigm has been used for some time. One consistent finding is that cues signaling a switch of task are associated with an increased positivity approximately 300 msec after the cues are presented (Nicholson, Karayanidis, Bumak, Poboka, & Michie, 2006; Nicholson, Karayanidis, Davies, & Michie, 2006; Nicholson, Karayanidis, Poboka, Heathcote, & Michie, 2005; Poulsen, Luu, Davey, & Tucker, 2005; Gehring et al., 2003; Hsieh & Yu, 2003; Karayanidis et al., 2003; Barcelo et al., 2002; Rushworth et al., 2002). Although there is variability in the topography of this positivity, with some reporting it centered over frontal sites (Poulsen et al., 2005; Barcelo et al., 2002; Rushworth et al., 2002), some over central sites (Nicholson et al., 2005; Gehring et al., 2003), and still others over parietal sites (Nicholson, Karayanidis, Bumak, et al., 2006; Nicholson, Karayanidis, Davies, et al., 2006; Karayanidis et al., 2003), it is consistently attributed to processes associated with task-set preparation.
Studies investigating the neural mechanisms underlying the mixing cost and the switch cost using ERPs have found a similar pattern of results (e.g., Eppinger et al., 2007; Goffaux et al., 2006; Ruge et al., 2006; Kray, Eppinger, & Mecklinger, 2005; West, 2004). Using similar cueing paradigms, these studies have shown that single-task trials differ from repeat trials at around 300 msec after the cue, and that this difference (which is also a positivity) is centered over parietal sites (e.g., Eppinger et al., 2007; Kray et al., 2005; West, 2004). Although some have interpreted this as showing that the cues in the single-task blocks received less processing than the cues in the mixed-task blocks (West, 2004), others, noting the similarity between these results and those from task-switching studies, have proposed that similar mechanisms might be used to prepare for repeat trials in a mixed block (relative to trials in a single-task block) that are used to prepare for switch trials (relative to repeat trials) (Ruge et al., 2006).
These results have been broadly interpreted within the framework of task-switching models from the cognitive psychological literature. Numerous models have been proposed, ranging from relatively simple hypotheses (Rogers & Monsell, 1995; Allport et al., 1994), to box-and-arrow models (Meiran, 2001; Rubinstein, Meyer, & Evans, 2001), to production systems (Meyer & Kieras, 1997), to mathematical models (Logan & Bundesen, 2003; DeJong, 2001), to computational models (Rougier, Noelle, Braver, Cohen, & O'Reilly, 2005; Yeung & Monsell, 2003; Gilbert & Shallice, 2002). Despite their several differences, these models fall into two broad categories: those that propose that the differences between switch and repeat trials are attributable to an interpolated, stage-like process (e.g., Rubinstein et al., 2001; Rogers & Monsell, 1995), and those that propose that these differences are due to interference from previously relevant task sets (e.g., Wylie, Javitt, & Foxe, 2003a, 2003b; Allport & Wylie, 2001). Although these theories are not mutually exclusive, in their strongest forms they make different predictions about the effects on single-task (pure) trials, switch trials, and repeat trials. For example, if a stage-like process is required on switch trials, one might expect to find differences such as those seen in the ERP response to switch versus repeat cues (i.e., the positivity at ∼300 msec). Indeed, this result has been interpreted as supporting the stage-like view (e.g., Nicholson, Karayanidis, Bumak, et al., 2006; Nicholson, Karayanidis, Davies, et al., 2006; Nicholson et al., 2005; Karayanidis et al., 2003; Rushworth et al., 2002). However, further predictions can be made. Specifically, stage-like models should predict that differences between switch and repeat trials should be qualitative—switch trials require an interpolated processing stage that is absent on repeat (and pure) trials. 
Furthermore, these models should predict either no differences between cues on pure and repeat trials (because neither requires this interpolated stage), or quantitative differences (because one might expect cues on repeat trials to require more processing than cues on pure trials; West, 2004). Models that attribute differences between switch and repeat trials to interference from previously relevant task sets make a different set of predictions. Because there should be minimal interference on pure-task trials, there should be a qualitative difference between the ERPs associated with processing the cues on pure trials and those on switch or repeat trials. However, the difference between cues on switch trials and repeat trials should be a quantitative one because (by hypothesis) the difference is largely due to more interference on switch trials than on repeat trials.
In order to assess these predictions, we used a paradigm that has been shown to result in switch costs when the CTI is short, but not when the CTI is long (Wylie et al., 2004). In this paradigm, three CTIs were used: 180 msec, 480 msec, and 780 msec. By comparing RTs when the CTI was short to RTs when the CTI was longer, we could assess whether subjects successfully used the interval to prepare for the forthcoming task. However, interpretation of ERPs to the shortest and longest CTIs would each be confounded. In the case of ERPs to the short CTI, sensory processing of the imperative stimulus would be superimposed on ongoing processes associated with the cue, obfuscating the ability to isolate preparatory mechanisms. In the case of ERPs to the longest CTI, subjects would have no uncertainty as to when the imperative stimulus would be presented and would therefore be able to anticipate it, which in turn would result in the superposition of anticipatory slow-wave potentials (e.g., contingent negative variation or CNV; e.g., Walter, 1964; Walter, Cooper, Aldridge, McCallum, & Winter, 1964). For these reasons, we restricted ERP analyses to the middle (i.e., 480 msec) CTI, although behavioral analyses included data from all three CTIs.
To foreshadow our results, we use electrical neuroimaging analyses of ERPs (Murray, Brunet, & Michel, 2008) to demonstrate that distinct neurophysiological mechanisms mediate mixing and switch costs (a qualitative difference). Mixing costs were first apparent in the ERP at ∼160 msec and as a change in the topography of the electric field at the scalp, with one topographic distribution observed more frequently on pure trials and another observed more frequently on both switch and repeat trial types. By contrast, a difference in the strength of the electric field at the scalp between switch and repeat trials appeared at ∼310 msec (a quantitative difference), with stronger ERPs in response to switch than either repeat or pure trial types.
Fourteen subjects (9 men, aged 19–36 years, mean ± SD = 26 ± 5.27 years) participated in this experiment. All subjects had normal, or corrected-to-normal visual acuity, and normal color vision. All subjects were right-handed, had neither history of nor current neurological or psychiatric disorders, and were paid for their participation. The behavioral data from four subjects were lost due to technical difficulties. The Institutional Review Board of the Nathan Kline Institute approved the procedures, and all subjects provided written informed consent to the experimental procedures, which were in accordance with the Declaration of Helsinki.
The stimuli consisted of a cue and an imperative stimulus (see Figure 1). The cue (100 msec duration) was a colored square (50 pixels²) that was either red (red = 100%, green = 0%, blue = 40%) or purple (red = 100%, green = 0%, blue = 100%). These colors were the same as those used in Wylie et al. (2003a, 2003b); the same colors were used in this study to aid in comparing the results of the two experiments. When the cue was colored red, subjects were instructed to perform one task on the forthcoming imperative stimulus (e.g., the letter task, see below); when the cue was purple, they were instructed to do the other task (e.g., the number task, see below). The cue-to-task mapping was counterbalanced across subjects. The cues changed pseudorandomly across trials.
The imperative stimulus was a gray (red, green, and blue all = 45%) letter–number pair (see Figure 1), presented in 30-point Times font. The letter was presented on one side of a fixation cross, and the number was presented on the other side. The side of each randomly changed across trials. This stimulus was presented 180 msec, 480 msec, or 780 msec after the cue was presented (thus, the interstimulus interval was 80, 380, or 680 msec), and was on the screen for 120 msec. The CTI changed randomly across trials. The letters were randomly drawn from a set containing four vowels [A E I U] and four consonants [G K M R], and the numbers were randomly drawn from the set containing four even numbers [2 4 6 8] and four odd numbers [3 5 7 9]. The only constraint, in both cases, was that the characters on trial n were different from those on trial n − 1. The interval between successive cues was always 3 sec.
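The stimulus constraints described above can be illustrated with a short sketch. This is not the presentation software used in the experiment; the function name `make_trials` and the dictionary keys are hypothetical, and the uniform random sampling is an assumption. The character sets, the three CTIs, and the 100-msec cue duration are taken from the text.

```python
import random

VOWELS = ["A", "E", "I", "U"]
CONSONANTS = ["G", "K", "M", "R"]
EVENS = [2, 4, 6, 8]
ODDS = [3, 5, 7, 9]
CTIS_MS = [180, 480, 780]   # cue-target intervals from the text
CUE_DURATION_MS = 100       # cue duration from the text


def make_trials(n_trials, seed=0):
    """Generate a hypothetical trial list that obeys the stated constraints."""
    rng = random.Random(seed)
    letters = VOWELS + CONSONANTS
    numbers = EVENS + ODDS
    trials = []
    prev_letter, prev_number = None, None
    for _ in range(n_trials):
        # Characters on trial n must differ from those on trial n - 1.
        letter = rng.choice([c for c in letters if c != prev_letter])
        number = rng.choice([d for d in numbers if d != prev_number])
        cti = rng.choice(CTIS_MS)
        trials.append({
            "letter": letter,
            "number": number,
            # Side of the letter; the number appears on the opposite side.
            "letter_side": rng.choice(["left", "right"]),
            "cti_ms": cti,
            # Interstimulus interval = CTI minus cue duration.
            "isi_ms": cti - CUE_DURATION_MS,
        })
        prev_letter, prev_number = letter, number
    return trials
```

Calling `make_trials(150)` would yield one block's worth of trials under these assumptions, with interstimulus intervals of 80, 380, or 680 msec.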
Tasks and Procedure
Subjects performed one of two tasks with each imperative stimulus. If they were cued to perform the letter task, they categorized the letter according to whether it was a vowel or a consonant; if they were cued to perform the number task, they categorized the number according to whether it was even or odd. As in previous work (Wylie et al., 2003a, 2003b), we used a go/no-go response regimen for each task (this allowed us to collect ERPs that were uncontaminated by response artifact). For the letter task, subjects were required to respond when the letter was a vowel, and to withhold a response if the letter was a consonant; for the number task, they were instructed to respond if the number was even and to withhold a response otherwise. The instructions were counterbalanced across subjects.
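The go/no-go rule can be stated compactly as a function. The sketch below encodes only the mapping described in the text (respond to vowels and to even numbers); because the instructions were counterbalanced across subjects, other subjects would have received the reverse mapping. The function name `should_respond` is hypothetical.

```python
VOWELS = set("AEIU")  # vowel set used in the experiment


def should_respond(task, letter, number):
    """Return True on a 'go' trial under the mapping described in the text."""
    if task == "letter":
        return letter in VOWELS      # go on vowels, withhold on consonants
    if task == "number":
        return number % 2 == 0       # go on even numbers, withhold on odd
    raise ValueError(f"unknown task: {task!r}")
```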
Subjects were seated ∼105 cm from a computer monitor (640 × 480 pixel resolution) in a dimly lit, sound-attenuated, electrically shielded room. All stimuli were presented at the center of the screen, superimposed on a fixation cross. Subjects were instructed to maintain central fixation throughout each block of 150 trials. All subjects began the experimental session by performing each of the two tasks alone for three blocks (see Figure 1). The order of the tasks was counterbalanced across subjects. The cueing during these blocks was exactly the same as during the later switching blocks (i.e., the cues randomly changed across trials). Subjects were instructed to use the cues as warning stimuli for the forthcoming imperative stimulus, and the eventual mapping of cue to task was withheld until the initial switching block. These blocks of single-task performance provide an ERP baseline (i.e., brain responses to these stimuli prior to any necessity to switch).
After the three single-task blocks, subjects were then instructed about the requirement to switch between the tasks and completed 13–21 blocks (mean = 17.64, mode = 18). They were required to take short breaks between the blocks, and were encouraged to take longer breaks and leave the testing room whenever they felt the need. This was done to prevent fatigue and concentration lapses.
EEG Acquisition and Preprocessing
Continuous electroencephalogram (EEG) was acquired with Neuroscan Synamps from 128 scalp electrodes referenced to the nose (band-filtered from 0.05 to 100 Hz; digitized at 500 Hz; impedances <5 kΩ). ERPs were computed in response to cues only by averaging peristimulus epochs of continuous EEG (−100 to 800 msec) from each trial type (i.e., pure, switch, and repeat). Trials containing blinks or eye movements were rejected off-line on the basis of horizontal and vertical DC electrooculogram. An automated artifact rejection criterion of ±80 μV was applied at all other sites. Data from those sites exhibiting nonphysiological artifacts (e.g., poor electrode–scalp contact) were labeled as “bad” and were interpolated for each subject and condition after the above artifact rejection procedure (Perrin, Pernier, Bertrand, Giard, & Echallier, 1987). The resulting ERPs were further down-sampled to a common 111-channel montage. Following this procedure and prior to group-averaging, each subject's data were 40 Hz low-pass filtered, baseline corrected using the 100-msec prestimulus period, and recalculated against the average reference.
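A simplified sketch of the averaging steps may help make the order of operations concrete. It covers only amplitude-based rejection, averaging, baseline correction, and average referencing; electrooculogram-based rejection, interpolation of bad channels, montage down-sampling, and the 40 Hz low-pass filter are omitted. All function names are hypothetical, each epoch is assumed to be a list of channels with each channel a list of samples in μV, and applying the ±80 μV criterion to the raw sample values is an assumption. The code uses only the Python standard library for portability.

```python
def reject_artifacts(epochs, threshold_uv=80.0):
    """Drop any epoch in which a sample exceeds +/- threshold_uv at any site."""
    return [ep for ep in epochs
            if all(abs(v) <= threshold_uv for channel in ep for v in channel)]


def average_epochs(epochs):
    """Average epochs sample by sample; each epoch is channels x samples."""
    n = len(epochs)
    n_channels, n_samples = len(epochs[0]), len(epochs[0][0])
    return [[sum(ep[c][t] for ep in epochs) / n for t in range(n_samples)]
            for c in range(n_channels)]


def baseline_correct(erp, n_baseline):
    """Subtract each channel's mean over the first n_baseline samples."""
    corrected = []
    for channel in erp:
        base = sum(channel[:n_baseline]) / n_baseline
        corrected.append([v - base for v in channel])
    return corrected


def average_reference(erp):
    """Subtract the mean across channels at every sample."""
    n_channels, n_samples = len(erp), len(erp[0])
    means = [sum(erp[c][t] for c in range(n_channels)) / n_channels
             for t in range(n_samples)]
    return [[erp[c][t] - means[t] for t in range(n_samples)]
            for c in range(n_channels)]


def compute_erp(epochs, n_baseline):
    """Reject, average, baseline-correct, then re-reference.

    Assumes at least one epoch survives rejection.
    """
    kept = reject_artifacts(epochs)
    erp = average_epochs(kept)
    return average_reference(baseline_correct(erp, n_baseline))
```

At the 500 Hz digitization rate reported above, the 100-msec prestimulus baseline corresponds to `n_baseline=50` samples.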
ERP data were analyzed with a multistep procedure that uses local as well as global measures of the electric field at the scalp, which has been referred to as electrical neuroimaging. This procedure and its benefits over standard waveform analyses have been described in detail elsewhere (see Murray, Brunet, & Michel, 2008, for a tutorial review; also, e.g., Murray, Imber, Javitt, & Foxe, 2006; Foxe, Murray, & Javitt, 2005; Murray et al., 2004, 2005; Michel et al., 2004). Briefly, it entails analyses of response topography and response strength to differentiate effects due to alterations in the configuration of underlying generators (viz. the topography of the electric field at the scalp) as well as latency shifts in brain processes across experimental conditions from modulation in the strength of responses of statistically indistinguishable brain generators. These analyses are briefly detailed below. All analyses were conducted using CarTool software (http://brainmapping.unige.ch/Cartool.htm). In addition, we utilized the local autoregressive average (LAURA; Grave de Peralta Menendez, Murray, Michel, Martuzzi, & Gonzalez Andino, 2004; Grave de Peralta Menendez, Gonzalez Andino, Lantz, Michel, & Landis, 2001) distributed linear inverse solution to visualize the likely underlying sources of our effects.
The group-averaged ERP topography as a function of time and condition was analyzed with a topographic pattern analysis that uses a modified hierarchical agglomerative clustering algorithm (these methods are implemented in CarTool software; see also Tibshirani, Walther, Botstein, & Brown, 2005; a tutorial film can also be found at http://brainmapping.unige.ch/docs/Murray-Supplementary.pps). Topographies from the 500-msec postcue period were compared over time within and across conditions because topographic changes indicate differences in the configuration of the brain's active generators (Srebro, 1996; Fender, 1987). Analysis of ERP topography is independent of the reference electrode (see, e.g., Michel et al., 2004) and is insensitive to pure amplitude modulations across conditions (topographies of normalized maps are compared). The optimal number of maps (i.e., the minimal number of maps that accounts for the greatest variance of the dataset) is determined using a modified Krzanowski–Lai criterion. The pattern of maps observed in the group-averaged data was statistically tested by comparing each of these maps with the moment-by-moment scalp topography of individual subjects' ERPs from each condition. Each time point was labeled according to the map with which it best correlated spatially, yielding a measure of map presence that was, in turn, submitted to an analysis of variance (ANOVA), with factors of trial type and map (hereafter referred to as “fitting”; cf. Murray et al., 2006; Brandeis, Lehmann, Michel, & Mingrone, 1995). In other words, the fitting procedure quantifies the degree to which a given map observed in the group-average ERP is expressed in the ERPs of individual subjects for each experimental condition (in the present study, the pure, switch, and repeat conditions). 
This fitting procedure revealed when and if a given trial type was more often described by one map versus another, and therefore, if different generator configurations better accounted for particular trial types. It is important to note that this analysis quantifies how individual subjects' ERPs correlate with template maps based on the group-average ERP, rather than an assessment of how the topography itself modulates. To statistically identify periods of topographic modulation, we calculated the global dissimilarity (Lehmann & Skrandies, 1980) between responses for each time point and applied a Monte Carlo bootstrapping analysis procedure (detailed in Murray et al., 2004). This analysis has colloquially been dubbed topographic ANOVA or “TANOVA” and provides a statistical means of determining if and when brain networks mediating ERP responses differ.
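The two topographic measures described above can be sketched as follows. This is an illustration under stated assumptions, not the CarTool implementation: maps are lists of potentials across electrodes, each map is re-centered to the average reference and scaled by its global field power before comparison, and no map is flat (nonzero global field power). The function names are hypothetical, and the Monte Carlo step of the TANOVA is not shown.

```python
import math


def gfp(scalp_map):
    """Spatial standard deviation of one map across electrodes."""
    n = len(scalp_map)
    mean = sum(scalp_map) / n
    return math.sqrt(sum((v - mean) ** 2 for v in scalp_map) / n)


def normalize(scalp_map):
    """Re-center to the average reference and scale to unit GFP."""
    mean = sum(scalp_map) / len(scalp_map)
    strength = gfp(scalp_map)
    return [(v - mean) / strength for v in scalp_map]


def global_dissimilarity(map_a, map_b):
    """Root-mean-square difference between two normalized maps (0 to 2)."""
    a, b = normalize(map_a), normalize(map_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def spatial_correlation(map_a, map_b):
    """Pearson correlation across electrodes between two maps."""
    a, b = normalize(map_a), normalize(map_b)
    return sum(x * y for x, y in zip(a, b)) / len(a)


def fit_templates(erp_maps, templates):
    """Label each time point with its best-correlated template map.

    Returns the per-time-point labels and, for each template, the number
    of time points it labels (a simple measure of map presence).
    """
    labels = [max(range(len(templates)),
                  key=lambda k: spatial_correlation(m, templates[k]))
              for m in erp_maps]
    counts = [labels.count(k) for k in range(len(templates))]
    return labels, counts
```

Because both maps are normalized, the dissimilarity is unaffected by pure amplitude scaling, and it relates to the spatial correlation C as sqrt(2(1 − C)). The per-subject, per-condition counts returned by `fit_templates` are the kind of values that would be entered into the ANOVA described above.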
The abovementioned topographic pattern analysis was also used to define time periods of stable ERP topography (i.e., components) over which global field power (GFP) measures were calculated and analyzed. GFP is equivalent to the spatial standard deviation of the scalp electric field (Lehmann & Skrandies, 1980). The observation of a GFP modulation does not exclude the possibility of a contemporaneous change in the electric field topography or topographic modulations that nonetheless yield statistically indistinguishable GFP values. However, observation of a GFP modulation without simultaneous topographic changes is most parsimoniously interpreted as amplitude modulation of statistically indistinguishable generators across experimental conditions. The analysis of a global waveform measure of the ERP was performed so as to minimize observer bias that can follow from analyses restricted to specific selected electrodes. GFP area measures were calculated (i.e., the integral as a function of time vs. the 0 μV baseline) and statistically tested by ANOVA.
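As a concrete illustration of the GFP area measure, the sketch below computes the GFP at each time point and integrates it over a time window against the 0 μV baseline. The rectangle-rule integration, the inclusive window bounds, and the function names are assumptions; the text does not specify the numerical integration used. The ERP is assumed to be a list of scalp maps, one per time point, sampled at a constant interval.

```python
import math


def gfp(scalp_map):
    """Spatial standard deviation of the potentials across electrodes."""
    n = len(scalp_map)
    mean = sum(scalp_map) / n
    return math.sqrt(sum((v - mean) ** 2 for v in scalp_map) / n)


def gfp_waveform(erp_maps):
    """GFP at each time point of an ERP given as a list of scalp maps."""
    return [gfp(m) for m in erp_maps]


def gfp_area(gfp_values, times_ms, start_ms, end_ms):
    """Integrate GFP over [start_ms, end_ms] using the rectangle rule.

    The result is in microvolt-milliseconds when the maps are in microvolts.
    """
    step_ms = times_ms[1] - times_ms[0]
    in_window = [g for g, t in zip(gfp_values, times_ms)
                 if start_ms <= t <= end_ms]
    return sum(in_window) * step_ms
```

For example, `gfp_area(gfp_waveform(erp_maps), times_ms, 310, 500)` would give the area for the last of the four windows analyzed here, yielding one value per subject and trial type.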
Finally, we estimated the sources in the brain underlying the ERPs from each condition, using the LAURA distributed linear inverse solution (Grave de Peralta Menendez et al., 2001, 2004; see Michel et al., 2004 for a comparison of inverse solution methods). LAURA selects the source configuration that better mimics the biophysical behavior of electric vector fields (i.e., activity at one point depends on the activity at neighboring points according to electromagnetic laws described in the Maxwell equations). The solution space was calculated on a realistic head model that included 4024 nodes, selected from a 6-mm³ grid equally distributed within the gray matter of the Montreal Neurological Institute's average brain. Group-averaged source estimations were calculated by first averaging the ERP from each subject and trial type over time periods identified from the abovementioned topographic pattern analysis. This yielded one source estimation result per subject per trial type. We emphasize that these estimations provide visualization, rather than a statistical analysis, of the likely underlying sources.
In order to assess whether subjects evidenced a “mixing cost” (slower performance during the switching blocks than during the single-task blocks), their RTs from the single-task blocks were compared to their RTs from the repeat trials (the first repeat trial after a switch) of the switching blocks (see Figure 2). The factors were mixing (pure vs. mixed block), task (letter vs. number), and CTI (180, 480, 780 msec). Both the main effects of mixing [F(1, 9) = 34.89, p < .0001] and CTI [F(2, 18) = 7.8, p = .004] were significant. The effect of CTI was due to longer responses when the CTI was 180 than when it was longer [t(9) = 2.64, p = .027 and t(9) = 3.48, p = .007 for the 480- and 780-msec CTIs, respectively]. There was no reliable difference between the 480- and 780-msec CTIs. The effect of mixing was due to longer RTs in the mixed blocks than the single-task blocks. Furthermore, these two effects significantly interacted [F(2, 18) = 10.92, p = .001], which was due to the fact that responses were longer in the mixed block than in the pure blocks at each of the CTIs, but this effect was larger for the 180-msec CTI than for the longer CTIs. There were no reliable effects in the error data; subjects were very accurate (∼99%).
Preparation Time (CTI) Effects
In order to ensure that the manipulation of CTI had an effect on subjects' preparedness, we analyzed their behavioral data (see Figure 2) with a 3 × 4 repeated measures ANOVA. The factors were CTI (180, 480, 780 msec) and trial (switch, Repeat 1, Repeat 2, Repeat 3). [The data were first analyzed with an ANOVA that included the factor task (letter vs. number), but this factor did not produce any significant effects or interactions.] The main effect of CTI was significant [F(2, 18) = 34.51, p < .0001]. As above, this was due to longer responses when the CTI was 180 than when it was longer [t(9) = 7.02, p < .0001 and t(9) = 5.87, p < .0001 for the 480- and 780-msec CTIs, respectively]. There was no reliable difference between the 480- and 780-msec CTIs. Furthermore, CTI interacted with trial [F(6, 54) = 5.6, p < .0001]. Planned comparisons showed that this interaction was due to a reliable switch cost when the CTI was 180 msec, but none at either of the longer CTIs. The same analysis was performed on the error data. The only significant effect was that of CTI [F(2, 18) = 7.20, p = .005]. This was due to subjects making more errors when the CTI was 180 msec than when it was 480 or 780 msec.
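The trial factor used in this analysis (switch, Repeat 1, Repeat 2, Repeat 3) depends on each trial's position relative to the most recent task switch. A minimal sketch of such a labeling is given below. The function name `label_trials` is hypothetical, and leaving trials that precede the first switch in a block unlabeled is an assumption, because the text does not state how block-initial trials were treated.

```python
def label_trials(tasks):
    """Label each trial as 'switch' or 'repeat k' relative to the last switch.

    Trials before the first switch are labeled None because their position
    relative to a switch is undefined.
    """
    labels = []
    since_switch = None  # number of repeats since the most recent switch
    for i, task in enumerate(tasks):
        if i > 0 and task != tasks[i - 1]:
            since_switch = 0
            labels.append("switch")
        elif since_switch is None:
            labels.append(None)
        else:
            since_switch += 1
            labels.append(f"repeat {since_switch}")
    return labels
```

Mean RTs grouped by these labels and by CTI would then form the cells of the 3 × 4 design described above.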
The topographic pattern analysis isolated four time periods of stable electric field configurations (60–156 msec, 158–218 msec, 220–308 msec, and 310–500 msec). That is, during these time periods, a given topography or multiple topographies predominated in the group-averaged ERP (see Methods for details). During some of these periods (158–218 and 310–500 msec), different template maps best described the group-average ERPs from a particular trial type, such that responses to pure trials were characterized by one topography, whereas responses during the same poststimulus time interval to switch and repeat trials were characterized by a different topography (see Figure 3A). This observation, based on the group-averaged ERPs, was then statistically assessed using the fitting procedure (detailed in Methods; see Figure 3B). Over the 158–218 msec period, there was a significant interaction between trial type and template map [F(2, 12) = 6.473, p = .012]. Neither main effect reached the .05 significance criterion. Follow-up contrasts revealed that this interaction was explained by one template map better accounting for responses to pure trials than either switch [t(13) = 3.568, p = .003] or repeat trials [t(13) = 2.668, p = .019]. By contrast, the fitting of template maps to the responses from switch and repeat trials did not significantly differ [t(13) = 0.363, p > .70]. Similarly, over the 310–500 msec period, there was again a significant interaction between trial type and template map [F(2, 12) = 9.564, p = .003], as well as a significant main effect of template map [F(1, 13) = 10.403, p = .007]. Follow-up contrasts revealed that this interaction was explained by one template map better accounting for responses to pure trials than either switch or repeat trials [t(13) = 4.511, p = .001 and t(13) = 3.921, p = .002, respectively]. 
By contrast, the fitting of template maps to the responses from switch and repeat trials did not significantly differ, although the t value approached conventional levels of significance [t(13) = 1.978, p > .070]. In solid agreement with the results of the topographic pattern analyses, the three TANOVAs indicated that responses to pure trials differ topographically from responses to either switch or repeat trials over the 185–220 msec period and also over the 300–500 msec period. No temporally sustained topographic differences (i.e., differences that persisted for at least 20 msec, using the temporal criterion described by Guthrie & Buchwald, 1991) were observed between switch and repeat trials over the −100 to 500 msec peristimulus interval.
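The temporal criterion applied to the TANOVA results can be sketched as a search for runs of consecutive significant time points. In the sketch below, the function name `sustained_periods` is hypothetical, the .05 threshold is assumed to match the significance criterion used elsewhere in the analyses, and a run's duration is counted as the number of samples multiplied by the sampling interval, which is also an assumption.

```python
def sustained_periods(p_values, times_ms, alpha=0.05, min_duration_ms=20):
    """Return (start_ms, end_ms) for runs with p < alpha lasting long enough.

    Assumes a constant sampling interval and at least two time points.
    """
    step_ms = times_ms[1] - times_ms[0]
    periods = []
    start = None  # index at which the current significant run began
    for i, p in enumerate(p_values):
        if p < alpha:
            if start is None:
                start = i
        elif start is not None:
            if (i - start) * step_ms >= min_duration_ms:
                periods.append((times_ms[start], times_ms[i - 1]))
            start = None
    # Close a run that extends to the final sample.
    if start is not None and (len(p_values) - start) * step_ms >= min_duration_ms:
        periods.append((times_ms[start], times_ms[-1]))
    return periods
```

At the 2-msec sampling interval implied by the 500 Hz digitization rate, the 20-msec criterion corresponds to at least 10 consecutive significant samples under this definition.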
At this point, the analyses indicate that mixing costs follow from a change in the configuration of the underlying active network in the brain and manifest as topographic changes in the electric field at the scalp over two distinct postcue time periods (158–218 and 310–500 msec). By contrast, switch costs do not engage such a mechanism. However, modulations in response topography can be dissociable from modulations in response strength. We therefore also analyzed the GFP waveforms in response to each trial type, which are displayed in Figure 3C (see also the Appendix for the raw waveforms from several midline sites; broadly, these waveforms show a pattern similar to the GFP waveforms). Visual inspection of these waveforms suggests that modulations are present across trial types from approximately 300 msec onward, with no visible effects beforehand. Using the above topographic pattern analysis as a basis for identifying ERP components, we calculated GFP area measures over the 60–156, 158–218, 220–308, and 310–500 msec windows, and submitted these values to ANOVA. Only over the 310–500 msec period was a main effect of trial type observed [F(2, 12) = 5.436, p = .021]. This followed from stronger responses to switch than either repeat or pure trials [t(13) = 2.490, p = .027 and t(13) = 3.414, p = .005, respectively] and stronger responses to repeat than pure trials [t(13) = 2.710, p = .018]. This pattern of results (namely, a GFP modulation in the absence of evidence for topographic changes) would thus suggest that switch costs first manifest in the present study as a change in the strength of responses within a statistically indistinguishable brain network approximately 300 msec following cue presentation. Recall, however, that there was no difference in RT between switch and repeat trials when the CTI was 480 msec. 
This led us to investigate whether there was any relationship between the observed GFP (where a difference is evident between switch and repeat trial types) and RT (where such a difference is not evident). That is, we correlated each subject's GFP area over the 310–500 msec period with their mean RT separately for each trial type. For both switch and repeat trial types, there was a significant negative correlation between these two measures [r(8) = −.821, p = .004 and r(8) = −.839, p = .002, respectively]. No such relationship was obtained for pure trials [r(8) = −.60, p > .05]. These negative correlations would thus suggest that the larger the GFP over this period, the faster a subject's RT was to the upcoming imperative stimulus. In other words, a larger GFP over this period is indicative of greater preparation for processing the upcoming imperative stimulus, irrespective of whether the task is switching or repeating. We would further add that “preparation” in this context is likely not in terms of motor/premotor activity; the present GFP modulation occurs prior to the imperative stimulus as well as the motor response to the imperative stimulus.
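The two quantities entering this correlation can be sketched as follows. The sketch assumes the conventional definition of GFP as the spatial standard deviation of the potential across electrodes at one time point, and it assumes that the area measure is the sum of GFP over the samples in the window multiplied by the sampling step. The function names and the toy data are hypothetical and are not taken from the analysis software used here.

```python
import math


def gfp(voltages):
    """Global field power at one time point: the spatial standard
    deviation of the voltages across electrodes."""
    mean_v = sum(voltages) / len(voltages)
    return math.sqrt(sum((v - mean_v) ** 2 for v in voltages) / len(voltages))


def gfp_area(erp, times_ms, window):
    """Sum GFP over samples whose time lies in [start, end] msec, scaled by
    the sampling step (an assumed definition of 'area')."""
    start, end = window
    step = times_ms[1] - times_ms[0]
    return sum(gfp(sample) * step
               for sample, t in zip(erp, times_ms)
               if start <= t <= end)


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Toy ERP: three time points (rows) by two electrodes (columns), in microvolts.
toy_erp = [[1.0, -1.0], [2.0, -2.0], [3.0, -3.0]]
toy_times = [300, 310, 320]
toy_area = gfp_area(toy_erp, toy_times, (310, 500))

# Hypothetical per-subject GFP areas and mean RTs (msec).
toy_r = pearson_r([1.0, 2.0, 3.0], [600.0, 550.0, 500.0])
```

With one GFP area and one mean RT per subject and trial type, `pearson_r` would be applied separately to the switch, repeat, and pure values. A negative coefficient corresponds to larger GFP accompanying faster responses.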
Based on the above topographic pattern analysis, source estimations were then calculated from the data from each trial type and subject over the 158–218 and 310–500 msec periods, separately, as these were the two time periods when significant topographic and GFP differences were obtained. Figure 4 shows the group-averaged source estimations for each of the three trial types over the 158–218 msec epoch. Source estimations for all three trial types were located within visual cortices of the occipital and posterior temporal lobes. Source estimations for pure trials exhibited a more bilateral distribution that was concentrated toward the occipital pole. By contrast, the source estimations for the switch and repeat trials were strongly lateralized to the right hemisphere and included activations extending into the posterior temporal lobe.
Group-averaged source estimations for all three trial types over the 310–500 msec epoch are shown in Figure 5. Here, the difference between pure trials and switch/repeat trials is more striking. Although the active sources in response to pure trials remained predominantly within posterior regions, those in response to switch/repeat trial types included parietal and frontal sources. This parieto-frontal involvement appears to be stronger on switch trials than either pure or repeat trials. Additionally, activations within posterior regions were graduated across trial types, such that they were strongest on switch trials, followed by repeat trials, and then pure trials.
We set out to investigate the response associated with effective preparation for a switch of task, using two baselines: repeated-task trials and “pure” trials. Our results replicate previous results and add an important qualification to them. For example, Brass, Ullsperger, Knoesche, von Cramon, and Phillips (2005) showed activity in a fronto-parietal network when switch trials were compared to repeat trials in an ERP investigation of task-switching (see also Poulsen et al., 2005; Rushworth et al., 2002). However, the mechanistic interpretation of their results is limited because their analyses did not permit the differentiation of contributions of signal strength (GFP) and generator configuration (topographical analysis). The activity in this network could be due to switch trials using mechanisms not used on repeat trials (i.e., a different topographic distribution of the ERP), or to a different amount of activity in the same generators (or a combination of different generators and changes in strength of activity). The results reported here suggest that the difference between switch and repeat trials is due primarily to differences in the strength of responses within a statistically indistinguishable brain network. That is, the same mechanisms appear to be employed on switch and repeat trials (no difference in the topographical analysis), but to a different extent (stronger GFP on switch than on repeat trials). This is an important result because it suggests that the activity on switch trials is not qualitatively different from that on repeat trials.
Other investigations of task switching have also found that similar networks are active on switch and repeat trials. For instance, Slagter et al. (2006) reported results comparable to those reported here, using functional magnetic resonance imaging. However, their paradigm did not allow them to demonstrate that subjects actually used the cues to prepare for the forthcoming task (only one CTI was used). Furthermore, although they varied the complexity of the switch, their paradigm did not include switches of task: Subjects performed an orientation judgment throughout. Rather, switches in this study were switches of the stimulus to use for the orientation task. This might explain why the effects they found were largely over posterior areas. Our data show that a statistically indistinguishable network is active in preparation for switch and repeat trials, even when subjects demonstrably prepared for the forthcoming task, and when two tasks involved different stimulus–response (S–R) transformations (letter vs. number categorizations).
One interpretation, which we (Wylie et al., 2003a, 2003b, 2004; Wylie & Allport, 2000) and others (e.g., Gehring et al., 2003; Hsieh & Yu, 2003; Barcelo et al., 2002) have favored in the past, is that switching is accomplished through a competitive process in which the possible S–R mappings compete with one another. According to this interpretation, a substantial portion of the “switch cost” is attributable to the time it takes to resolve this competition, and for the system to settle into a state in which one S–R mapping has “won” the competition (is most active). Although several experiments have found evidence for this sort of competition in the ERPs associated with the processing of the imperative stimulus (Ruge et al., 2006; Hsieh & Yu, 2003; Wylie et al., 2003a, 2003b), it has remained an open question whether the same sort of mechanism might underlie the preparatory processes associated with the cue. One well-replicated finding in the literature is that as subjects are allowed more time to prepare for a switch of task, their switch cost decreases, although typically not to zero. This has been interpreted as showing that there are, indeed, some processes that are under the subjects' control, and that can be initiated in preparation for a switch of task, if sufficient time is provided.
An alternative interpretation is that there is a competitive process, which is initiated at some time after the cue is presented (in this paradigm, ∼300 msec after the cue). If this interpretation is correct, we would expect this process to be present on repeat trials as well as switch trials, but to be absent (or much reduced) on pure-task trials. However, although this process should be present on both switch and repeat trials, a large body of evidence suggests that it should be larger on switch trials. Our data fully support this interpretation. We find no differences in generator configuration between switch and repeat trials, a finding that is all the more surprising given that subjects were able to effectively prepare for the task on switch trials (relative to repeat trials). Rather, we find a difference in the strength of activity between switch and repeat trials, with larger amplitude on switch than on repeat trials. Indeed, the only differences we found in generator configuration were between pure-task trials and mixed-task trials, consistent with the view that there is a qualitative difference between pure-task and mixed-task trials (e.g., competition/interference is far stronger in the mixed-task blocks, and/or there are differences in workload, arousal, effort, etc.). If it is the case that activity in this fronto-parietal network is a metric of interference/competition, we might expect more activity in this network to be associated with decreased interference and, therefore, faster responses. That is, we would expect to find negative correlations between the strength of activity in this network and RT. Furthermore, because we expect this network to be active on both switch and repeat trials, but not on single-task trials, we might expect to find this relationship only on switch and repeat trials and not on single-task trials. This is exactly the pattern of results we find, lending further support to this view.
However, because of the several differences between the pure and mixed blocks that also might be expected to lead to faster RTs (e.g., differences in arousal), this interpretation is not conclusive.
Another aspect of these data that replicates previous work (e.g., Eppinger et al., 2007; Braver et al., 2003; Meiran et al., 2000; Wylie & Allport, 2000) is that subjects' RTs were longer during the mixed blocks than during the “pure,” single-task blocks. This has been interpreted as showing that interference exists even on repeat trials of mixed blocks (Wylie & Allport, 2000; Allport et al., 1994). Here, we extend this work by detailing the electrophysiological correlates associated with this interference. The earliest difference between the ERPs associated with the cue in pure blocks and in mixed blocks occurred at ∼160 msec. Although a number of researchers have suggested a fronto-parietal network underlying control processes (e.g., Brass & von Cramon, 2004; Sylvester et al., 2003; Gurd et al., 2002; MacDonald, Cohen, Stenger, & Carter, 2000), we have shown that this network may be more sensitive to competition in the system than to control per se (Wylie et al., 2004). In one of the contrasts in that experiment, we compared two blocks in which subjects switched on every trial, thereby equating the amount of switching in the two blocks. The only difference between these two blocks was the amount of interference from previously learned S–R associations. When subjects had learned more than one S–R association for each stimulus type, there was increased activity in a fronto-parietal network, relative to when they had only learned one. In the present study, we find increased activity in a fronto-parietal network in the mixed blocks, relative to the pure blocks. Taken together, these results suggest that competition in the system might arise as early as ∼160 msec, in response to the cue, and that this competition is present even on repeat trials (in the current paradigm).
Indeed, at ∼160 msec, there was no difference in the ERP response on switch trials relative to repeat trials (see also Nicholson, Karayanidis, Bumak, et al., 2006; Nicholson, Karayanidis, Davies, et al., 2006; Nicholson et al., 2005; Rushworth et al., 2002).
One issue that has become prominent in the literature on task switching is the need to distinguish switching the task from switching the cue (Logan & Bundesen, 2003). In paradigms such as the one used here, every time the task switched, the cue informing subjects which task to perform also switched, thus confounding switching task with encoding a new cue. This would complicate the interpretation of the data, were it not for the “pure” task blocks. In these single-task blocks, the stimulation was exactly the same as in the switching blocks; that is, the cue switched just as frequently in the single-task blocks as in the switching blocks. However, in the single-task blocks, the fact that the cues would eventually be associated with different tasks was carefully withheld from subjects (they were merely told to use the cue as a warning stimulus for the forthcoming imperative stimulus). Thus, subjects were required to encode a new cue half as frequently during the single-task blocks (the cue switched on roughly half of the trials) as during the later switching trials (the cue switched on every switch trial). If the difference in GFP between switch and repeat trials was due merely to the necessity of encoding a new cue on the switch trials, one would expect the GFP of the “pure” trials to be less than switch trials (where subjects had to encode a new cue on every trial), but more than repeat trials (where subjects never had to encode a new cue). In fact, as Figure 3 shows, the GFP on “pure” trials was reliably less than that on either switch or repeat trials, strongly suggesting that the difference in GFP between switch and repeat trials is due to some other cause.
Using a paradigm in which subjects were able to effectively prepare for a forthcoming switch of task, we find evidence that supports a model of executive control in which control is effected through the competition of relevant S–R associations. These results have wide-reaching implications for theories of executive function, and also in the broader framework of psychological function and dysfunction.
Cartool software (http://brainmapping.unige.ch/Cartool.htm) has been programmed by Denis Brunet, from the Functional Brain Mapping Laboratory, Geneva, Switzerland, and is supported by the Center for Biomedical Imaging (www.cibm.ch) of Geneva and Lausanne. We thank Prof. Christoph Michel for providing additional analysis tools, as well as Drs. Sara Gonzalez Andino and Rolando Grave de Peralta Menendez, for providing the LAURA inverse solution. Finally, we would like to acknowledge the National Institutes of Health (MH067653 [G. W.] and MH65350 [J. F.]) for grant support.
Reprint requests should be sent to Glenn R. Wylie, Neuropsychology and Neuroscience Laboratory, Kessler Medical Rehabilitation Research and Education Corporation, 1199 Pleasant Valley Way, West Orange, NJ 07052, or via e-mail: firstname.lastname@example.org.