Severe capacity limits, closely associated with fluid intelligence, arise in learning and use of new task rules. We used fMRI to investigate these limits in a series of multirule tasks involving different stimuli, rules, and response keys. Data were analyzed both during presentation of instructions and during later task execution. Between tasks, we manipulated the number of rules specified in task instructions, and within tasks, we manipulated the number of rules operative in each trial block. Replicating previous results, rule failures were strongly predicted by fluid intelligence and increased with the number of operative rules. In fMRI data, analyses of the instruction period showed that the bilateral inferior frontal sulcus, intraparietal sulcus, and presupplementary motor area were phasically active with presentation of each new rule. In a broader range of frontal and parietal regions, baseline activity gradually increased as successive rules were instructed. During task performance, we observed contrasting fronto-parietal patterns of sustained (block-related) and transient (trial-related) activity. Block, but not trial, activity showed effects of task complexity. We suggest that, as a new task is learned, a fronto-parietal representation of relevant rules and facts is assembled for future control of behavior. Capacity limits in learning and executing new rules, and their association with fluid intelligence, may be mediated by this load-sensitive fronto-parietal network.
The term goal neglect was introduced by Duncan, Emslie, Williams, Johnson, and Freer (1996) to describe a form of performance failure where, although participants can say exactly what it is they should do, they show no apparent attempt to do it. Such behavior has been described in patients with major damage to the frontal lobes (e.g., Luria, 1966; Milner, 1963) but also in people from the normal population (Duncan et al., 1996). Goal neglect has been found to closely relate to scores on a standard test of fluid intelligence (Duncan et al., 1996), suggesting a link between frontal lobe function, fluid intelligence, and the effective use of task rules. Recently, Duncan et al. (2008) examined the role of task complexity in goal neglect. Intriguingly, neglect was not influenced by complexity at task execution, for example, by attentional demand or number of behavioral alternatives during a single trial or trial block. Instead, the key factor was the complexity of the whole set of rules specified in task instructions.
The findings are illustrated by one of the complex, multicomponent tasks used by Duncan et al. (2008, Experiment 4). On each trial, a pair of numbers appeared on a computer screen. On most trials, the numbers were surrounded by colored shapes. Participants were divided into two groups. One group was given instructions for two tasks, one for numbers without surrounds and the other for numbers with surrounds (full instructions condition). The second group was only instructed about the surround trials (reduced instructions condition). Apart from this, both groups performed an identical task, as the main experimental blocks included only surround trials, and participants were explicitly told that no-surround trials would not occur in these blocks. Despite similar performance demands, full instruction participants neglected significantly more rules for the surround trials than those in the reduced instruction group. Neglect was observed as a tendency to simplify the response rules, and, importantly, was not explained by simple forgetting of task rules. Instead, the results suggest that the more complex set of task rules given to the full instruction group affected their ability to use those rules in appropriate control of behavior.
To explain these results, Duncan et al. (2008) introduced the concept of a “task model”—an internal representation of relevant facts and rules that is used to shape correct behavior (see also Anderson, 1983; Fitts & Posner, 1967). As a new task is learned, a new task model must be assembled. As the complexity of the model increases, it becomes increasingly likely that one task component will be lost, leading to neglect of this task component. In this process, the task is simplified, losing important constraints on what should be done. Duncan et al. replicated their previous finding of a close link between goal neglect and fluid intelligence, and proposed that fluid intelligence reflects the ability to organize novel information into complex, effective task models.
Functional brain imaging studies have revealed that tests of fluid intelligence produce a consistent pattern of activity in a wide network of frontal and parietal regions (Crone et al., 2009; Duncan, 2005; Duncan et al., 2000; Esposito, Kirkby, Van Horn, Ellmore, & Berman, 1999; Prabhakaran, Smith, Desmond, Glover, & Gabrieli, 1997). In the frontal lobes, clusters of activation are observed in and around the inferior frontal sulcus (IFS), in the anterior insula extending into the frontal operculum (AI/FO), in lateral parts of rostral prefrontal cortex (RPFC), in the dorsal part of the anterior cingulate (ACC), and in the adjacent presupplementary motor area (pre-SMA). This frontal activity is commonly accompanied by activity in the parietal lobe along the intraparietal sulcus (IPS). Consistent with a role in general intelligence, a very similar activity pattern is observed in the brain's response to many different kinds of cognitive demands (Dosenbach et al., 2006, 2007; Duncan & Owen, 2000), which has led to the term “multiple-demand” (MD) network (Duncan, 2005, 2006). In prefrontal cortex of the behaving monkey, neurons have highly flexible response properties, adapting to code many different kinds of information relevant to current behavior (e.g., Sigala, Kusunoki, Nimmo-Smith, Gaffan, & Duncan, 2008; Everling, Tinsley, Gaffan, & Duncan, 2002; Duncan, 2001; Freedman, Riesenhuber, Poggio, & Miller, 2001; Asaad, Rainer, & Miller, 2000; Sakagami & Niki, 1994; for related findings in ACC/pre-SMA, see Procyk, Tanaka, & Joseph, 2000; Niki & Watanabe, 1979; and for parietal cortex, see Toth & Assad, 2002). Such results suggest broad involvement of the MD system in assembly and use of current task models (Duncan et al., 2008).
At present, the only direct evidence bearing on the neural basis of goal neglect comes from one experiment by Duncan et al. (1996). In this experiment, neglect was common in patients with frontal lobe lesions, and appeared not to be affected by lesions in posterior cortex. Interpretation of this result was complicated, however, by the fact that most patients in the frontal lobe group had closed head injuries, which are typically associated with diffuse brain damage in addition to any focal lesion (Richardson, 1990). The aim of the current neuroimaging study, therefore, was to investigate the neural basis of task model assembly and use in healthy adults. To achieve this, a number of tasks were designed which had a similar form to the final experiment reported by Duncan et al. (2008, Experiment 4).
One of the central questions for the current experiment concerned the encoding of new task rules. In particular, we aimed to investigate the learning of new task rules rather than their trial-by-trial implementation. To investigate this, a method was developed for presenting new task instructions to the participants while they lay in the scanner. Participants were shown separate instruction screens for each task rule, and a delay between screens made it possible to analyze BOLD signal changes while participants read and learned the individual task rules. Previous neuroimaging studies have investigated the brain regions recruited during the presentation of instructions, but in most cases, the instructions corresponded to previously learned task rules (e.g., Hanakawa, Dimyan, & Hallett, 2008; Sakai & Passingham, 2003, 2006). Such studies have shown an overlap between the neural networks recruited during task performance and those recruited during the presentation of task instructions for the next trial. Further evidence also suggests that pFC, and more particularly lateral RPFC, implements rules through specific interactions with posterior areas involved in task execution (Sakai & Passingham, 2003, 2006).
Of particular interest in the current study were any changes in the recruitment of brain regions that occurred as the task model developed. Duncan et al. (1996, Experiment 3) observed that neglect of a task rule was much more likely if it was described last in the instructions. This finding suggests that a new requirement is more likely to be neglected when several others have already been activated. The design of the current experiment enabled us to investigate brain activity during presentation of successive instructions, in particular, whether the elaboration of a task model by adding an increasing number of rules produced a progressive change in the recruitment of brain regions responsive to the presentation of instructions.
The second aspect of the experiment concerned task performance. Goal neglect, by definition, is observed during task performance, and another aim of this study was to investigate whether neural activity reflected the overall complexity of the task knowledge encoded after task performance requirements had been matched. For each task, the total set of possible rules was divided into two subsets, A and B. Similar to the experimental design used by Duncan et al. (2008, Experiment 4), participants who learned the full instructions of a task were given both A and B rules. During subsequent task performances, they were then told at the beginning of each block of trials whether all rules could apply (“Full AB” blocks), or only the B rules (“Full B” blocks). In reduced instructions tasks, only B rules were described, and consequently, all blocks were B rule only (“Red B” blocks). This design allowed two comparisons: firstly, the effect of task model complexity controlling for actual performance demands (Full B vs. Red B); secondly, the effect of the number of rules currently maintained and applied while controlling for task model complexity (Full AB vs. Full B). The latter resembles comparisons of pure and mixed blocks in the task switching literature, where effects on performance arise from having more possible task types to keep active through the block (mixing costs) and also from having to perform different tasks on successive trials (switch costs) (see Monsell, 2003, for a review).
The task performance blocks were designed in such a way as to be able to distinguish between transient and sustained BOLD signal changes, as there is evidence to suggest that different regions of prefrontal cortex show different patterns of transient and sustained activity during multirule tasks. In a task switching paradigm, Braver, Reynolds, and Donaldson (2003) observed a sustained increase in BOLD response in lateral RPFC in mixed task blocks compared to single task blocks, but no transient event-related response. In contrast, dorsolateral prefrontal cortex (DLPFC) showed an increased event-related response in trials of mixed task blocks compared to trials of single task blocks, but little difference in sustained activity between mixed and single task blocks. Sakai and Passingham (2003) also observed that lateral RPFC showed a sustained and prolonged response to instructions cueing participants about the upcoming task, whereas the response in DLPFC was locked to the start of the trial. A role for RPFC in sustained maintenance of task sets has also been proposed by Dosenbach, Fair, Cohen, Schlaggar, and Petersen (2008) and Dosenbach et al. (2006). Our experiment examined distinct effects of task complexity in transient and sustained brain activity.
Twenty-four right-handed volunteers (12 men) between 18 and 35 years old (24.0 ± 4.4 years), with normal or corrected-to-normal vision, took part in this study. Ethics approval was granted by the local research ethics committee, and informed written consent was obtained. A measure of general intelligence was obtained for 16 of the participants, using the Cattell Culture Fair Test, Scale 2 Form A (Institute for Personality and Ability Testing, 1973; Cattell, 1971). The mean Culture Fair IQ score of this group was 112.9 (range 97–145).
Tasks and Stimuli
Participants were required to learn and then perform a number of tasks while lying in the scanner. A total of eight tasks was used, each carried out once by each participant. The eight tasks were organized into four pairs (Sets I to IV; see Figure 1), with a different set used in each of four scanning runs (or “sessions”). Assignment of task sets to different sessions was counterbalanced across participants. Within each session, one of the tasks (counterbalanced across participants) was presented with full instructions and the other with reduced instructions. The order of full and reduced tasks within the session was also counterbalanced across participants, and across sessions within each participant.
All tasks were designed in a similar manner with five possible types of trials: A1, A2, B1, B2, and B3 (Figure 1). In each session, participants were given three possible keypress responses. A1 and A2 trials required participants to detect specific types of stimuli and perform the corresponding A keypress response. For the words task of Set I (Figure 1, top left), for example, the A keypress was required either for upper case words (Rule A1), or for rows of symbols (Rule A2). A1 and A2 trials were only presented to the participants in the full instructions condition. For all other trials—including all trials under reduced instructions—B rules applied. These B trials required a decision on the number of specific target stimuli in the display. For the words task, for example, targets were pseudowords, whereas in the letters task presented in Figure 2, targets were vowels. In B1 trials the target was absent, in B2 trials there was one target, and in B3 trials there were two targets. No response was required in Trials B1 and B3. In B2 trials participants were asked to make an additional decision, and to make one of two alternative (left vs. right) B-rule keypresses. This B2 decision always involved the central part of the display (Figure 1). In Set I, the B2 task was to press a key on the side with a greater number of dots; in Set II, on the side indicated by a greater number of arrows; in Set III, on the side with an oblique bar; and in Set IV, on the side with a larger shape. All eight tasks had this same structure (Figure 1), with subjects making one response for two possible types of A trials, otherwise counting targets and making no response for 0 or 2 targets, but for a single target, making the B choice response based on the central part of the display.
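The trial structure shared by all eight tasks can be summarized as a small decision function. The following is only a minimal sketch: the function name, the argument names, and the string labels used for responses are hypothetical and were not part of the experimental software.

```python
from typing import Optional


def required_response(a_rule_match: bool, n_targets: int,
                      central_side: str) -> Optional[str]:
    """Return the keypress required on one trial, or None for no response.

    a_rule_match: the stimulus satisfies rule A1 or A2 (A trials were only
                  presented under full instructions)
    n_targets:    number of B targets in the display (0, 1, or 2)
    central_side: 'left' or 'right', the outcome of the central B2 judgment
    """
    if central_side not in ("left", "right"):
        raise ValueError("central_side must be 'left' or 'right'")
    if n_targets not in (0, 1, 2):
        raise ValueError("n_targets must be 0, 1, or 2")
    # A1 and A2 trials: a single A keypress.
    if a_rule_match:
        return "A"
    # B1 (no target) and B3 (two targets): no response is made.
    if n_targets != 1:
        return None
    # B2 (exactly one target): left/right choice based on the central display.
    return "B-" + central_side
```

Written this way, the sketch makes explicit that a correct B response always depends on first counting targets, and that the central left/right judgment matters only on B2 trials.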
For each task, the actual keypresses made for A and B trials were counterbalanced across participants. For tasks run in the first of the four scanning sessions, the A response was indicated with the left middle finger, whereas B responses were indicated with the right index or right middle finger. For the second session, the A response was both index fingers together, whereas B responses were the left or right index. For the third session, the A response was the right middle finger, whereas B responses were the left index or left middle finger. For the fourth session, the A response was both middle fingers together, whereas B responses were the left or right middle finger. Because the order of task sets was counterbalanced across sessions, specific keypress responses were also counterbalanced across task sets. Participants' responses were recorded until 2.2 sec after the onset of the stimulus. When participants were required to make two simultaneous keypresses, reaction times were calculated on the basis of the first key pressed.
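The scoring rule for reaction times can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, key times are taken to be measured from stimulus onset, and the 2.2-sec limit is treated as inclusive, which the text does not specify.

```python
from typing import Optional, Sequence

RESPONSE_WINDOW_S = 2.2  # responses were recorded until 2.2 sec after onset


def reaction_time(press_times_s: Sequence[float]) -> Optional[float]:
    """RT in seconds from stimulus onset, or None if no keypress was recorded.

    press_times_s holds the onset-relative times of all keys pressed on the
    trial (two entries for a simultaneous two-key response).
    """
    valid = [t for t in press_times_s if 0.0 <= t <= RESPONSE_WINDOW_S]
    # For two-key responses, the RT is based on the first key pressed.
    return min(valid) if valid else None
```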
Four additional tasks were designed in order to train participants on the central stimuli and the response keys at the beginning of each session. These tasks included only A1, B1, and B2 rules to simplify the training. The four training tasks involved symbols, letters, numbers, and colored shapes. Data from these training tasks were not analyzed.
Before beginning the experiment, participants received initial training in a separate testing room. This training consisted of one complete session using different tasks and response keys from those used during the scanning sessions. Once training was finished, participants performed the four scanning sessions, with short breaks between each session.
The structure of each scanning session is shown in Figure 2B. Following performance on the training task, imaging data were collected during instructions and performance blocks of the two main tasks, one full and one reduced. Each new task started with a number of instruction screens telling the participants the rules of the task. The first instruction screen described (i) the stimuli for this task (e.g., words and dots in the middle); (ii) whether this was a full instructions task (A + B) or a reduced instructions task (B only); and (iii) the keys to use to respond in the A (for full tasks only) and B tasks. This first screen was presented for 30 sec and then replaced by a blank screen for 10 sec. Separate instruction screens for the A1, A2, B1, B2, and B3 rules were then presented (or B1, B2, and B3 only in reduced tasks). The screen for each rule was shown for 20 sec and followed by a blank screen for 10 sec. For each screen, an example stimulus was shown, along with the correct response (see Figure 2C for examples of A1, B1, and B2 instruction screens).
Instructions were followed by a practice block of eight trials, including two trials of each B rule and one trial of each A rule in the full tasks, and at least two trials of each B rule in the reduced tasks. The practice block was followed by four B blocks in the reduced instructions condition, and two B and two AB blocks in the full instructions condition (in alternating order; see Figure 2B). The order of B and AB blocks was counterbalanced across participants and across sessions within each participant. All testing blocks lasted 30 sec and were preceded by a 4.7-sec instruction screen that stated whether the block was going to be B or AB and 300 msec of blank screen. Each block contained six trials in total. In B blocks, there were two trials of each B trial type (B1, B2, B3). In AB blocks, there was at least one trial of each B trial type, and at least one A trial. Stimuli were presented for 2 sec. A white fixation cross was presented on a blank screen between trials. The interstimulus interval was fixed at 300 msec in the practice blocks, whereas in the testing blocks it was jittered between 300 msec and 15.3 sec. The distribution was biased toward shorter delays, however, with 75% of delays being shorter than 600 msec, and the rest distributed between 600 msec and 15.3 sec. This distribution ensured that it was possible to differentiate transient and sustained BOLD responses during task performance (Chawla, Rees, & Friston, 1999).
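A jitter distribution of this kind could be generated along the following lines. This is a sketch only: the text specifies the 75%/25% split and the range limits, whereas the uniform sampling within each range, the function names, and the seed are assumptions made for illustration.

```python
import random
from typing import List

ISI_MIN_S, ISI_SPLIT_S, ISI_MAX_S = 0.3, 0.6, 15.3
P_SHORT = 0.75  # proportion of delays shorter than 600 msec


def sample_isi(rng: random.Random) -> float:
    """Draw one interstimulus interval in seconds.

    Assumption: delays are uniform within each of the two ranges.
    """
    if rng.random() < P_SHORT:
        return rng.uniform(ISI_MIN_S, ISI_SPLIT_S)
    return rng.uniform(ISI_SPLIT_S, ISI_MAX_S)


def sample_isis(n: int, seed: int = 0) -> List[float]:
    """Draw n intervals from a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [sample_isi(rng) for _ in range(n)]
```

The sketch does not enforce the fixed 30-sec block length; in practice the delays within a block would also have to be constrained so that six 2-sec stimuli and their intervals fit the block.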
The testing blocks alternated with baseline blocks. Baseline blocks lasted for 20 sec and were also preceded by an instruction screen lasting 4.7 sec and a blank screen lasting 300 msec (Figure 2B). The baseline task was the same throughout the experiment and required participants to make a two-choice judgment by indicating which side of the screen contained the larger of two numbers. This task was self-paced to minimize the amount of time participants' minds could wander, and trials were separated by 300 msec of blank screen. Participants used the same two keypress responses to answer as those used for the B2 two-choice judgment in that particular session.
Imaging Acquisition and Data Analysis
A Siemens 3-T Tim Trio system was used to acquire both T1-weighted structural images and T2*-weighted echo-planar (EPI) images. EPI images contained thirty-two 3-mm slices with an interslice gap of 25%. In-plane resolution was 3 × 3 mm, and the TR for each volume was 2 sec. Functional scans were acquired during four sessions, each containing 413 volumes. The total time taken to collect functional and anatomical scans was approximately 1 hr 10 min for each participant.
EPI data were preprocessed using SPM5 (Wellcome Department of Cognitive Neurology; www.fil.ion.ucl.ac.uk/spm/). For each participant, all volumes were realigned into the same orientation, corrected for different slice acquisition times, normalized into the Montreal Neurological Institute (MNI) template space, and smoothed with an isotropic 10-mm full-width half-maximum Gaussian kernel. During normalization, each volume was also resampled to a spatial resolution of 3 × 3 × 3 mm. The experiment was programmed using Cogent2000 software (developed by the physics group of the Wellcome Laboratory of Neurobiology; www.vislab.ucl.ac.uk/cogent.php) and the display was projected onto a screen, visible from the scanner via a mirror mounted on the head coil.
Statistical analysis of functional data was also carried out using SPM5. Variance in the BOLD signal was decomposed in two separate analyses with different sets of regressors in a general linear model (Friston et al., 1995). Analysis 1 was used to investigate brain response during task instructions. To minimize the total number of regressors in the model, events during subsequent task performance were modeled with simplified predictors, and not further analyzed. Analysis 2 was used to investigate brain activity during task performance; here, simplified predictors were used for task instructions.
In Analysis 1 (task instructions), each instruction was modeled using an FIR basis set of 2-sec-long boxcars covering the whole period from onset of the instruction screen to the end of the following fixation period (20 time bins for the first instruction screen and 15 time bins for all subsequent instructions). This allowed the response to instructions to be modeled without making any assumptions about the shape of the BOLD response. Other regressors included in this model were: extended boxcar regressors modeling task blocks (Red B, Full B, Full AB, and baseline separately) and the instructions at each block onset; separate event regressors for B trials of Red B, Full B, and Full AB blocks, and for A trials of Full AB blocks, each restricted just to correct trials, and a regressor representing all incorrect trials; head movement parameters as covariates of no interest; and the mean over scans. Both block and event regressors were convolved with a canonical hemodynamic response function (hrf). To allow analysis across the full 190 sec (full instructions) or 130 sec (reduced instructions) instruction period, data were analyzed without high-pass filtering of the time series.
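The bin counts and period lengths given above follow directly from the screen durations, and the FIR regressors themselves are simple indicator columns. The sketch below illustrates both. It is not the SPM5 implementation: the function names are hypothetical, and instruction onsets are assumed to coincide with the start of a volume.

```python
from typing import List

TR_S = 2.0                   # one FIR boxcar per 2-sec volume
FIRST_SCREEN_S = 30 + 10     # first instruction screen plus blank screen
RULE_SCREEN_S = 20 + 10      # each rule screen plus blank screen


def n_bins(duration_s: float, bin_s: float = TR_S) -> int:
    """Number of FIR time bins needed to cover duration_s."""
    return int(round(duration_s / bin_s))


def fir_regressors(onset_scan: int, bins: int, n_scans: int) -> List[List[int]]:
    """Return `bins` indicator columns, each of length n_scans.

    Column k is 1 only at scan onset_scan + k, so each column estimates the
    signal in one 2-sec bin without assuming a response shape.
    """
    columns = []
    for k in range(bins):
        col = [0] * n_scans
        if 0 <= onset_scan + k < n_scans:
            col[onset_scan + k] = 1
        columns.append(col)
    return columns


def instruction_period_s(n_rules: int) -> int:
    """Total instruction period for a task with n_rules rule screens."""
    return FIRST_SCREEN_S + n_rules * RULE_SCREEN_S
```

With these definitions, `n_bins(FIRST_SCREEN_S)` and `n_bins(RULE_SCREEN_S)` give the 20 and 15 time bins, and `instruction_period_s(5)` and `instruction_period_s(3)` give the 190-sec and 130-sec periods for full and reduced instructions.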
In Analysis 2, we examined transient (trial-related) and sustained (block-related) activity during task performance. Trial-related activity was modeled with an eight-boxcar FIR basis set (Visscher et al., 2003) for each trial type (B trials of Red B, Full B and Full AB blocks, and A trials of Full AB blocks, each restricted just to correct trials). Block-related activity was modeled using extended boxcar regressors convolved with a canonical hrf for each block type separately (Red B, Full B, Full AB, and baseline). Other regressors included in this model were: an FIR basis set (8 boxcars) representing all incorrect trials; an FIR basis set (10 boxcars) modeling the instructions at the start of each block (“A + B blocks” or “B only block”) to capture changes in BOLD signal associated both with the start of a new block and with the end of a previous block (Dosenbach et al., 2006; Fox, Snyder, Barch, Gusnard, & Raichle, 2005; Visscher et al., 2003; Donaldson, Petersen, & Buckner, 2001); two extended boxcar regressors convolved with a canonical hrf modeling the instructions presentation and the delay periods between instructions at the start of each task; head movement parameters as covariates of no interest; and the mean over scans. Again, the analysis was performed without high-pass filtering of the time series.
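A sustained (block-related) regressor of this kind is an extended boxcar convolved with a hemodynamic response function. The sketch below shows the idea in plain Python. It is an approximation rather than the SPM5 code: the double-gamma parameters (shapes 6 and 16, undershoot ratio 1/6) are assumed to approximate the canonical hrf, the response is sampled once per 2-sec volume, and all function names are hypothetical.

```python
import math
from typing import List


def gamma_pdf(t: float, shape: float) -> float:
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def double_gamma_hrf(tr_s: float = 2.0, length_s: float = 32.0) -> List[float]:
    """Double-gamma response sampled every tr_s seconds, scaled to sum to 1.

    Assumption: gamma shapes 6 (response) and 16 (undershoot), ratio 1/6.
    """
    n = int(length_s / tr_s) + 1
    h = [gamma_pdf(i * tr_s, 6.0) - gamma_pdf(i * tr_s, 16.0) / 6.0
         for i in range(n)]
    total = sum(h)
    return [v / total for v in h]


def block_regressor(onset_s: float, duration_s: float, n_scans: int,
                    tr_s: float = 2.0) -> List[float]:
    """Boxcar covering [onset_s, onset_s + duration_s) convolved with the hrf."""
    boxcar = [1.0 if onset_s <= i * tr_s < onset_s + duration_s else 0.0
              for i in range(n_scans)]
    hrf = double_gamma_hrf(tr_s)
    out = []
    for i in range(n_scans):
        # Discrete convolution: sum over past boxcar samples weighted by hrf.
        acc = 0.0
        for j, hj in enumerate(hrf):
            if i - j >= 0:
                acc += boxcar[i - j] * hj
        out.append(acc)
    return out
```

For a 30-sec block, the resulting regressor rises over the first few volumes, plateaus, and returns to baseline after block offset, which is what allows it to be separated from the trial-locked FIR estimates.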
Parameter estimates for each regressor were calculated from the least squares fit of the model to the data, and estimates for individual participants were entered into a series of random effects group analyses. Whole-brain and ROI analyses were performed using this model. ROIs of the MD network were based on the results of a previous analysis of fronto-parietal activity associated with diverse cognitive demands (Duncan, 2006; for full details of ROI definition, see Cusack, Mitchell, & Duncan, 2010), and comprised 10 regions, given here with central coordinates in MNI space: bilateral IFS (±41 23 29), bilateral IPS (±37 −56 41), bilateral RPFC (±21 43 −10), bilateral AI/FO (±35 18 3), ACC (0 31 24), and pre-SMA (0 18 50). All ROIs were spheres of radius 10 mm, constructed using MarsBar for SPM (http://marsbar.sourceforge.net; Brett, Johnsrude, & Owen, 2002). Estimated data were averaged across voxels within each of the ROIs using the MarsBar toolbox and the mean values were exported for analysis using SPSS.
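The geometry of these spherical ROIs can be illustrated with a short sketch that lists the voxel centers falling within 10 mm of an ROI center. This is not how MarsBar builds ROIs internally. The voxel grid is assumed to lie on multiples of 3 mm in MNI space, and the dictionary keys and function name are hypothetical labels for the coordinates given above.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int, int]

# Central MNI coordinates of the 10 MD ROIs, as listed in the text.
MD_ROI_CENTERS: Dict[str, Coord] = {
    "IFS_L": (-41, 23, 29), "IFS_R": (41, 23, 29),
    "IPS_L": (-37, -56, 41), "IPS_R": (37, -56, 41),
    "RPFC_L": (-21, 43, -10), "RPFC_R": (21, 43, -10),
    "AIFO_L": (-35, 18, 3), "AIFO_R": (35, 18, 3),
    "ACC": (0, 31, 24), "preSMA": (0, 18, 50),
}


def sphere_voxels(center: Coord, radius_mm: float = 10.0,
                  voxel_mm: int = 3) -> List[Coord]:
    """Voxel centers on a voxel_mm grid lying within radius_mm of center.

    Assumption: voxel centers sit at integer multiples of voxel_mm.
    """
    cx, cy, cz = center
    # Grid indices nearest to the ROI center, and a search range that
    # safely covers the sphere.
    ix, iy, iz = (round(c / voxel_mm) for c in center)
    reach = int(radius_mm // voxel_mm) + 1
    voxels = []
    for i in range(ix - reach, ix + reach + 1):
        for j in range(iy - reach, iy + reach + 1):
            for k in range(iz - reach, iz + reach + 1):
                x, y, z = i * voxel_mm, j * voxel_mm, k * voxel_mm
                if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius_mm ** 2:
                    voxels.append((x, y, z))
    return voxels
```

Averaging a parameter estimate over the voxels returned for one center gives the single ROI value per participant that enters the group statistics.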
Whole-brain comparisons were performed using paired t tests on the relevant contrast values from each participant's first-level analysis. Unless otherwise specified, all results are reported at a threshold of p < .05, corrected for multiple comparisons using the false discovery rate (FDR). Peak activations are reported using MNI coordinates. ROI analyses were performed individually for each ROI, with one-sample t tests, paired t tests, or repeated measures ANOVA, using SPSS and a significance threshold of .05.
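FDR control is commonly implemented with the Benjamini-Hochberg step-up procedure, which the sketch below illustrates for a list of p values. It is assumed here to be a reasonable stand-in for the voxelwise correction; the function name is hypothetical and the SPM5 code was not consulted.

```python
from typing import List, Sequence


def fdr_bh(p_values: Sequence[float], q: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up procedure.

    Returns one flag per input p value: True if the test is declared
    significant at false discovery rate q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject
```

Because the procedure is step-up, a p value that misses its own rank threshold is still declared significant when a larger p value meets its threshold, as in `fdr_bh([0.04, 0.045, 0.046])`, where all three tests pass.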
For our main analyses, we focused on performance of B tasks under three conditions: Red B (Reduced instructions, B trials only), Full B (Full instructions, B blocks), and Full AB (Full instructions, AB blocks). Mean error percentages are shown in Figure 3. To examine the effect of instructions (task model complexity), data were analyzed using a 2 (instructions; Red B, Full B) × 3 (trial type; B1, B2, B3) repeated measures ANOVA. The main effect of instructions was significant [F(1, 23) = 4.1, p < .05, one-tailed], showing poorer accuracy in the full instructions condition (error rate ± SE: Full B, 14 ± 3%; Red B, 10 ± 2%). There was also a main effect of trial type [F(2, 46) = 7.4, p < .002], with more errors in B2 than in B1 trials (p < .001, with Bonferroni correction). The interaction was not significant [F(2, 46) = 0.1, p > .8].
To investigate the effect of block complexity within the full instructions condition, a similar analysis was performed using a 2 (block types; Full B, Full AB) × 3 (trial type; B1, B2, B3) repeated measures ANOVA. The main effect of block type was significant [F(1, 23) = 16.3, p < .001], with participants showing poorer accuracy in AB blocks (error rate 17 ± 3%) than in B blocks (14 ± 3%). As in the previous analysis, the main effect of trial type was also significant [F(2, 46) = 8.0, p < .001]. Two further analyses were used to examine the separate contributions of mixing costs (difference between pure and mixed blocks) and switch costs (difference between repeat and switch trials) to the overall block effect. B task trials (combining B1, B2, B3) in AB blocks were separated into switch (preceded by A trial) and stay (preceded by B trial) trial types. To isolate mixing cost, the first analysis compared error rate in Full B trials (14 ± 3%) and Full AB stay trials (16 ± 3%). To isolate switch cost, the second analysis compared Full AB stay (16 ± 3%) and switch trials (21 ± 4%). Although the mixing cost was significant [t(23) = 2.0, p < .05], the switch cost was marginal [t(23) = 1.6, p < .1; both tests one-tailed].
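The separation of B trials into switch and stay types, and the two cost measures, can be sketched as follows. The function names and the `'A'`/`'B'` coding are hypothetical, and the treatment of the first trial in a block (left unclassified because it has no preceding trial) is an assumption that the text does not address.

```python
from typing import Dict, List, Optional, Sequence


def classify_b_trials(trial_rules: Sequence[str]) -> List[Optional[str]]:
    """Label each B trial of one AB block as 'switch' or 'stay'.

    trial_rules lists 'A' or 'B' for each trial in presentation order.
    A trials, and a B trial in first position, are labeled None.
    """
    labels: List[Optional[str]] = []
    for i, rule in enumerate(trial_rules):
        if rule != "B" or i == 0:
            labels.append(None)
        elif trial_rules[i - 1] == "A":
            labels.append("switch")   # B trial preceded by an A trial
        else:
            labels.append("stay")     # B trial preceded by a B trial
    return labels


def costs(pure_b: float, ab_stay: float, ab_switch: float) -> Dict[str, float]:
    """Mixing cost (mixed stay minus pure) and switch cost (switch minus stay)."""
    return {"mixing": ab_stay - pure_b, "switch": ab_switch - ab_stay}
```

Applied to the group mean error rates reported above, `costs(14, 16, 21)` gives a mixing cost of 2 percentage points and a switch cost of 5 percentage points.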
In the 16 participants with Culture Fair IQ scores, correlation analyses were also performed examining the relationship between IQ and the average error rates for B trials. Pearson correlations were negative and significant for all three conditions: Red B (r = −.69, p < .005), Full B (r = −.51, p < .05), and Full AB (r = −.64, p < .01). Figure 4 presents a plot of the relationship between IQ and average error rates across all three conditions (r = −.61, p < .05). One participant's error rate was further than 2.5 SD away from the mean. When this participant was excluded, the correlation remained significant (r = −.73, p < .005). Thus, B task errors strongly declined with increasing fluid intelligence.
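The correlation and the outlier check can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the use of the sample SD (n − 1 denominator) for the 2.5 SD criterion is an assumption.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def within_sd(values: Sequence[float], n_sd: float = 2.5) -> List[bool]:
    """True for each value lying within n_sd sample SDs of the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [abs(v - mean) <= n_sd * sd for v in values]
```

Recomputing `pearson_r` on only those participants flagged `True` by `within_sd` corresponds to the analysis with the outlying participant excluded.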
In the baseline task, error rate was, on average, 5% (±2). In A trials, error rate was, on average, 9% (±4).
No response was required in B1 and B3 trials. RTs to correctly answered B2 trials were longer in Full AB blocks than in the other conditions (see Figure 3). RTs were analyzed using two paired t tests, one testing for an effect of instruction, the other for an effect of block type. There was no effect of instruction, as participants' speed did not differ between Red B and Full B blocks [Red B: 1404 ± 31 msec; Full B: 1419 ± 26 msec; t(23) = 0.4, p > .6]. However, there was a significant effect of block type, with longer RTs in Full AB (1554 ± 36 msec) than Full B blocks [t(23) = 5.6, p < .001]. This effect was investigated further by testing whether the mixing cost (Full B vs. Full AB stay trials) and switching cost (Full AB stay vs. Full AB switch trials) were significant. Paired t tests indicated that there was a significant effect of mixing tasks [Full B: 1419 ± 26 msec; Full AB stay: 1549 ± 39 msec; t(23) = 4.6, p < .001]; however, there was no significant cost of switching [Full AB switch: 1522 ± 48 msec; t(22) = 0.7, p > .5]. One participant did not respond correctly to any B2 switch trials and was not included in the switch cost analysis.
In the baseline task, mean correct RT was 576 msec (±15). In A trials, mean correct RT was 1015 msec (±29).
Instructions (Analysis 1)
The primary aim of the analysis of the instructions period was to examine changes in brain responses associated with the number of task rules being encoded. A first analysis was performed to identify the regions involved in reading and encoding the instructions. This analysis examined the phasic response when each new rule was presented. For this purpose, we focused on the three screens describing B rules for each task. Each rule presentation was covered by 15 FIR boxcars (10 boxcars for the 20 sec of instruction presentation, 5 boxcars for the 10 sec of delay following each instruction screen). For each rule, we calculated the mean activity across the first 10 time points (representing the time that instructions were on the screen), and the mean activity across the final five time points (representing the delay period between rules). These values were then averaged across the three B rules of each task, and across the eight tasks to give two values for each participant. At the second level, these values were entered into a contrast testing for brain regions more active during instruction presentation than during the following delay period.
In a whole-brain analysis, using an FDR threshold of p < .05, a large network of brain regions was found to show increased BOLD signal during the instructions compared to the following delay periods (see Figure 5). Consistent with the visual and linguistic demands of each instruction screen, strong activity was seen in occipital cortex, extending dorsally into the parietal lobe, and especially in the left hemisphere, into the superior and middle temporal gyrus. Strong activity was also seen in posterior lateral and superior medial frontal cortex. A more conservative threshold of p < 10⁻⁶ (FDR) was used to differentiate the clusters observed in this contrast. Peaks of activations were observed bilaterally in occipital cortex (BA 18: 27 −90 −6; −27 −90 −6), extending into superior parietal cortex (BA 7: 27 −60 48; −27 −60 51), and toward the fusiform gyrus (BA 19: 39 −66 −12; BA 37: −39 −48 −18); in left middle temporal gyrus (BA 21: −54 −39 −3); in lateral frontal cortex (BA 9: −45 3 36; 45 9 27); and in medial frontal cortex (BA 6: −3 6 63) (all Z > 7).
For each ROI of the MD network, parameter estimates were also extracted for each boxcar regressor in order to create a time course showing the evolution of the BOLD signal following presentation of each of the B rule instructions. Phasic response to presentation of each instruction was strong in the bilateral IFS and IPS, and in the pre-SMA (Figure 5). Other ROIs (bilateral RPFC and AI/FO, ACC) showed much less striking effects. To test the significance of these changes, a similar comparison to that used in the whole-brain analysis was performed, contrasting the instruction phases (average of the first 10 time bins) to the following delay phases (average of the last five time bins). Paired t tests indicated that the bilateral IFS, bilateral IPS, and pre-SMA showed increased BOLD signal during the instruction phase compared to the following delay phase [all t(23) > 5.0, p < .001]. Right but not left RPFC [respectively t(23) = 3.0, p < .01; t(23) = 1.2, p > .2], as well as left but not right AI/FO [respectively t(23) = 2.8, p < .05; t(23) = 1.2, p > .2], showed a similar but weaker pattern. As suggested by the plot in Figure 5, there was no difference between instructions and delay phases in ACC [t(23) = 1.2, p > .2].
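For readers less familiar with the ROI statistics, the paired comparison of instruction and delay phases can be sketched as follows. The function name and inputs are hypothetical; the sketch only assumes one instruction value and one delay value per participant for a given ROI, so that 24 participants would give the 23 degrees of freedom reported.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(instruction, delay):
    """Paired t statistic and degrees of freedom for one ROI.

    instruction, delay: per-participant mean estimates, in the same
    participant order (hypothetical inputs).
    """
    if len(instruction) != len(delay):
        raise ValueError("each participant needs both values")
    diffs = [i - d for i, d in zip(instruction, delay)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1
```

A positive t indicates greater BOLD signal during instruction presentation than during the following delay.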
A further analysis compared instruction screens for B rules to the initial 30 sec screen introducing the materials of each new task (see Methods). In line with the more substantial visual and reading demands of the first screen, with more pictures and text than later screens, a clear difference in activity was seen in ROIs centered on early visual cortex (±10 89 1; derived from the Jerne Volumes of Interest website http://neuro.imm.dtu.dk/services/jerne/ninf/voi.html). MD ROIs, in contrast, showed no such difference (see Supplementary Figure 1). The results suggest that MD activity during task instructions is not tightly linked to simple visual analysis of word and picture input.
The second, and more critical, set of analyses assessed specifically how the brain responses related to the instructions changed as a function of the number of rules being encoded (i.e., task model complexity). We examined changes in BOLD signal across successive instruction periods firstly for the five successive rules of full instructions, and secondly, for the three B rules during both reduced and full instructions. Across successive rules, the data suggested little consistent change in peak activity associated with each instruction screen (Figure 5). Delay activity following instructions, in contrast, appeared to increase across successive instruction periods, and to do so differently in full and reduced conditions. To examine these results, activity for the delay period following each instruction was calculated by averaging across the five corresponding time bins. Plots of mean delay activity across successive instruction periods are shown for each ROI in Figure 6. Both for full and reduced conditions, the results suggest a pattern of increase to an asymptote across successive instructions. For reduced instructions, this increase was evident across rules B1 to B3, that is, the only rules given in these tasks. For full instructions, in contrast, activity was already maximal following instruction for rules A1, A2, and B1, thereafter remaining constant or even somewhat decreasing.
In a first analysis, we examined the general pattern of increasing delay activity across successive instructions. For maximum power, we focused just on the five successive instructions of full conditions. For each MD ROI, we used linear contrasts to compare the delay activity associated with successive instructions. There were linear increases in BOLD signal during the delay period in the majority of MD regions: these were significant in the bilateral AI/FO [left: F(1, 23) = 8.5, p < .01; right: F(1, 23) = 17.0, p < .001], right RPFC [F(1, 23) = 10.9, p < .005], right IFS [F(1, 23) = 5.3, p < .05], ACC [F(1, 23) = 5.6, p < .05], and pre-SMA [F(1, 23) = 7.1, p < .05], and were marginal in the left IPS [F(1, 23) = 4.1, p < .06], left IFS [F(1, 23) = 3.7, p < .07], and left RPFC [F(1, 23) = 3.0, p < .1] (Figure 6).
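One common way to compute such a linear contrast, shown here as a sketch rather than as the procedure actually used, is to weight each participant's five delay values with centered linear weights and test the resulting scores against zero; the squared t statistic is then an F with (1, n − 1) degrees of freedom. The weights and function name below are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, stdev

# Centered linear weights for the five successive instructions
# (A1, A2, B1, B2, B3); an assumed, conventional choice.
LINEAR_WEIGHTS = (-2, -1, 0, 1, 2)


def linear_trend_F(delay_activity):
    """F statistic for a linear trend in one ROI.

    delay_activity: one row per participant, five delay means per row
    (hypothetical layout). Returns (F, (df1, df2)).
    """
    scores = [
        sum(w * v for w, v in zip(LINEAR_WEIGHTS, row))
        for row in delay_activity
    ]
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t, (1, n - 1)
```

Under this approach, 24 participants yield the F(1, 23) form reported above; the sign of the mean contrast score, not F itself, shows whether the trend is an increase.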
Second, we compared patterns of delay activity for full and reduced instructions, focusing just on delays following the three B instructions. For each ROI, data were examined with 2 (full, red) × 3 (B1, B2, B3) repeated measures ANOVAs on the average of the five delay time bins. The linear component of the interaction was significant bilaterally in the IPS, in the left IFS, as well as in the pre-SMA [all F(1, 23) > 4.3, p < .05]. A similar trend was observed in ACC [F(1, 23) = 3.6, p < .07] (Figure 6).
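The linear component of a 2 × 3 interaction of this kind can be expressed as a single difference score per participant: the linear trend over B1 to B3 in the reduced condition minus the same trend in the full condition. The sketch below assumes that formulation; the names and data layout are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

LINEAR_WEIGHTS = (-1, 0, 1)  # B1, B2, B3 (assumed centered weights)


def linear_slope(row):
    """Linear contrast score over three successive delay means."""
    return sum(w * v for w, v in zip(LINEAR_WEIGHTS, row))


def interaction_linear_F(full, reduced):
    """F for the linear component of the condition x rule interaction.

    full, reduced: one row per participant (same order), each holding
    the delay means following B1, B2, and B3 (hypothetical layout).
    """
    scores = [
        linear_slope(r_row) - linear_slope(f_row)
        for f_row, r_row in zip(full, reduced)
    ]
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t, (1, n - 1)
```

A positive mean score corresponds to a steeper increase over B rules for reduced than for full instructions.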
In a final analysis, we asked whether activity in delay periods was associated with later differences in performance. For each participant, we identified sessions with best and worst full task performance. (No similar analysis was possible for reduced tasks as performance was typically good.) The comparison was restricted to participants who had a difference in accuracy between best and worst tasks of at least 15% (n = 15). Tasks with identical overall accuracy were averaged before performing the comparison. Overall, the increase in delay activity from A1 to B3 rules was steeper in the worst than in the best sessions. In 2 (worst, best session) × 5 (A1 to B3) repeated measures ANOVAs, worst-session increases (linear component of the interaction) were significantly steeper in ACC, right AI/FO, and pre-SMA (p < .05), and marginally so in the left IFS (p < .06) and left AI/FO (p < .1) (see Figure 7).
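The selection step can be made concrete with a short sketch. The function name, the dictionary layout, and the use of integer percentages are assumptions made for illustration; only the 15% criterion and the handling of ties come from the text.

```python
def best_worst_sessions(accuracy_pct, min_gap_pct=15):
    """Pick best and worst full-task sessions for one participant.

    accuracy_pct: dict mapping a session label to accuracy in whole
    percent (hypothetical layout). Returns (best_ids, worst_ids), or
    None when the best-worst gap is below the inclusion criterion.
    Several labels are returned when sessions tie, so that their
    delay activity can be averaged before the comparison.
    """
    best = max(accuracy_pct.values())
    worst = min(accuracy_pct.values())
    if best - worst < min_gap_pct:
        return None  # participant excluded from this analysis
    best_ids = sorted(s for s, a in accuracy_pct.items() if a == best)
    worst_ids = sorted(s for s, a in accuracy_pct.items() if a == worst)
    return best_ids, worst_ids
```

For example, accuracies of 95, 70, 95, and 88% across four sessions would select the two tied 95% sessions as best and the 70% session as worst.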
Taken together, the results suggest a clear link between task model complexity and postinstruction delay activity in the MD network. Activity increased across successive rules, more strongly for early rules; and when this increase was especially strong, later task performance tended to be poor.
Task Performance (Analysis 2)
Two sets of regressors were used during task performance to separate block-related (sustained) and trial-related (transient) BOLD signal changes. Separate whole-brain and ROI analyses were performed.
Sustained BOLD signal changes were first compared to the baseline task (number magnitude judgment). Combining data from the three different block types (Red B, Full B, Full AB), activations were found in bilateral frontal and medial occipital cortex regions and, to a lesser extent, in left parietal cortex (see Figure 8 for a render of the activations in task blocks vs. baseline).
ROI analyses comparing red and full blocks to the baseline task revealed significant sustained response in the left IFS only [t(23) = 2.5, p < .05]. Further analyses compared sustained activity across block types. To test for an effect of instructions (task model complexity), the contrast Full B versus Full baseline was compared to the contrast Red B versus Red baseline. The whole-brain analysis revealed no significant clusters at the threshold of p < .05 (FDR). In ROI analysis, a trend for greater sustained activity with full instructions was observed in the left IPS (Figure 8) [t(23) = 1.8, p < .05, one-tailed t test]. ACC ROI showed a trend toward BOLD signal change in the reverse direction [t(23) = 2.2, p = .07]. To test for a sustained effect of mixing tasks (block complexity), Full AB blocks were compared to Full B. Again, the whole-brain analysis revealed no significant effects. ROI analysis showed a trend toward stronger activity in Full AB versus Full B blocks in the pre-SMA only [t(23) = 1.8, p < .05, one-tailed] (see Figure 8).
A second set of analyses examined trial-related BOLD signal changes. Regressors for B trials from Red B, Full B, and Full AB blocks averaged across time bins were all associated with large bilateral frontal, parietal, and occipital transient BOLD signal changes (see Figure 9 for a brain render of the average of these activations). ROI analyses showed that all MD regions, except for ACC and left RPFC, showed transient response to the experimental trials [all t(23) > 5.8, p < .001; right RPFC: t(23) = 3.1, p < .005]. In those ROIs showing a transient trial-related response, BOLD signal changes resembled the canonical hemodynamic response function (Figure 9). Here, however, there were no effects of task model or block complexity. In both whole-brain and ROI analyses, neither Full B versus Red B nor Full AB versus Full B contrasts revealed significant differences either in terms of main effect over time bins or Condition × Time bins interaction (see Figure 9).
A comparison of Figures 8 and 9 reveals contrasting patterns of block- and trial-related activity across brain regions. Although trial-related activity was seen bilaterally in the IFS, IPS, and AI/FO, in the pre-SMA and, to a lesser extent, in right RPFC, block-related activity appeared mainly outside the MD areas. In lateral frontal cortex, in particular, block-related activity was anterior, in a region dorsal to our RPFC ROI.
Studies investigating complex response selection tasks have proposed a role for premotor cortex in the hierarchical organization of cognitive control (Badre & D'Esposito, 2007; Koechlin, Ody, & Kouneiher, 2003). To examine premotor activity, we derived an ROI [± 41 9 33] from average activation peaks in four previous studies (Kouneiher, Charron, & Koechlin, 2009; Badre & D'Esposito, 2007; Dosenbach et al., 2006; Koechlin et al., 2003). The analyses described above were repeated on this new premotor ROI.
The RPFC ROI used in this study is more ventral than the peaks of activation observed by Dosenbach et al. (2006), Braver et al. (2003), and Sakai and Passingham (2003). The RPFC coordinates of these three studies were therefore transformed into MNI coordinates and averaged to produce a second RPFC ROI centered on [± 31 47 17]. Analyses were also repeated on this further RPFC ROI.
The results (Supplementary Figure 2) confirm the conclusions suggested by the whole-brain analyses (Figures 5, 8, 9). Phasic response during task instructions (Supplementary Figure 2A) was strong in premotor cortex, but weak in RPFC (cf. Figure 5). In both regions (Supplementary Figure 2B), postinstruction delay activity increased across successive instruction screens (cf. Figure 6). During task performance (Supplementary Figure 2C), there was sustained (block-related) activity in RPFC but not in premotor cortex (cf. Figure 8). Block-related activity in left RPFC increased with task model complexity (Red B vs. Full B), but decreased with block complexity (Full B vs. Full AB). In contrast, trial-related activity was strong in premotor cortex but not in RPFC (Supplementary Figure 2D; cf. Figure 9). Again, trial-related activity was not sensitive to model or block complexity.
Effect of Complexity on Performance
The novel within-participant design of this experiment successfully replicated the effect of task model complexity described by Duncan et al. (2008, Experiment 4). Comparing performance on B trials, participants were less accurate when they had been asked to encode a more complex task model (Full B vs. Red B). This effect was specific to accuracy and similar across the three B rules. The results show that, in multirule tasks of this sort, performance depends not only on the rules active in a current trial block but also on the complexity of the whole rule set described in initial instructions.
In the previous, between-participant designs of Duncan et al. (1996, 2008), the effect of task model complexity was largely manifest in neglect of one or more task rules. In such cases, the rule exerted little or no apparent control over behavior. In the present, within-participants study, neglect of this sort was rare. Across the whole experiment, there were only 11 cases of accuracy <25% on any single B rule; when these cases were eliminated, the difference between Red B and Full B blocks remained significant. As described in the Methods section, all the tasks used in this study had a similar structure, with a specific response for A1 and A2 trials, no response for B1 and B3 trials (0 or 2 targets, respectively), and a two-choice response for B2 trials (1 target). With this consistent structure over the experiment, there may have been substantial transfer from one task to another. Very plausibly, participants may have constructed a general cognitive framework within which each new task was learned. A general framework of this sort may have reduced effects of model complexity, in particular, minimizing frank rule neglect.
Beyond the effects of task model complexity, our study also examined the effect of mixing A and B rules within one task block. Although one previous study of goal neglect found no effect of block complexity (Duncan et al., 2008, Experiment 2), such effects are common in the task switching literature (e.g., Braver et al., 2003; see Monsell, 2003, for a review). In line with these results, participants in our study made more errors and were slower in mixed blocks (Full AB) than in single task blocks (Full B). The accuracy difference appeared to reflect a combination of mixing and switch costs, whereas the RT difference reflected a mixing cost only. It is worth noting, however, that our task blocks included few switch trials, and that repeats of B trials often required a change of rule (e.g., B1 preceded by B3). These B rule changes may have produced switch costs similar to those following a switch from A to B.
Overall performance was correlated with a measure of fluid intelligence in the subgroup of participants who were tested on Cattell's Culture Fair Test (Institute for Personality and Ability Testing, 1973; Cattell, 1971). These results show that lower fluid intelligence is correlated with a greater frequency of errors on multirule tasks and replicate the findings observed by Duncan et al. (1996, 2008) in a range of paradigms.
Assembly of the Task Model
The first aim of the neuroimaging aspect of this experiment was to investigate the neural substrate of learning new task instructions, that is, the construction of a new task model. A wide network of regions was recruited during the presentation of instructions compared to following delay periods. In sum, these activations may reflect a combination of visual and linguistic processing, task rule comprehension, and task model construction. In line with visual and linguistic processing, extensive activation was seen in bilateral occipital cortex extending into superior parietal cortex, and, especially in the left hemisphere, in the middle temporal gyrus. In addition, activity was seen in several parts of the MD system, in particular, in the bilateral IFS/IPS and the pre-SMA, and, to a lesser extent, in right RPFC and AI/FO. MD activity may reflect task model construction, but could also, to some extent, be driven by visual and linguistic processes. A control test provides some evidence against the latter alternative, as unlike early visual activity, MD activity was not tightly linked to the quantity of visual input on different instruction screens. Overall, the results suggest that, as new instructions are received, MD regions are involved in converting these instructions into a new functional task model. A comparison of interest for future studies would be to compare the learning of a new set of task rules to the retrieval of previously learned task rules. This would permit the differentiation of neural processes associated with task model construction and with task model retrieval and preparation, while controlling for visual and linguistic processing.
Our main goal in this study was to address the consequences of increased model complexity. To this end, we examined changes in brain activity across successive task instructions. Although the peak response to each new instruction was relatively stable, BOLD signal during the intervening delay periods increased with successive instructions (Figure 6). This increase was negatively accelerated, approaching an asymptote following the first two or three rules. A significant linear increase in BOLD signal over the five successive full instructions during the delay period was observed in a subset of MD regions: the bilateral AI/FO, right RPFC, right IFS, ACC, and pre-SMA, with additional trends elsewhere. In several MD regions, the approach of delay activity to an asymptote was reflected in a significantly greater increase over B rules for reduced compared to full instructions.
Two results link this increased delay activity to the behavioral consequences of task model complexity. First, smaller baseline changes for later rules are reminiscent of the finding that, in behavior, it is the later rules that are likely to be neglected (Duncan et al., 1996, Experiment 3). Plausibly, smaller baseline changes for later rules reflect the weaker impact these rules may have on the developing task model. Second, the results suggest that, when baseline increases are especially steep, later performance tends to be poor (Figure 7). Steep increases in delay activity may indicate relatively poor or ineffective model development.
Together, these results link task model complexity to widespread activity of the MD network. As new task instructions are received, there is a progressive increase in sustained or baseline activity in many MD regions. At least in part, performance limits linked to task model complexity may be mediated through MD activity.
Recruitment of the MD Network during Task Performance
The second aim of this experiment was to investigate brain activity during new task execution. Whole-brain analyses revealed quite distinct patterns of sustained activations during task blocks and transient activity linked to individual trials. The MD network, lateral occipital regions, and premotor cortex exhibited a strong transient response to all trials, but little sustained response to task blocks (only significant in the left IFS ROI). Sustained responses were observed instead in anterior dorsal PFC regions not typical of the MD network, as well as in medial occipital cortex.
In contrast to widespread effects of task complexity during instructions, only restricted effects were observed during task execution. Most striking was the increased activity in Full B compared to Red B blocks observed in the left IPS (Figure 8) and left dorsal RPFC (Supplementary Figure 2). Again, it is noteworthy that, for these two types of blocks, the task actually performed was identical. Differential fronto-parietal activity suggests some sustained processing demand linked to a more complex overall task model. Inconsistent changes in sustained activity were associated with the contrast of Full B with Full AB blocks. More complex task blocks were associated with increased activity in the pre-SMA (Figure 8), but with decreased activity in dorsal RPFC (Supplementary Figure 2), where block-related activity was overall the strongest (Supplementary Figure 2, Figure 8).
In contrast to complexity effects in block-related activity, transient activity related to individual trials appeared to be relatively constant, showing little influence of either model or block complexity. Just as the cognitive requirements of individual trials were fixed across block types, so too was trial-related neural activity.
Our results contrasting sustained and transient effects of block complexity may be compared with those described by Braver et al. (2003). In their experiment, there was a sustained effect of mixing tasks in RPFC and a transient effect of mixing tasks in DLPFC. The analogous comparison in the current study was between Full AB and Full B blocks, but here we found neither sustained nor transient differences in the MD IFS or RPFC ROIs. In dorsal RPFC (Supplementary Figure 2), results were actually opposite to those of Braver et al., with a decrease in sustained activity for AB blocks. Our failure to replicate the results of Braver et al. might be due to differences between the designs of the two studies. Indeed, Braver and colleagues' study produced switching costs during the mixed-task blocks which, in the current experiment, were modest in accuracy and absent in RT. One possible interpretation of this difference is that, as participants were presented with the A1, A2, B1, B2, B3 rules in succession, they considered the set of rules as a whole, with a possible B rules subset, rather than as two separate tasks as in typical task switching paradigms.
Differences between MD Regions
The different measures of BOLD signal change obtained in this experiment, that is, sustained or transient response to instructions or to task performance, revealed that the regions forming the MD network did not behave in a completely homogeneous way. A separation of the MD network into distinct systems has previously been proposed and studied by Dosenbach et al. (2006, 2007, 2008). Results from a meta-analysis of a large number of studies (Dosenbach et al., 2006) and the analysis of resting state functional connectivity (Dosenbach et al., 2007) suggest the presence of two distinct task-control networks rather than a unitary system. A fronto-parietal network including the IFS and IPS is found to respond to start-cue and error-related activity, and is proposed to initiate and adapt control on a trial-by-trial basis. Transient response to individual task events can also be seen in the second network, including the AI/FO, RPFC, and dorsal anterior cingulate/medial superior frontal cortex; in this network, however, there is a stronger pattern of sustained activity across entire task epochs. Accordingly, this second network is proposed to control goal-directed behavior through the stable maintenance of task sets.
Our results are only partially consistent with this distinction. During the instructions period, strong phasic responses to individual instructions were seen in the IFS, IPS, and pre-SMA. Sustained increase across delay periods following individual instructions was seen across much of the MD network. During task performance, the left IFS showed both sustained and transient activity; left dorsal RPFC showed sustained activity only; the right IFS, and bilateral IPS, AI/FO, premotor cortex, and pre-SMA showed a transient response only. Combining these results with those of Dosenbach et al. (2006), Braver et al. (2003), and Sakai and Passingham (2003), the most consistent result is perhaps that RPFC exhibits relatively weak transient activity patterns, with more reliable sustained activation. For other regions such as the AI/FO and pre-SMA, however, the pattern across conditions and studies is less consistent, with the relation between sustained and transient activity still to be clarified.
Interestingly, ACC showed a very different pattern of activity from the other ROIs, with little sustained response and no transient response to task performance. The pattern of sustained task-related activation observed in ACC was, in fact, the opposite of that observed in the other MD regions, with reduced sustained BOLD signal when task model complexity was greater.
Together, our results show a broad role of fronto-parietal MD regions in comprehension, assembly, and use of new task rules. During task instructions, specific MD regions (IFS, IPS, pre-SMA) show phasic response as each new rule is read and encoded. In a wider range of MD regions, task complexity is reflected in increased sustained activity as rules are added. During performance, MD regions exhibit mostly trial-related activity, with sustained activity restricted to anterior parts of PFC. Only sustained activity, however, is sensitive to overall task complexity. Capacity limits in learning and using new task rules are closely associated with fluid intelligence. Such limits, we suggest, may be reflected in load-sensitive activity of fronto-parietal MD regions.
This work was funded by the Medical Research Council (UK) intramural program U.1055.01.001.00001.01; I. D. was funded by the Fyssen Foundation.
Reprint requests should be sent to Dr. Iroise Dumontheil, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK, or via e-mail: firstname.lastname@example.org.