Abstract
People often have good intentions but fail to adhere to them. Implementation intentions, a form of strategic planning, can help people to close this intention–behavior gap. Their effectiveness has been proposed to depend on the mental formation of a stimulus–response association between a trigger and target behavior, thereby creating an “instant habit.” If implementation intentions do indeed lead to reliance on habitual control, then this may come at the cost of reduced behavioral flexibility. Furthermore, we would expect a shift from recruitment of corticostriatal brain regions implicated in goal-directed control toward habit regions. To test these ideas, we performed an fMRI study in which participants received instrumental training supported by either implementation or goal intentions, followed by an outcome revaluation to test reliance on habitual versus goal-directed control. We found that implementation intentions led to increased efficiency early in training, as reflected by higher accuracy, faster RTs, and decreased anterior caudate engagement. However, implementation intentions did not reduce behavioral flexibility when goals changed during the test phase, nor did they affect the underlying corticostriatal pathways. In addition, this study showed that “slips of action” toward devalued outcomes are associated with reduced activity in brain regions implicated in goal-directed control (ventromedial prefrontal cortex and lateral orbitofrontal cortex) and increased activity of the fronto-parietal salience network (including the insula, dorsal anterior cingulate cortex, and SMA). In conclusion, our behavioral and neuroimaging findings suggest that strategic if–then planning does not lead to a shift from goal-directed toward habitual control.
INTRODUCTION
At the start of the new year, many people reflect on their future plans and form resolutions. However, they often fail to put their good intentions into practice (Sheeran & Webb, 2016). Strategic “if–then” plans, also known as implementation intentions, are an effective way to support the translation of intentions to actions. For example, instead of formulating an abstract plan such as “I want to lose weight,” an implementation intention links the intended action to a specific cue or situation, for example, “If I get home, I will eat an apple,” thereby enhancing the probability of success. Indeed, many studies have shown that implementation intentions support behavior change better than goal intentions that merely specify the intended action or outcome (Gollwitzer & Sheeran, 2006). In addition to increasing attention to the relevant cue, the effectiveness of if–then planning has been proposed to rely on creating a strong associative link between the stimulus (S) in the if-part (“home”) and the response (R) in the then-part (eat an apple), in a manner akin to habits acquired through behavioral repetition (Dickinson, 1985; Thorndike, 1911). These mentally formed S–R associations may allow for automatic action initiation (Gollwitzer, 2014)—a process often referred to as strategic automaticity or “instant habits” (Gollwitzer, 1993, 1999, 2014).
The notion that merely using a verbal action-plan could be sufficient to form a habit is fascinating, because a central assumption in theories of habit formation is that this process critically depends on behavioral repetition. Support for the idea that implementation intentions accelerate habit formation comes from research showing that they increase (self-reported) automaticity (Orbell & Verplanken, 2010; Parks-Stamm, Gollwitzer, & Oettingen, 2007; Brandstätter, Lengfelder, & Gollwitzer, 2001). Therefore, implementation intentions lead to benefits in terms of efficient goal attainment (Gollwitzer, 2014; Gollwitzer & Sheeran, 2006). However, habits developed through behavioral repetition also come at a cost, namely, decreased behavioral flexibility (Dickinson, 1985). The question arises, therefore, whether the use of implementation intentions also leads to decreased flexibility when goals change. This can be investigated using the outcome-devaluation test, an experimental paradigm originally used in rats (Adams & Dickinson, 1981) and later translated to humans (de Wit, Corlett, Aitken, Dickinson, & Fletcher, 2009; de Wit, Niry, Wariyar, Aitken, & Dickinson, 2007; Valentin, Dickinson, & O'Doherty, 2007). In this task, participants first learn to make a response to obtain a reward. Subsequently, the outcome associated with that response is devalued, and the ability to flexibly adapt responding to this change in outcome value is measured during an extinction test. Sensitivity to outcome devaluation suggests that behavior is based on knowledge and evaluation of its consequences, and is therefore under goal-directed control. If implementation intentions lead to “instant habits,” then we would predict reduced sensitivity to outcome devaluation, reflecting a shift from goal-directed toward more rigid, habitual control (de Wit et al., 2018; Balleine & O'Doherty, 2010).
We have previously tested this hypothesis (van Timmeren & de Wit, 2022), using a computerized symmetrical outcome-revaluation task (SORT; Watson, Gladwin, Verhoeven, & de Wit, 2022). Participants learn to make a response (go) to certain ice cream vans to collect valuable ice creams (and points) or to withhold a response (no-go) to other ice cream vans delivering nonvaluable ice creams (and a reduction of points). To investigate the effect of if–then planning, we instructed them to use verbal implementation intentions for half of the stimuli and use goal intentions for the other half. In the subsequent test phase, some outcome values changed (i.e., outcome revaluation). Whereas participants should continue to respond according to the learned S–R mappings on value-congruent trials (i.e., still-valuable and still-not-valuable), they should flexibly adjust their behavior on value-incongruent trials (i.e., devalued and upvalued). The results of this previous study suggest that the use of implementation (compared with goal) intentions facilitates instrumental learning, but also impairs performance when some of the signaled outcome values change during the test phase (van Timmeren & de Wit, 2022). This detrimental effect of if–then planning was observed across value-congruent and incongruent trials, suggesting that it was not mediated by strengthened S–R associations (as this would have impacted the value-incongruent trials specifically). Instead, this result may have been driven by reduced goal-directed control. Investigating the neural processes underlying implementation intentions may offer us a window on the underlying (goal-directed vs. habitual) processes.
To this end, in the present study, we used fMRI to investigate the neural correlates of if–then planning of instrumental responses on the SORT. We capitalized on current insights regarding the neural basis of goal-directed and habitual control to investigate the notion that if–then planning gives rise to “instant habits.” Decades of animal research have provided detailed insights into the neurobiology of goal-directed and habitual actions, demonstrating that they are causally supported by anatomically distinct but interacting corticostriatal systems (Balleine, 2019; Balleine & O'Doherty, 2010; Yin, Knowlton, & Balleine, 2004). These findings are mirrored by (correlational) neuroimaging evidence in humans, albeit less consistently. Specifically, previous fMRI studies have found that goal-directed control is supported by the ventromedial prefrontal cortex (vMPFC) and caudate whereas outcome-insensitive habitual actions depend on the premotor cortex and posterior putamen/dorsal striatum (Watson, van Wingen, & de Wit, 2018; Delorme et al., 2016; Morris, Quail, Griffiths, Green, & Balleine, 2015; de Wit et al., 2012; Tricomi, Balleine, & O'Doherty, 2009; Valentin et al., 2007).
The present study is the first fMRI investigation with the SORT, and we will therefore start with specifying our predictions regarding the general pattern of neural activity independent of intentions. First, we expected that over the course of training (i.e., habit acquisition) activity would increase in regions associated with habitual control whereas the involvement of regions implicated in goal-directed control would decrease (Zwosta, Ruge, Goschke, & Wolfensteller, 2018; Liljeholm, Dunne, & O'Doherty, 2015; Tricomi et al., 2009). Second, we expected neural activity during training in these regions to be predictive of revaluation insensitivity in the test phase (Watson et al., 2018; Zwosta et al., 2018; Liljeholm et al., 2015; de Wit et al., 2009). Third, in line with previous work (Watson et al., 2018; Valentin et al., 2007), we hypothesized that, in the test phase, we would find higher activity in areas implicated in goal-directed action, cognitive control, and response conflict when participants flexibly updated their responses and equal (if anything reduced) activity in habit-related regions. Finally, we expected that “slips of action” would be associated with higher activity in habit regions and reduced activity in goal-directed regions (Watson et al., 2018).
Our central aim was to investigate the neural basis of implementation intentions and their effect on behavioral flexibility. To this end, we measured neural activity related to the effect of implementation intentions on acquisition and flexible adjustment of instrumental actions on the SORT. We hypothesized that the use of implementation intentions (compared with goal intentions) during training would lead to increased habit acquisition as reflected by higher accuracy, increased automaticity (measured with the Self-Reported Behavioral Automaticity Index; Gardner, Abraham, Lally, & de Bruijn, 2012), and increased brain activity in habit regions and equal—or if anything reduced—activity in goal-directed regions. Moreover, we expected if–then planning to lead to increased reliance on previously formed S–R associations in the subsequent test phase as indicated by inflexible, habitual responding on value-incongruent compared with value-congruent trials, and higher activity of habit regions during the test phase. Finally, we expected that overcoming mentally rehearsed S–R associations (as part of an if–then plan) would require more goal-directed control and correspondingly engage related neural regions.
METHODS
All operationalizations, exclusion criteria, and main hypotheses and analyses were preregistered on Open Science Framework (https://osf.io/yrpxa).
Participants
Participants were recruited through the participant portal of the University of Amsterdam website, flyers, and word of mouth. We used the following inclusion criteria: age 16–35 years, not having participated in a previous study using this same task, and no contraindications for MRI. Data collection took place between July and November 2020. Note that this was during the first year of the COVID-19 outbreak; however, no strict lockdowns were implemented during this period in The Netherlands. The study was approved by the Psychology ethics committee of the University of Amsterdam and performed in accordance with its guidelines. All participants gave informed consent and received either course credit or financial compensation (15 €/hr) for their time (total ∼2 hr). An additional €20 voucher was given to the participant with the highest score to motivate participants to perform well on the task.
Forty-seven participants were enrolled, conforming to our preregistered sampling plan. Our sample size was based on a previous pilot study, which found a significant effect of implementation intentions in 35 participants using the same task and manipulation. Moreover, a power analysis with G*Power (Version 3.1.9.3) showed that our target sample size of n = 40 should be sufficient to detect a small behavioral effect (f = 0.12) with an α level of .05 and power of .8. Six participants were excluded from all analyses. One participant quit half-way through participation, and five participants were excluded based on performance exclusion criteria (see Results for details). The remaining 41 participants (22 women, 19 men) had a mean age of 23.2 (SD = 4.1) years. All participants had normal or corrected-to-normal vision, and all were right-handed except one who was ambidextrous. All participants were free of neurological or psychiatric disorders and completed or were enrolled in higher professional education at the time of participation, the vast majority being university students. Two participants were native Germans who spoke Dutch fluently; all others were native Dutch speakers.
Stimuli and Materials
Procedure
Participants performed a computerized instrumental learning task called the SORT (Figure 1; Watson, Gladwin, et al., 2022), programmed in Presentation (Version 18.1). Participants played a hungry skateboarder with the objective to collect ice creams to earn points and satisfy their hunger by pressing a response button. They were informed that the best performing participant at the end of the study would receive a €20 voucher. Four pictures of ice creams were used: a Cornetto, a Magnum, a Rocket ice lolly, and a soft serve ice cream. The task consisted of three phases. First, participants conducted an instrumental training phase without strategic planning outside the scanner, after which they were moved to the MRI scanner and performed an instrumental training phase with strategic planning followed by a test phase (see Figure 1). The symmetrical nature of the task stems from the inclusion of both valuable and nonvaluable outcomes, which allows comparisons in the test phase (when outcome values change) between the value-congruent and value-incongruent conditions to be made with the same response type (see Watson, Gladwin, et al., 2022, for a more elaborate discussion on the advantages of this task). The total experiment took ∼2 hr, of which 1 hr was spent in the scanner.
The task used here is almost identical to the one used in a previous study in which we tested the same hypothesis behaviorally (van Timmeren & de Wit, 2022), apart from the following changes. To minimize head movements, we used a static version of the task here instead of having ice cream trucks moving across the screen. We added one block of practice with strategic planning before participants were moved to the scanner, so that they could read the intentions out loud once and ask questions. Moreover, we adapted the task to promote stimulus–outcome (S-O) learning across intention conditions, to rule out that any effect of implementation intentions on behavioral flexibility would be mediated by reduced contingency knowledge, as was the case in the original behavioral study (van Timmeren & de Wit, 2022). To this end, we changed the way in which the blocks were composed in the first part of training (i.e., without intentions): Instead of alternating between two sets of four ice cream vans, each block now contained four (out of eight) pseudorandomly selected stimuli (see Instrumental Training section for details). Compared with the alternating sets, this composition forced participants to pay attention to all outcomes in the value-screen and to evaluate for which stimulus they should (not) make a response.
Instrumental Training
At the start of the task, participants were instructed that their goal was to collect valuable ice creams (which earn points and alleviate hunger) and avoid collecting nonvaluable ice creams (which lose points and cause stomach pain) by (not) responding to ice cream vans. There were four different ice creams, and before each block of instrumental training, participants were shown which two ice creams were valuable (in green) and which two ice creams were not valuable (in red; Figure 1A). The position of the valuable and nonvaluable ice creams (left/right) was counterbalanced across participants. Each ice cream was associated with two out of eight vans (Figure 1B): one van always predicting this ice cream as being valuable and the other as being nonvaluable. Each block contained only half of the vans: two associated with a valuable ice cream and two with a nonvaluable ice cream. Participants were told to find out by trial and error which ice cream truck delivered which ice cream, and that the S-O contingencies would remain the same throughout the whole task. Participants first practiced with different discriminative stimuli (scooters) and outcomes (pizzas) for two blocks to familiarize them with this procedure. As mentioned previously, the composition of the blocks (i.e., which four out of eight vans were presented during this block) was now pseudorandomized. The conditions described above allow for six unique combinations of four vans, which were presented twice each (order randomized) during this first part of training for 12 blocks. The contingencies between ice creams and vans and which of the ice creams was valuable/nonvaluable were randomized across participants.
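The count of six block compositions follows from choosing which two of the four ice creams are valuable in a block. The following is a minimal sketch of that enumeration, not the authors' Presentation code. The van labels (`*_go_van`, `*_nogo_van`) and the function name are hypothetical, and the sketch assumes that the van shown for an ice cream is fully determined by whether that ice cream is valuable in the block.

```python
from itertools import combinations

# The four outcomes named in the task description.
ICE_CREAMS = ["Cornetto", "Magnum", "Rocket", "SoftServe"]

# Hypothetical van labels: per ice cream, one van that only appears when the
# ice cream is valuable ("go" van) and one that only appears when it is not.
GO_VAN = {ice: f"{ice}_go_van" for ice in ICE_CREAMS}
NOGO_VAN = {ice: f"{ice}_nogo_van" for ice in ICE_CREAMS}


def block_compositions():
    """Return every admissible set of four vans for one training block."""
    blocks = []
    for valuable in combinations(ICE_CREAMS, 2):
        nonvaluable = [ice for ice in ICE_CREAMS if ice not in valuable]
        vans = [GO_VAN[i] for i in valuable] + [NOGO_VAN[i] for i in nonvaluable]
        blocks.append({"valuable": valuable, "vans": vans})
    return blocks
```

Under this reading, presenting each of the six compositions twice yields the 12 blocks of the first training phase, with every van appearing in half of them.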
Each stimulus was shown 4 times per block, constituting 16 trials. Trial order was randomized per eight trials, with each van being presented twice in the first and twice in the second half of a block. Each trial started with a jittered 1- to 5-sec intertrial interval. Participants were instructed that they should respond as quickly as possible and before the deliverer disappeared (after 500 msec). Irrespective of the response, the associated outcome was then presented for 500 msec. Thus, participants did not receive direct feedback about the accuracy of their response to balance the feedback provided for valuable and nonvaluable outcomes and to promote goal-directed (R-O) learning and S-O knowledge. Each block ended with a 3-sec feedback screen that displayed accuracy and late responses in that block and total number of points collected (Figure 1D).
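The trial structure described above can be illustrated with a short sketch. It is an assumption-laden illustration rather than the original task code: the function names are hypothetical, and the uniform distribution for the intertrial interval is an assumption, since the text only gives the 1- to 5-sec range.

```python
import random


def block_trial_order(vans, rng=None):
    """Return a 16-trial order: each of four vans twice per half-block."""
    if rng is None:
        rng = random.Random()
    order = []
    for _ in range(2):           # first and second half of the block
        half = list(vans) * 2    # each van twice -> eight trials
        rng.shuffle(half)        # randomize within the eight-trial half
        order.extend(half)
    return order


def jittered_iti(rng=None, low=1.0, high=5.0):
    """Draw one intertrial interval in seconds (uniform draw is an assumption)."""
    if rng is None:
        rng = random.Random()
    return rng.uniform(low, high)
```

Shuffling within halves rather than across the whole block is what guarantees that each van occurs twice in the first and twice in the second eight trials.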
Instrumental Training with Intentions
The next phase of training took place in the MRI scanner. Participants were told that instead of seeing which ice creams were valuable or nonvaluable, each block would now start with sentences that would help them perform well. These sentences came in two different forms (Figure 1D). Goal intentions indicated for each ice cream whether they should make a response (R-O), formulated as “If I see [picture of an ice cream], then I WILL press.” Implementation intentions indicated for each ice cream van whether they should make a response or not (S–R), formulated as “If I see [picture of an ice cream van] then I WILL (NOT) press.” Each intention was presented for 2500 msec and twice per intention block (randomized order). Half of the stimuli were trained using goal intentions and the other half using implementation intentions. Each block of verbal intentions was directly followed by a block of instrumental training (identical to the previous phase) with the corresponding stimuli. Blocks now alternated between two sets of vans, one van-set being trained with implementation intentions (S1–S4, “Van-Set A”) and one with goal intentions (S5–S8, “Van-Set B”). Whether the training started with an implementation or goal intention block was counterbalanced across participants. At the end of regular instrumental training and before being moved to the scanner, participants practiced each verbal intention without instrumental training for one block, followed by two blocks (one for each intention type) with instrumental training. During these first few practice blocks outside the scanner, participants were asked to read the intentions out loud. During the subsequent 24 blocks of training with intentions in the scanner, participants were instructed to subvocalize the intentions instead of reading them out loud to minimize head motion.
Participants entered the scanner in a head-first supine position and were able to view the screen using a mirror attached to the head coil on which the task stimuli were presented. A button box allowed them to collect ice creams by responding using their right index finger.
At the end of training with intentions, participants completed a questionnaire on subjective automaticity (Self-Report Behavioral Automaticity Index [SRBAI]) and were tested on their S-O knowledge (details below; Figure 1E). We had planned to additionally obtain a (pre-intention) baseline measure of these questionnaires, but because of a programming error, they were presented after the practice blocks with intentions, making them unusable as a baseline measure.
Test Phase
Participants completed six test blocks. The test phase was similar to the first training phase (without intentions), but with some important differences. First, as intention blocks were no longer presented, value-screens were again shown at the start of each block, for a duration of 4 sec. Second, participants were told that the ice cream deliverers had placed a banner on top of their van, blocking the view of the ice cream they delivered (i.e., nominal extinction). Because each van still delivered the same ice cream as during training, they should base their choice on what they had learned before. Third, the feedback screens presented at the end of each block no longer included information on the accuracy of their responses, but only the percentage of responses, nonresponses, and late responses. We did this to prevent outcome-based learning during the test phase. We explicitly instructed participants that each block contained an equal number of valuable and nonvaluable outcomes so they knew they should aim for a 50%/50% distribution. Fourth, we shortened the response window to 450 msec to force rapid responding, which has been shown to boost the expression of habitual slips (Hardwick, Forrence, Krakauer, & Haith, 2019). However, because many participants responded just after the 450-msec time limit, we decided to include responses up to 600 msec for both the behavioral and fMRI analysis to increase the number of included trials in the fMRI analyses. This change did not significantly impact the pattern of behavioral results, which was unsurprising as the test phase was conducted in extinction, meaning that no performance feedback was provided during this period. Finally and crucially, participants were informed that the final phase would be more challenging because all eight ice cream vans would appear intermixed during each block.
The crucial consequence of each block containing all eight stimuli is that half of the vans would now deliver an ice cream with a value incongruent with the value during training. Some ice cream vans for which they had been trained to always make a go response during training now delivered a (devalued) ice cream that should not be collected. Vice versa, other ice cream vans had carried nonvaluable outcomes during training, but their signaled outcome was upvalued and therefore required a go response. On other (value-congruent) test trials, the signaled outcome value remained the same (i.e., still-valuable and still-not-valuable trials).
Consider for example the Rocket ice cream in Figure 1C. In this example, during training, this ice cream is always delivered by the van with a purple star and the van with the pink circle. During training blocks with the van with the purple star, the Rocket is valuable and therefore requires a go response. In contrast, during training blocks with the van with the pink circle, the Rocket is not valuable, and participants should refrain from pressing the space bar (i.e., no-go response). Subsequently, during the test block, all (ice cream van) stimuli are presented, and in the example illustrated in Figure 1C, the Rocket is instructed to be currently not valuable. This means that the van with the purple star signals a devalued outcome (i.e., this is value-incongruent with training and requires a different response), and the van with the pink circle signals a still-not-valuable outcome (i.e., value congruent; the learned response remains correct).
SRBAI
The SRBAI (Gardner et al., 2012) is a 4-item scale that captures self-reported habitual behavior patterns, which we adapted to assess automaticity of (not) responding to the ice cream vans. Participants were presented with each ice cream van and asked to indicate the associated response (press or not press) and the degree to which (not) making a response was something they did: “automatically,” “without having to consciously remember,” “without thinking,” and “before I realize I am doing it.” Each item was scored on a scale ranging from 1 (strongly disagree) to 100 (strongly agree). The SRBAI scale was previously shown to have good reliability and validity (Gardner et al., 2012). Before the four SRBAI items appeared, participants were asked to indicate which response was associated with that stimulus (“making a response” / “not making a response”) to test S–R knowledge. Cronbach's alpha was calculated separately for each of the four conditions (2 intentions × 2 values), using the eight test items (four SRBAI questions for the two stimuli per condition). The results indicate high internal reliability, with alpha ranging from .91 to .95. The final score was calculated separately for each intention by taking the mean across the four items (range: 1–100), with higher scores reflecting more automatic behavior.
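For readers unfamiliar with the reliability statistic, the sketch below computes Cronbach's alpha from a participants × items table using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of summed scores). It is an illustration only; the analyses reported here were run in SPSS and JASP, and the function name and input layout are assumptions.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of participants, each a list of k item ratings."""
    k = len(scores[0])
    # Sample variance of each item across participants.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each participant's summed score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

In the design described above, each row would hold the eight ratings of one participant for one condition (four SRBAI items for each of the two stimuli).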
Test of Stimulus–Outcome Knowledge
Participants were asked about their knowledge of the S-O contingencies by asking them, for each ice cream van, which ice cream it delivered. After selecting one of the four ice creams, participants were asked to indicate how confident they were about their decision (0–100). Composite scores, reflecting S-O knowledge, were calculated for each intention and separately for go- and no-go-trained stimuli by multiplying the percentage of correct S-O contingencies (0%/50%/100%) by the mean confidence percentage.
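As a worked illustration of the composite score, the sketch below multiplies the proportion of correctly identified contingencies by the mean confidence rating. Two details are assumptions not stated in the text: that confidence is averaged over both vans of a condition regardless of correctness, and that the result is expressed on a 0–100 scale. The function name is hypothetical.

```python
def so_composite(correct, confidence):
    """Composite S-O knowledge score for one condition (two vans).

    correct: list of bools, one per van (contingency identified correctly).
    confidence: list of 0-100 confidence ratings, one per van.
    """
    proportion_correct = sum(correct) / len(correct)      # 0, 0.5, or 1 for two vans
    mean_confidence = sum(confidence) / len(confidence)   # assumed: mean over both vans
    return proportion_correct * mean_confidence           # assumed 0-100 scale
```

For example, one correct answer out of two with confidence ratings of 80 and 60 gives 0.5 × 70 = 35.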
Preregistered Behavioral Data Analysis
Behavioral data analyses were performed using IBM SPSS Statistics 25 for Mac for frequentist statistics and JASP Version 0.16.3 (JASP Team, 2018) for Bayesian statistics. For data analysis purposes, the training data were collapsed across blocks of three, referred to as block-sets. Accuracy is the percentage of trials on which a correct response was made, calculated as the number of correct responses divided by the total number of trials. In line with the fMRI analyses, trials on which a late response was made were not included in the analyses (of both accuracy and RTs). To assess whether learning took place over the first part of the training without intentions, accuracy was analyzed using a 2 × 4 repeated-measures ANOVA with within-subject factors Value (valuable or nonvaluable) and Block-set (1–4). The second part of training was analyzed using a 2 × 2 × 4 repeated-measures ANOVA, with Intention Type (implementation or goal intention) as an additional factor. RTs for correct responses (and thus only for valuable go trials) were analyzed with similar ANOVAs.
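The accuracy measure and the collapsing into block-sets can be sketched as follows. This is an illustrative reconstruction, not the authors' SPSS syntax; the trial record fields (`block`, `correct`, `late`) and the function name are hypothetical.

```python
def accuracy_by_block_set(trials, blocks_per_set=3):
    """Percentage correct per block-set, excluding late-response trials.

    trials: list of dicts with 'block' (1-based int), 'correct' (bool), 'late' (bool).
    """
    counts = {}
    for t in trials:
        if t["late"]:
            continue  # late responses are excluded from accuracy
        block_set = (t["block"] - 1) // blocks_per_set + 1
        n_correct, n_total = counts.get(block_set, (0, 0))
        counts[block_set] = (n_correct + int(t["correct"]), n_total + 1)
    return {s: 100.0 * c / n for s, (c, n) in sorted(counts.items())}
```

With 12 blocks per condition and three blocks per set, this yields the four block-sets that enter the ANOVAs as the Block-set factor.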
For the test phase, data were analyzed using a 2 × 2 × 2 repeated-measures ANOVA with three factors: Intention Type (implementation or goal intention), Test Value (valuable or nonvaluable during test), and Congruency (congruent or incongruent with value during training). Thus, for each intention type there are four conditions: still-valuable trials (valuable, congruent), upvalued trials (valuable, incongruent), still-not-valuable trials (nonvaluable, congruent), and devalued trials (nonvaluable, incongruent). Again, RTs (including all responses up to 600 msec) were analyzed using similar ANOVAs but now also analyzing responses on no-go trials (i.e., responses on still-not-valuable and devalued trials). Note that eight participants were excluded from the no-go analyses because they performed perfectly on still-not-valuable trials and thus did not make any response.
Subjective automaticity (SRBAI scores) for responding to stimuli trained with implementation and goal intentions at the end of training was compared using a paired t test. Finally, the relationship between automaticity and the “revaluation insensitivity” index was tested for both intention types separately using correlational analyses. A revaluation insensitivity index was calculated for each intention type by taking the difference between accuracy for congruent and incongruent test trials separately for go (still-valuable minus devalued) and no-go-trained stimuli (still-not-valuable minus upvalued), with higher revaluation insensitivity scores indicating more habitual performance. Kendall's tau was used because the four revaluation indices and the SRBAI scores were not normally distributed. In the case of violations of sphericity, we report Greenhouse–Geisser corrected degrees of freedom and p values. In addition to 95% confidence intervals, partial eta squared (ηp2) for the ANOVAs and Cohen's d for paired t tests are reported as estimates of effect sizes.
We additionally conducted corresponding Bayesian analyses. For null results (p > .05), as preregistered, we report the Bayes Factor01 (BF01), which quantifies the relative evidence in favor of the null hypothesis (H0) over the alternative hypothesis (H1). For ANOVAs, we report the BFexcl, which quantifies the extent to which the data support exclusion of the factor of interest from the model (i.e., the change from prior to posterior exclusion odds, across matched models). Finally, although we interpret significant findings on the basis of p < .05, we also report BFs for comprehensiveness and transparency (i.e., BF10, or BFincl for ANOVAs, which quantify evidence in favor of the alternative hypothesis over H0 and are identical to 1/BF01 and 1/BFexcl, respectively). BFs were interpreted according to Table 1 in Wetzels and colleagues (2011), with BFs between 1 and 3 reflecting anecdotal support, BFs larger than 3 reflecting substantial support, and BFs larger than 10 reflecting strong support. In all Bayesian analyses, JASP's default priors (Cauchy scale = 0.707 for t tests and r = 0.5 for fixed and r = 1 for random effects for ANOVAs) were used.
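The revaluation insensitivity index is a simple difference score. A minimal sketch for one intention type is given below; the dictionary keys and the function name are hypothetical labels for the four test conditions.

```python
def revaluation_insensitivity(acc):
    """Difference scores (congruent minus incongruent accuracy, in %).

    acc: dict with accuracy for 'still_valuable', 'devalued',
    'still_not_valuable', and 'upvalued' test trials of one intention type.
    """
    return {
        "go_trained": acc["still_valuable"] - acc["devalued"],
        "nogo_trained": acc["still_not_valuable"] - acc["upvalued"],
    }
```

Computing this for both intention types gives the four indices mentioned above, with larger values indicating that responding followed the trained S–R mapping rather than the current outcome value.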
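The reciprocal relation between the Bayes factors and the three interpretation thresholds quoted above can be written out as a small helper. This is an illustrative sketch only: the function names are hypothetical, the labels cover just the three categories named in the text, and values at or below 1 are flagged rather than interpreted.

```python
def bf10_from_bf01(bf01):
    """Evidence for H1 over H0 is the reciprocal of evidence for H0 over H1."""
    return 1.0 / bf01


def evidence_label(bf):
    """Label a Bayes factor using the thresholds quoted in the text."""
    if bf <= 1:
        return "no support in this direction (inspect the reciprocal)"
    if bf <= 3:
        return "anecdotal"
    if bf <= 10:
        return "substantial"
    return "strong"
```

For example, a BF01 of 4 would be read as substantial support for the null hypothesis and corresponds to a BF10 of 0.25.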
MRI Data Acquisition
All MRI scans were acquired on a 3-Tesla, full-body Achieva dStream MRI-scanner (Philips Medical Systems) equipped with a 32-channel head coil. After participants entered the scanner, a low-resolution survey scan was made to determine the location of the field of view.
fMRI scans were acquired at a ∼30° angle from the anterior–posterior commissure line to maximize signal sensitivity in orbital regions (Deichmann, Gottfried, Hutton, & Turner, 2003) using a T2*-weighted single-shot gradient echo imaging sequence with the following parameters: repetition time = 2000 msec; echo time = 28 msec; flip angle = 76.1°; voxel size = 3 mm3 with 0.3-mm slice gap; matrix size = 80 × 78; number of slices = 36; field of view = 240 × 118.5 × 240 mm. The training with intentions was split in two runs of 598 scans each, whereas 415 scans were acquired for the test phase. The first six volumes of each run were discarded to allow T1 saturation to reach equilibrium.
A high-resolution T1-weighted structural image was acquired before the final run (while participants completed the post-training SRBAI and SO-test) using an MPRAGE sequence with the following parameters: voxel size = 1 mm3; field of view = 240 × 220 × 188 mm; repetition time = 8.2 msec; echo time = 3.7 msec, 220 slices, flip angle = 8°.
fMRI Data Analysis
Image Preprocessing
MRI data were first converted to Brain Imaging Data Structure format using in-house scripts. An initial check of data quality was done by visually inspecting the image-quality metrics derived from MRIQC v0.15.0 (Esteban et al., 2017). Data were preprocessed using fMRIPrep v20.1.1 (Esteban et al., 2019; RRID:SCR_016216), which is based on Nipype 1.5.0 (Gorgolewski et al., 2011; RRID:SCR_002502), with the default processing steps. These included brain extraction, segmentation, and surface reconstruction of the structural T1 image; spatial normalization of both the structural and functional data to MNI space; and head motion estimation, coregistration, susceptibility distortion correction, and resampling to 2 mm3 of the functional data. No slice-timing correction was performed. A comprehensive description of the preprocessing pipeline is available here: https://osf.io/72bsh.
fMRI Statistical Analyses
The preprocessed functional data were further analyzed using Statistical Parametric Mapping software (SPM12, Wellcome Trust Centre for Neuroimaging). The data were spatially smoothed using a Gaussian kernel with a FWHM of 8 mm, and all functional data were high-pass filtered (with a 128-sec cutoff) to remove slow signal drifts.
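As a rough illustration of what the smoothing and filtering parameters mean in practice, the sketch below converts the 8-mm FWHM into the standard deviation of the Gaussian kernel and expresses the 128-sec cutoff in volumes. The 2-mm voxel size and 2-sec repetition time are taken from the acquisition and preprocessing descriptions; the function name is illustrative and not part of SPM.

```python
import math

def fwhm_to_sigma(fwhm: float) -> float:
    """Convert a Gaussian kernel's FWHM to its standard deviation (same units)."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

FWHM_MM = 8.0        # smoothing kernel reported in the text
VOXEL_MM = 2.0       # voxel size after resampling, from the preprocessing section
TR_SEC = 2.0         # repetition time, from the acquisition section
CUTOFF_SEC = 128.0   # high-pass filter cutoff reported in the text

sigma_mm = fwhm_to_sigma(FWHM_MM)
sigma_vox = sigma_mm / VOXEL_MM          # kernel width expressed in voxels
cutoff_volumes = CUTOFF_SEC / TR_SEC     # filter period expressed in volumes
cutoff_hz = 1.0 / CUTOFF_SEC             # equivalent cutoff frequency

print(f"sigma = {sigma_mm:.2f} mm ({sigma_vox:.2f} voxels)")
print(f"high-pass cutoff = {cutoff_volumes:.0f} volumes ({cutoff_hz:.4f} Hz)")
```

In other words, signal fluctuations slower than one cycle per 64 volumes are treated as drift and removed.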
First-Level Analysis
For the first-level analysis of the fMRI data, a general linear model was constructed for each participant, concatenated over all three runs from the training and test phase. For the data on training with intentions, trial onsets of valuable stimuli and nonvaluable stimuli for implementation and goal intentions were modeled using stick functions, making four conditions. To look at the effect of time on training, these were modeled as separate regressors per three blocks, making four training block-sets. Only correct trials (i.e., where an accurate (non)response was made) were included. Blocks of verbal rehearsal of implementation and goal intentions were additionally modeled as blocks of 28 sec (total duration of eight 3.5-sec trials). For the test phase, stick functions modeled the trial onsets of still-valuable and still-not-valuable (“value-congruent”; the outcome value is congruent with training phase) and devalued and upvalued (“value-incongruent”; the outcome value is not congruent with training phase) stimuli that were trained with implementation or goal intentions separately, making eight regressors. To investigate BOLD activity during habitual (c)omission errors (habitual “slips” in case of incongruent trials), separate regressors were included for incorrect trials for all conditions. The following regressors of no interest were included separately for each run: one regressor for errors (only for training, as test-errors/“slips” were modeled as regressors of interest) and late trials, keypresses, feedback-displays, value-screens (only for test phase), and six realignment parameters capturing rotation and translation to correct for residual participant motion. Three session constants were included in the model. All onsets were then convolved with the canonical hemodynamic response function, and an autoregressive AR(1) model was used to correct for serial correlations. 
The general linear model was regressed against the fMRI data to generate parameter estimates for each participant.
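To make the construction of an event regressor concrete, the following is a minimal sketch of how a stick function at each trial onset can be convolved with a double-gamma hemodynamic response function and sampled once per scan. It is not the SPM implementation: the HRF parameters (gamma shapes 6 and 16, undershoot ratio 1/6), the 0.1-sec time grid, and all function names are assumptions chosen for illustration, and the onset is hypothetical.

```python
import math

TR = 2.0  # repetition time in seconds, as reported in the acquisition section

def double_gamma_hrf(t: float) -> float:
    """Illustrative double-gamma HRF: a positive response minus a small late undershoot."""
    if t <= 0:
        return 0.0
    response = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return response - undershoot / 6.0

def stick_regressor(onsets_sec, n_scans, tr=TR, dt=0.1):
    """Place a unit stick at each onset, convolve with the HRF, sample once per scan."""
    n_fine = int(round(n_scans * tr / dt))
    sticks = [0.0] * n_fine
    for onset in onsets_sec:
        idx = int(round(onset / dt))
        if 0 <= idx < n_fine:
            sticks[idx] += 1.0
    kernel = [double_gamma_hrf(i * dt) for i in range(int(round(32.0 / dt)))]
    convolved = [0.0] * n_fine
    for i, height in enumerate(sticks):
        if height == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < n_fine:
                convolved[i + j] += height * k
    step = int(round(tr / dt))
    return [convolved[i * step] for i in range(n_scans)]

# Hypothetical example: a single trial onset 10 sec into a 30-scan run.
regressor = stick_regressor([10.0], n_scans=30)
peak_scan = regressor.index(max(regressor))
print(f"predicted BOLD response peaks at scan {peak_scan} (t = {peak_scan * TR:.0f} sec)")
```

Each condition of interest contributes one such column to the design matrix, and the parameter estimate for a column scales this predicted response to fit the measured signal.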
Regressor-specific first-level contrast images were created for the training- and test-regressors modeling the different conditions of interest to construct the planned second-level full factorial models. These contrasts of parameter estimates were then entered into between-subjects ANOVAs to generate group-level random-effects statistics. To test for a difference in learning between intention types, contrasts of parameter estimates of the instrumental training phase were entered into a 2 × 4 × 2 (Value × Block-set × Intention) factorial ANOVA. Following estimation of the second-level model, t tests were specified by adding linear weights to each instrumental training block-set, modeling increases over training as [−1.5 −0.5 0.5 1.5] and decreases as [1.5 0.5 −0.5 −1.5].
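The linear weights described above can be illustrated with a short sketch. Because the weights sum to zero, the contrast is insensitive to the overall activation level and responds only to a monotonic trend across the four block-sets. The parameter estimates below are hypothetical values for a single voxel, and the function name is illustrative.

```python
INCREASE = [-1.5, -0.5, 0.5, 1.5]      # weights for a linear increase over block-sets
DECREASE = [-w for w in INCREASE]      # mirrored weights for a linear decrease

def contrast_value(betas, weights):
    """Weighted sum of per-block-set parameter estimates for one voxel."""
    if len(betas) != len(weights):
        raise ValueError("need one weight per block-set")
    return sum(b * w for b, w in zip(betas, weights))

# Hypothetical parameter estimates for block-sets 1 to 4.
rising = [0.2, 0.4, 0.6, 0.8]
flat = [0.5, 0.5, 0.5, 0.5]

print(contrast_value(rising, INCREASE))   # positive: activity increases over training
print(contrast_value(rising, DECREASE))   # negative: the same data under the decrease weights
print(contrast_value(flat, INCREASE))     # zero: no trend, regardless of mean level
```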
In addition, first-level contrast images were created. To assess the effect of planning during training, contrasts were created comparing training with implementation versus goal intentions (across all blocks, separately for go and no-go trials). To examine markers of goal-directed control during test, we compared correct congruent trials with correct incongruent trials (i.e., [still-valuable go > upvalued go] and [still-not-valuable no-go > devalued no-go]). We also investigated situations where participants fail to adapt to the new outcome value and continue to respond according to the learned S–R association by comparing incorrect incongruent trials (i.e., “slips of action”) with correct incongruent trials. Again, separate contrasts were created for test-go- and test-no-go trials (i.e., [devalued go > upvalued go] and [upvalued no-go > devalued no-go]). Finally, we also created a similar contrast comparing incorrect incongruent trials (slips) with correct congruent trials (i.e., [devalued go > still-valuable go] and [upvalued no-go > still-not-valuable no-go]). More information about the rationale behind these contrasts is provided in the Results section. To assess the effect of planning strategy on test performance, the same test-phase contrasts were constructed but looking for an interaction with intention type (e.g., [still-valuable go > upvalued go × implementation > goal intention]). Parameter estimates generated from these first-level analyses were entered into a random-effects group analysis, and linear contrasts were used to identify significant effects at the group level.
Higher level whole-brain statistical maps were corrected for FWE at the cluster-level (pFWE-cluster < .05) with a voxel cluster-defining threshold of p = .001 uncorrected. When activations did not reach statistical significance at the cluster level, we also checked the peak-voxel level with a threshold of p < .05 corrected (pFWE-peak < .05). In such cases, we clearly indicate this in the text, and we report the peak-voxel level results so as to be as comprehensive as possible in our reporting. Finally, in an exploratory analysis, we further aimed to test for effects in specific regions of the striatum given prior published findings on the role of these structures in goal-directed and habitual responding (Watson et al., 2018; de Wit et al., 2012; Tricomi et al., 2009; Tanaka, Balleine, & O'Doherty, 2008; Valentin et al., 2007). In particular, we defined an anatomical ROI to examine effects in the caudate nucleus, a region previously implicated in goal-directed processes, as well as defining a functional ROI based on the results from Tricomi et al. (2009) that implicated the posterior putamen in habit-related processing.
In addition, we identified several ROIs in our preregistration: for habitual control, goal-directed control, response conflict, and implementation intentions. Three separate masks were created based on these ROIs to apply small volume correction (SVC). However, apart from a striatal ROI (encompassing the bilateral caudate, putamen, and NAcc from the AAL atlas; Tzourio-Mazoyer et al., 2002), applying SVC with the three preregistered ROIs did not alter the pattern of results. This may be because of the large number of voxels included in the ROIs (especially the goal-directed mask), which reduces the sensitivity of the SVC. Therefore, we have opted to report the whole-brain results for the confirmatory analyses. Whole-brain t-maps (without thresholding) of the main fMRI contrasts are available at https://neurovault.org/collections/13191/.
RESULTS
All analyses reported in this section were preregistered at the start of this study, unless indicated otherwise in the text. We generally followed the preregistered analysis plan, but in some cases, the results prompted us to further explore the data. We should also point out that we preregistered these hypotheses before finishing data analysis of our related behavioral study (van Timmeren & de Wit, 2022). Hence, we preregistered the same behavioral hypotheses for this study, although the original behavioral study only partially supported our initial predictions—a point we will come back to in the discussion. We therefore incidentally deviate from the preregistration to keep our analyses in line with analyses and findings from the behavioral study, which is always clearly indicated.
The total final sample used for the analyses consisted of 41 participants, after excluding the following participants. On the basis of the preregistered exclusion criteria, no participants were excluded on the training criterion (< 80% accuracy in the last block-set of training), whereas three were excluded because they made < 25% responses on upvalued trials trained with goal intentions in the test phase. The goal of this criterion was to ensure that participants understood the test-phase instructions and updated their performance accordingly, while not excluding participants based on the manipulation of interest (i.e., implementation intentions). We additionally excluded two participants (post hoc) based on a very low overall response rate during the test phase. Although these participants made (just) > 25% upvalued responses, we deviated from the preregistration because they were outliers on the overall response rate and responded on less than one out of three trials during the test, despite receiving explicit instruction to aim for a response rate of ∼50% and receiving feedback about that at the end of each block. Hence, they did not follow the test-phase instructions and their performance is not reliable. Note that this criterion is independent of actual task performance (accuracy) and that the in-/exclusion of these two participants does not change the general pattern of either the behavioral or the fMRI results.
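The exclusion rules described above can be summarized as a small decision function. This is a sketch under stated assumptions: the field names and participant records are hypothetical, and the one-in-three cutoff for the post hoc rule simply mirrors the description in the text (those two participants were identified as outliers, not by a preregistered threshold).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Participant:
    pid: str
    final_training_accuracy: float       # proportion correct, last training block-set
    upvalued_goal_response_rate: float   # proportion of responses on upvalued goal-intention trials
    overall_test_response_rate: float    # proportion of test trials with a response

def exclusion_reason(p: Participant) -> Optional[str]:
    """Return the first exclusion rule a participant violates, or None if retained."""
    if p.final_training_accuracy < 0.80:
        return "training criterion (preregistered)"
    if p.upvalued_goal_response_rate < 0.25:
        return "upvalued response criterion (preregistered)"
    if p.overall_test_response_rate < 1 / 3:   # assumed cutoff, see lead-in
        return "low overall response rate (post hoc)"
    return None

# Hypothetical records, for illustration only.
sample = [
    Participant("p01", 0.95, 0.70, 0.52),
    Participant("p02", 0.92, 0.20, 0.45),
    Participant("p03", 0.90, 0.27, 0.30),
]
for p in sample:
    print(p.pid, exclusion_reason(p) or "retained")
```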
Behavioral Results
Training Phase without Intentions
As expected, participants learned to make correct responses over the first part of training (Figure 2A), as revealed by a significant main effect of Block-set on accuracy, F(2.46, 98.20) = 16.74, p < .001, ηp2 = .30, BFincl = 2.81 × 10⁵, and a marginally significant effect of Block-set on RT, F(2.45, 98.07) = 2.75, p = .058, ηp2 = .06, BFincl = 0.81. There was no significant difference in learning to make go versus no-go responses (main effect of Value: F(1, 40) = 2.00, p = .17, ηp2 = .05, BFexcl = 1.60; Block × Value interaction: F(1.70, 68.16) = .25, p = .57, ηp2 = .01, BFexcl = 22.15).
Instrumental Training with Goal versus Implementation Intentions
Following the first 12 blocks of instrumental training without planning, intentions were introduced during a practice block (still outside the scanner). Although we did not preregister to analyze those data, for completeness and in line with our previous behavioral study with this paradigm investigating the same question (van Timmeren & de Wit, 2022), we conducted a paired t test comparing the final block of training without intentions to the practice block. This analysis revealed that participants benefitted from if–then planning on the valuable go trials, as reflected by higher accuracy (M = 96.1, SD = 12.4) relative to the preceding (pre)training block-set (baseline: M = 91.8, SD = 9.1, Z(40) = 2.57, p = .01, d = 0.59, 95% CI [.81, .22], BF10 = 1.34), whereas RTs were not affected, t(40) = −.01, p = .99, d = −0.001. In contrast, the use of goal intentions negatively impacted both accuracy (M = 87.6, SD = 14.7, Z(40) = −1.86, p = .065, d = −0.40, 95% CI [−.01, −.69], BF10 = 1.36) and RTs, t(40) = −2.03, p = .049, d = −0.32, BF10 = 1.08, of go trials compared with (pre)training. For no-go trials, no significant effects of implementation, Z(40) = 1.03, p = .31, BF01 = 5.12, or goal intention, Z(40) = .10, p = .93, BF01 = 5.68, were seen.
Subsequently, when instrumental training was resumed during the scanning session, the 2 × 2 × 2 repeated-measures ANOVA indicated that the advantage of if–then planning was initially still apparent on valuable go trials (Figure 2A). In addition to a strong main effect of Value, driven by participants performing better overall on valuable compared with nonvaluable trials, F(1, 84.47) = 10.93, p = .002, ηp2 = .22, BFincl = 18.08, we found the expected preregistered three-way interaction between Intention, Value, and Block-set, F(3, 103.14) = 6.45, p < .001, ηp2 = .14, BFincl = 857.7. Separate analyses of valuable and nonvaluable trials revealed a significant Intention × Block interaction for valuable, F(3, 81.78) = 6.21, p = .003, ηp2 = .13, BFincl = 74.01, but not for nonvaluable trials, F(3, 120) = 1.88, p = .14, ηp2 = .05, BFexcl = 2.63. The significant effect on the valuable go trials was driven by higher accuracy with implementation compared with goal intentions during the first block-set, Z(40) = 3.34, p < .001, d = 0.85, 95% CI [.64, .94], BF10 = 22.76. At the end of training (Block-Set 4), there was no longer a significant effect of Intention Type on accuracy, Z(1, 40) = −.34, p = .80, ηp2 = −1.43, BF01 = 5.87.
The analysis of RTs (Figure 2A) revealed a main effect of Intention Type, F(1, 40) = 12.08, p = .001, ηp2 = .23, BFincl = 11.12, with faster responses during blocks trained with implementation intentions (median = 365 msec, SD = 17) compared with goal intentions (median = 374 msec, SD = 20), but no significant effect of Block-set, F(2.4, 98.6) = 2.31, p = .09, ηp2 = .05, BFexcl = 3.41, nor an interaction (p = .20, ηp2 = .04, BFexcl = 3.67).
Symmetrical Outcome-Revaluation Test
As expected, learned S–R associations had a clear impact on performance during the test phase (Figure 2B), as revealed by a main effect of Congruence, F(1, 40) = 65.08, p < .001, ηp2 = .62, BFincl = 1.39 × 10⁷. Because test value showed significant interactions with both Congruence, F(1, 40) = 10.73, p = .002, ηp2 = .21, BFincl = 8.91, and Intention Type, F(1, 40) = 5.94, p = .02, ηp2 = .13, BFincl = 1.27, separate follow-up comparisons were conducted for go (associated with still-valuable and upvalued outcomes) and no-go (associated with still-not-valuable and devalued outcomes) trials. Main effects of Congruence were seen for both the go, F(1, 40) = 16.82, p < .001, ηp2 = .30, BFincl = 76.40, and no-go, F(1, 40) = 56.46, p < .001, ηp2 = .59, BFincl = 2.31 × 10⁶, stimuli. As can be seen in Figure 2B, the congruency effect was larger for no-go trials mainly because of participants struggling more on devalued trials, where they had to suppress responding to discriminative stimuli that previously signaled a valuable outcome. Importantly, we were interested in the effect of implementation intentions on test performance. First, an analysis of the go test trials suggested that overall performance was worse when trained with implementation compared with goal intentions, F(1, 40) = 5.48, p = .02, ηp2 = .12, although Bayesian statistics showed that this evidence was inconclusive (BFincl = 1.46). Importantly, in contrast to our preregistered hypothesis, there was no evidence for reduced flexibility as a consequence of if–then planning: The expected interaction of congruence with intention type failed to reach significance, F(1, 40) = 1.52, p = .23, ηp2 = .04, BFexcl = 1.86. Given the direct relevance of the comparison between intentions for our research question, we followed these analyses up with separate (exploratory) paired t tests for still-valuable and upvalued trials to also report Bayesian evidence against a difference. 
Findings indicate that intentions only had a significant negative effect on (congruent) still-valuable, Z(40) = −2.55, p = .01, d = −0.56, BFincl = 3.68, but not on (incongruent) upvalued trials, t(40) = −.75, p = .46, BFexcl = 4.54. Finally, for the no-go stimuli (still-not-valuable and devalued), no main, F(1, 40) = .42, p = .52, BFexcl = 4.37, nor interaction, F(1, 40) = .06, p = .81, BFexcl = 4.25, effects of intention type were observed.
We also analyzed RTs during the test phase. A Value × Congruence interaction, F(1, 32) = 49.47, p < .001, ηp2 = .61, BFincl = 2.91 × 10⁵, prompted separate analyses for trials trained with go responses (still-valuable and devalued) and for trials trained with no-go responses (still-not-valuable and upvalued). Interestingly, there was a main effect of congruence for go-trained stimuli, suggesting significantly faster RTs on devalued trials (M = 418 msec, SE = 8.8) relative to still-valuable (M = 443 msec, SE = 6.8; F(1, 40) = 12.56, p = .001, ηp2 = .24, BFincl = 23.40), in line with the idea that habitual slips of action are triggered fast and efficiently before one has the chance to suppress them. As late responses were excluded from this analysis (following the accuracy analysis), we ran an additional analysis including RTs for late responses to make sure that this effect was not driven by a higher number of (excluded) late responses on devalued trials. This analysis showed an even stronger main effect of congruence than the original analysis without late responses, F(1, 40) = 14.84, p < .001, ηp2 = .27, BFincl = 36.88. No other significant effects of RTs were found (all p > .22, BFexcl > 1.74).
Self-reported Automaticity and S-O Knowledge
Self-reported automaticity was at a high level overall (median = 80.4%, SD = 16.7), but did not differ between intentions, t(40) = −.98, p = .34, BF01 = 3.80, nor did subjective automaticity correlate with revaluation insensitivity for implementation (rτ = −.09, p = .57, BF01 = 4.39) or goal intentions (rτ = .22, p = .17, BF01 = 2.03).
Following van Timmeren and de Wit (2022), we also explored differences in S-O knowledge between intention types and their relationship with overall test accuracy. S-O knowledge was high (median = 89.8%, SD = 22.1) and, contrary to our previous study, no longer differed significantly between intention types, F(1, 40) = 2.07, p = .16, ηp2 = .05, BF01 = 2.6; values, F(1, 40) = 3.42, p = .07, ηp2 = .08, BF01 = 2.4; or their interaction, F(1, 40) = .91, p = .35, ηp2 = .02, BF01 = 5.88, suggesting that the adaptation we made to the task (i.e., using a pseudorandom selection of stimuli instead of alternating between two block-sets in the first part of training, see Methods section) had the desired effect. S-O knowledge did correlate positively with test accuracy (across all four conditions) for both implementation intentions (rτ = .30, p = .008, 95% CI [.08, .52], BF10 = 7.91) and goal intentions (rτ = .39, p < .001, 95% CI [.21, .57], BF10 = 99.22).
Conclusions: Behavioral Results
We provide evidence for habit learning, as indicated by the general effect of previously learned S–R mappings on the ability to flexibly adapt responding when the cue signals a revalued outcome (i.e., incongruent). Importantly, although if–then planning seemed to increase efficiency relative to goal intentions, as reflected in superior acquisition, this was not at the expense of flexibility when outcome values changed in the test phase.
Neuroimaging Results
Instrumental Training: Across Intentions (Exploratory)
First, we were interested to explore general learning effects across intention types because this was the first time the SORT was used in the MRI scanner. These analyses showed that over the course of go training (i.e., on valuable trials), activity increased linearly in the head of the caudate nucleus extending into ACC (at p < .05 FWE rate corrected; pFWE-cluster < .05). Activation in the left amygdalo-hippocampal junction and the angular gyrus did not reach our cluster-level correction threshold, but did survive voxel-level correction at p < .05 (pFWE-voxel < .05; Table 1). In this same contrast, we also observed a cluster in the posterior putamen, which survived a small-volume correction for the posterior putamen ROI (i.e., pFWE < .05 with SVC, defined as a 10-mm sphere at peak value of the cluster that showed a significant increase over training in the study of Tricomi et al. [2009]; x = 33, y = −24, z = 0). On the other hand, activity decreased over training in the bilateral anterior caudate (a more ventral part of the striatum), primary motor cortex (extending to mid-posterior cingulate), hippocampus extending into the putamen, and the left temporal cortex (all pFWE-cluster < 0.05 corrected). In contrast, on no-go trials, there were no voxels that showed a significant linear change over training blocks.
| Contrast | Region | MNI x | MNI y | MNI z | Cluster Size (Voxels) | z Score at Peak Level | Correction |
|---|---|---|---|---|---|---|---|
| Increase over training blocks (go) | Caudate nucleus head | 22 | 6 | 30 | 443 | 4.37 | Cluster |
|  | Amygdalo-hippocampal junction | −10 | −4 | −14 | 348 | 5.17 | Peak |
|  | Angular gyrus | 20 | −52 | 38 | 214 | 4.92 | Peak |
|  | Posterior putamen | 26 | −20 | 4 | 34 | 3.96 | SVC Tricomi |
| Decrease over training blocks (go) | Anterior caudate L | −24 | 10 | 2 | 912 | 6.53 | Cluster |
|  | Anterior caudate R | 24 | 10 | −4 | 537 | 6.39 | Cluster |
|  | Primary motor/SMA | 8 | −24 | 60 | 860 | 5.44 | Cluster |
|  | Hippocampus/putamen | 43 | 14 | −8 | 657 | 4.83 | Cluster |
|  | Temporal cortex L | −46 | −46 | −4 | 591 | 5.69 | Cluster |
| Goal > implementation intentions block-set 1 (Go) | Anterior caudate | 13 | 18 | −4 | 40 | 3.69 | SVC striatum |
SVC = small volume correction; L = left; R = right.
Instrumental Training: Comparing Goal and Implementation Intentions
We then examined whether strategic planning affected instrumental training. The contrast comparing the average BOLD signal of trials trained with implementation intentions and goal intentions did not reveal any significant activations, neither on go nor no-go trials. We also tested for differences in learning between intentions over the course of training by adding linear weights to block-sets to compare increased activity over block-sets during implementation intentions with decreased activity during goal intentions, and vice versa. However, both tests of this interaction failed to show significant differences.
The finding that implementation intentions showed the most pronounced effect behaviorally early in training prompted us to conduct an exploratory analysis of only the first training block-set. This analysis revealed significantly decreased activation in the anterior caudate (pFWE < .05 with SVC, z = 3.69) on trials trained with implementation intentions compared with goal intentions (Figure 3A and Table 1). For visual purposes, the extracted average BOLD signal from the anterior caudate cluster is shown separately for each block-set and intention in Figure 3B. As can be seen here, activity was indeed lower on implementation intention trials during the first block-set only and subsequently decreased for both intentions. A whole-brain analysis also showed decreased activity for implementation relative to goal intentions at an uncorrected threshold (p < .001) in the right lateral orbitofrontal cortex (OFC; pFWE-cluster = .061, z = 4.25; x = 26, y = 50, z = 14) and the left insula (pFWE-cluster = .28, z = 3.76; x = −42, y = 20, z = 2). However, because these results did not survive FWE-correction, we refrain from interpreting them further. To rule out that these findings were driven by RTs, which were significantly shorter for implementation compared with goal intentions, we performed an additional analysis controlling for trial-by-trial RT by including a parametric regressor (one for each of the two training runs) with RTs for each trial. This had no significant impact on the results, and we could qualitatively replicate all reported findings.
Neural Predictors of Test Performance
To determine whether brain activity during instrumental training with implementation intentions was predictive of test performance, we tested whether the average BOLD signal during training covaried with the revaluation insensitivity score. This preregistered test did not reveal significant neural predictors of test performance. For completeness, we also exploratively ran this analysis separately for goal intentions and across intentions, but this similarly did not reveal any significant results.
Symmetrical Outcome-Revaluation Test: Markers of Goal-directed versus Habitual Performance
In the test phase, changes in outcome value create conflict between goal-directed control and learned S–R associations. Specifically, to perform the correct response on incongruent trials (i.e., upvalued go and devalued no-go), participants have to exert goal-directed control and override the learned S–R mapping. Conversely, on congruent trials (still-valuable go and still-not-valuable no-go), participants can rely on the learned S–R associations. The advantage of the symmetrical outcome-revaluation test (compared with the original slips of action test) is that we can compare congruent and incongruent trials with each other unconfounded by test outcome value (and therefore required response: i.e., go or no-go). Therefore, to examine markers of goal-directed control, we firstly compared upvalued go with still-valuable go responses and found that this was associated with increased right insula activity (pFWE-cluster < .05, z = 4.16; Table 2). No significant activations were seen in the contrast between devalued no-go and still-not-valuable no-go trials.
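The trial taxonomy used in the test phase follows directly from crossing the trained outcome value with the test outcome value. The sketch below spells out that mapping, including which errors count as slips. The function names are illustrative and not taken from the task code.

```python
def classify_trial(trained_valuable: bool, test_valuable: bool) -> dict:
    """Label a test trial from its outcome value during training and during test."""
    if trained_valuable and test_valuable:
        label = "still-valuable"
    elif trained_valuable and not test_valuable:
        label = "devalued"
    elif not trained_valuable and test_valuable:
        label = "upvalued"
    else:
        label = "still-not-valuable"
    return {
        "label": label,
        # Congruent: the trained S-R mapping still yields the correct (non)response.
        "congruent": trained_valuable == test_valuable,
        # The required response depends only on the current (test) value.
        "required_response": "go" if test_valuable else "no-go",
    }

def score_response(trained_valuable: bool, test_valuable: bool, responded: bool) -> str:
    """Classify a single response as correct, a slip, or another error."""
    trial = classify_trial(trained_valuable, test_valuable)
    if responded == (trial["required_response"] == "go"):
        return "correct"
    if trial["label"] == "devalued":
        return "slip of action"      # responding for an outcome that is no longer valuable
    if trial["label"] == "upvalued":
        return "inhibition slip"     # withholding a response that is now required
    return "error"

print(classify_trial(True, False))
print(score_response(True, False, responded=True))
```

Because go trials (still-valuable and upvalued) and no-go trials (still-not-valuable and devalued) each contain one congruent and one incongruent condition, congruence can be compared within a required response type.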
| Contrast | Region | MNI x (Max) | MNI y (Max) | MNI z (Max) | Cluster Size (Voxels) | z Score (Peak) | Correction |
|---|---|---|---|---|---|---|---|
| Upvalued go > still-valuable go | Insula R | 38 | 24 | −2 | 468 | 4.16 | Cluster |
| Devalued slips > still-valuable go | Anterior insula L | −40 | 26 | 2 | 611 | 5.46 | Cluster |
|  | Anterior insula R | 42 | 26 | −10 | 621 | 4.49 | Cluster |
| Still-valuable go > devalued slips | vMPFC | 22 | 42 | −4 | 388 | 4.64 | Cluster |
|  | Caudate | 8 | 28 | 2 |  |  |  |
|  | NAcc | 4 | 20 | 4 |  |  |  |
|  | Primary motor cortex | −26 | 12 | 60 | 252 | 5.10 | Peak |
|  | Paracentral lobule | −10 | −30 | 66 | 336 | 4.56 | Cluster |
|  | Angular gyrus L | −30 | −52 | 52 | 1653 | 5.51 | Cluster |
|  | IPL L |  |  |  |  |  |  |
|  | Angular gyrus R | 38 | −50 | 58 | 2510 | 5.33 | Cluster |
|  | IPL R |  |  |  |  |  |  |
|  | Occipital cortex | −36 | −74 | 8 | 767 | 5.25 | Cluster |
| Devalued slips > upvalued go | Anterior insula L | −36 | 26 | −8 | 707 | 4.29 | Cluster |
|  | SMA | 8 | 8 | 64 | 378 | 5.47 | Cluster |
|  | dACC | 8 | 18 | 34 | 431 | 4.18 | Cluster |
|  | Inferior parietal lobule L | −56 | −42 | 34 | 269 | 4.60 | Peak |
|  | Inferior parietal lobule R | 56 | −4 | 44 | 331 | 4.44 | Cluster |
|  | Supramarginal gyrus | −36 | 26 | −8 | 707 | 4.29 | Cluster |
| Upvalued go > devalued slips | ACC, caudate nucleus | −20 | 22 | 18 | 327 | 4.17 | Cluster |
|  | Premotor/PMC | −26 | 0 | 42 | 529 | 4.80 | Cluster |
|  | Lateral OFC | −32 | 62 | 0 | 317 | 4.19 | Cluster |
|  | Superior parietal lobe L | −28 | −76 | 36 | 1482 | 4.12 | Cluster |
|  | Superior parietal lobe R | 30 | −62 | 38 | 4099 | 5.04 | Cluster |
|  | Occipital/visual cortex | −30 | −96 | 16 | 1307 | 6.43 | Cluster |
L = left; R = right; NAcc = nucleus accumbens; IPL = inferior parietal lobule; (d)ACC = (dorsal) anterior cingulate cortex; PMC = primary motor cortex; OFC = orbitofrontal cortex.
To identify regions where participants fail to adapt and continue to respond according to the learned S–R association, we contrasted incorrect incongruent trials (devalued go and upvalued no-go) to correct incongruent trials (upvalued go and devalued no-go, respectively), as the latter arguably require most goal-directed control to override the learned S–R mapping. The contrast comparing devalued go responses (i.e., slips of action) with upvalued go responses is shown in Figure 4A, and revealed increased activity in a fronto-parietal network, including the left anterior insula extending to the inferior lateral prefrontal cortex, SMA, dorsal anterior cingulate cortex, bilateral inferior parietal lobule, and supramarginal gyrus (all pFWE-cluster < .05; Table 2). Conversely, lower activity during slips of action compared with upvalued go responses was seen in the left anterior cingulate cortex extending into caudate nucleus, left lateral OFC, bilateral superior parietal lobe, and several occipital/primary visual areas (all pFWE-cluster < .05, Table 2). In addition, activation in the premotor/primary motor cortex did not survive cluster-level correction but did reach peak-voxel level significance (pFWE-voxel < .05). Although the previous contrast between devalued slips and correct upvalued go responses maximizes the difference between habitual versus goal-directed control, the conditions differ in terms of the original training outcome value (as well as test value). To mitigate this, we proceeded to compare devalued slips to still-valuable go responses, which only differ in their test outcome value. Thus, this contrast compares trials on which participants correctly continued responding according to the learned S–R association with trials on which they failed to override this association. 
Although we have used the same approach previously (in the study of Watson et al., 2018, the “slips versus respond valuable” contrast), this contrast was not preregistered and should thus be considered exploratory. Similar to the comparison with upvalued go responses, this comparison of slips with still-valuable go responses revealed increased anterior insula activity (bilaterally) during slips, but decreased activity in vMPFC (extending to NAcc), primary motor cortex, paracentral lobule, a large occipital cluster, and large parietal clusters (bilateral) including the angular gyrus and inferior parietal lobule (all pFWE-cluster < 0.05; Figure 4A).
As preregistered, we also compared upvalued no-go responses (“inhibition slips”) to correct devalued (no-go) trials, but this did not reveal any significant activation patterns. Moreover, we were not able to conduct the contrast between upvalued and still-valuable no-go trials, because of the low number of omission errors on still-valuable trials.
Our results thus identify the anterior insula as a common region associated with slips toward devalued outcomes, as activity in this region was higher during slips than during go responses toward upvalued and still-valuable outcomes. However, both contrasts are confounded by expected value (the outcome value during the test phase) as they both compare stimuli signaling a nonvaluable outcome (devalued) with stimuli signaling a valuable outcome (upvalued or still-valuable). To control for this, we ran some additional exploratory analyses, comparing activity during devalued slips with correct no-go responses on devalued and still-not-valuable trials. Although these contrasts are difficult to interpret by themselves (they are confounded by whether or not a button was pressed), looking at the overlap between all four contrasts overcomes the value-related confounds and hence could reveal a common process in the expression of habits. To this end, we used ImCalc to create binary images of all four contrasts thresholded at t(41) = 3.1 (equivalent to p < .001 uncorrected) and multiplied them. The result of this inclusive masking analysis, which is akin to a conjunction analysis, shows that the bilateral anterior insula was commonly activated across all four contrasts (Figure 4B).
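The logic of this inclusive masking step can be sketched in a few lines: each contrast map is thresholded into a binary image, and the binary images are multiplied voxel by voxel, so that only voxels exceeding the threshold in every contrast survive. This is an illustration of the arithmetic only, not the ImCalc call itself. The t-values below are hypothetical, and the maps are represented as flat lists of voxels.

```python
import math

T_THRESHOLD = 3.1  # voxel-level threshold reported in the text

def binarize(t_map, threshold=T_THRESHOLD):
    """Return 1 for voxels above threshold and 0 otherwise."""
    return [1 if t > threshold else 0 for t in t_map]

def inclusive_mask(t_maps, threshold=T_THRESHOLD):
    """Multiply the binarized maps voxel-wise; 1 only where all maps exceed threshold."""
    masks = [binarize(t_map, threshold) for t_map in t_maps]
    return [math.prod(voxel_values) for voxel_values in zip(*masks)]

# Hypothetical t-values for five voxels in each of four contrasts.
contrast_maps = [
    [4.2, 3.5, 1.0, 3.2, 0.5],
    [3.9, 2.0, 3.3, 3.4, 0.1],
    [5.1, 3.6, 3.8, 3.15, 2.9],
    [3.3, 4.0, 0.2, 3.11, 3.0],
]
overlap = inclusive_mask(contrast_maps)
print(overlap)
```

Here only the first and fourth voxels exceed the threshold in all four contrasts, so only they remain in the overlap image.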
Symmetrical Outcome-Revaluation Test: Comparing Goal and Implementation Intentions
None of the planned contrasts comparing test-phase trials trained with implementation intentions versus goal intentions revealed significant activation patterns.
DISCUSSION
The aim of the present study was to investigate whether the brain can strategically go on automatic pilot. We investigated this by measuring the impact of strategic planning (i.e., implementation intentions vs. goal intentions) on the acquisition of instrumental actions as well as subsequent flexible, behavioral adjustment. When strategic planning was first introduced during the instrumental learning phase of our paradigm, implementation intentions improved performance relative to goal intentions. Furthermore, in line with the idea that their beneficial effect was mediated by accelerated S–R learning, an exploratory analysis revealed that implementation intentions were associated with reduced activity in the anterior caudate, a brain area previously implicated in goal-directed control (Watson et al., 2018; Liljeholm, Tricomi, O'Doherty, & Balleine, 2011). These effects of strategic planning on performance and neural activity were only apparent early in training, with participants reaching high levels of accuracy (and reduced activity in the anterior caudate) by the end of the learning phase independent of intention type. Our central question, however, was whether implementation intentions would actually impede performance when flexible, behavioral adjustment was required during the subsequent outcome-revaluation test. Importantly, we found no evidence for a detrimental effect of strategic planning on the ability to adapt behavior to changing outcome values, nor any effect on underlying neural activity patterns. We conclude that strategic planning of S–R mappings may allow people to go on automatic pilot to increase behavioral efficiency, but that this does not have to come at the expense of behavioral flexibility. Therefore, mental rehearsal of S–R links does not appear to suffice for the formation of a rigid habit, refuting the notion of “instant habits” and suggesting that behavioral repetition may be crucial for the development of rigid habits.
To shed light on the implications of these findings, we will first discuss them in some more detail, including the basic results (i.e., across strategies) on the relatively novel SORT. First, during the instrumental learning phase, we observed increasing accuracy and decreasing RTs over the course of training, suggesting that participants acquired the S–R mappings. This interpretation was further supported by high levels of subjective automaticity of responding at the end of the instrumental learning phase and increasing involvement of two distinct parts of the dorsal striatum: the posterior putamen, replicating findings from Tricomi et al. (2009), and the caudate nucleus head. Several previous fMRI studies have indirectly implicated the dorsal striatum in habit learning, either showing that with longer instrumental training this region becomes more active (Wunderlich, Dayan, & Dolan, 2012; Tricomi et al., 2009) or that functional connectivity with the (pre-)motor cortex increases (Zwosta et al., 2018; Horga et al., 2015). Although increased activity of the posterior putamen was only significant with small-volume correction (so not very robust), activity of the caudate nucleus as well as the hippocampus survived whole-brain correction. Both regions have previously been implicated in the encoding of S–R representations (McNamee, Liljeholm, Zika, & O'Doherty, 2015). Moreover, we found decreasing activity of the primary motor cortex (extending to mid-posterior cingulate), the hippocampus (extending into the putamen), bilateral temporal cortex, and the right anterior caudate, previously implicated in goal-directed control (Liljeholm et al., 2011; Balleine & O'Doherty, 2010). In line with previous findings, these results suggest that dissociable neural regions support instrumental learning. Notably, these findings were specific to learning to make go responses for valuable outcomes. 
In contrast, we did not see any changes in neural activity over the course of no-go training, despite high accuracy and reported automaticity at the end of training. Thus, our neuroimaging analyses do not provide evidence for the development of “inhibition habits” (Jahanshahi, Obeso, Rothwell, & Obeso, 2015).
Importantly, when strategic planning was introduced after the first 12 training blocks, implementation intentions initially improved go performance (reflected by higher accuracy during the first block-set than in the preceding (pre-)training block-set), whereas goal intentions impaired it (reflected by lower accuracy and slower RTs). In contrast, no-go learning (i.e., withholding a response for nonvaluable stimuli) was not affected by planning strategy. These findings replicate our previous results (van Timmeren & de Wit, 2022). Furthermore, in line with the notion of “instant habits,” this behavioral effect of implementation intentions early in training was accompanied by reduced activity in the anterior caudate relative to goal intentions. This early effect of implementation intentions quickly disappeared, however, and no differences with goal intentions were observed on accuracy and RTs in later training blocks, nor on subjective automaticity after training. Therefore, in support of the notion of strategic automaticity, it appears that instrumental acquisition initially benefitted from if–then planning, while dependency on goal-directed control (as suggested by anterior caudate activity) was reduced.
So far, there have been very few neuroimaging studies that have compared the use of implementation and goal intentions to support behavioral performance. One study (Gilbert, Gollwitzer, Cohen, Oettingen, & Burgess, 2009) showed that implementation intentions engaged the medial BA10 more (and lateral BA10 less) than a control condition, which was argued to reflect increased cue monitoring (and reduced internal information processing). This contrasts with our finding that implementation intentions lead to reduced engagement of the anterior caudate during the instrumental learning phase. However, their control condition was very different to ours. Whereas they specified the cue and the outcome that it signaled to be available (i.e., the S-O contingency; “if the cue appears, then I can score 5 points”), we used a goal intention control condition that specified the R-O contingency, which is arguably more akin to a typical goal intention (e.g., “I will exercise to lose weight”).
In the next phase of the SORT, signaled outcome values changed, requiring flexible adaptation of responding to the discriminative stimuli. This allowed us to determine whether strategic planning (during training) would induce the rigidity that is commonly regarded as a hallmark of learned habits that are stamped in through behavioral repetition. However, we failed to find convincing evidence that if–then planning impaired the ability to flexibly adjust responding when signaled outcome values changed. This was despite the fact that participants struggled to adjust learned S–R mappings overall, as reflected in a strong main effect of congruency. Furthermore, in line with the behavioral findings, we also found no evidence for an impact of planning on neural activation patterns during the extinction test phase. Therefore, this first neuroimaging investigation of the effect of implementation intentions on behavioral flexibility in an outcome-revaluation paradigm failed to provide evidence for a shift from goal-directed toward habitual control.
The evidence for intact behavioral flexibility despite if–then planning contrasts with results from an earlier study with this paradigm (van Timmeren & de Wit, 2022). In that study, we found that implementation intentions impaired test-phase performance overall, but this did not lead to inflexibility as would be reflected by lower accuracy on incongruent trials specifically. This general impairment most likely arose because implementation intentions, by focusing attention on the S–R mappings, blocked learning about the S-O contingencies. To prevent this from happening in the present study, we altered our paradigm to promote active S-O learning in the first training phase, before intentions were introduced. As a result, participants had already acquired high levels of S-O knowledge when they started using strategic planning. Integrating findings from both studies, it appears that when the agent has full knowledge of the (S-O) contingencies, implementation intentions do not impair flexibility. This finding is encouraging, because in most applied situations in real life, agents are perfectly aware of the three-term instrumental contingencies. Therefore, our results are in line with the idea of implementation intentions being “flexibly tenacious” (Legrand, Bieleke, Gollwitzer, & Mignon, 2017; Gollwitzer, Parks-Stamm, Jaudas, & Sheeran, 2008): People benefit from if–then planning when the situation specified in their plan is encountered (here in terms of higher accuracy and lower RTs during training), but are goal-directed in the sense that they only act on these planned S–R mappings when the signaled outcome is currently a goal.
Across intentions, however, we found that action slips toward devalued outcomes were associated with increased bilateral insula activity (both when compared with still-valuable and upvalued responses), replicating findings from the only study to date looking at neural activity during slips of action (Watson et al., 2018). The insula is a functionally heterogeneous region (Uddin, Nomi, Hébert-Seropian, Ghaziri, & Boucher, 2017), but the anterior part has been critically implicated in error and salience processing and response inhibition (Uddin, 2015; Chang, Yarkoni, Khaw, & Sanfey, 2013). Specifically, previous work shows that failure to inhibit a learned response (on the stop-signal paradigm) is associated with bilateral insular activity (Ramautar, Slagter, Kok, & Ridderinkhof, 2006). In addition, when compared with upvalued responses, slips were associated with increased activity in the dorsal anterior cingulate cortex, the SMA, and parietal cortex, all part of the salience network (Seeley et al., 2007). Conversely, lower activity during slips was seen in the vMPFC, or medial OFC, when compared with responses for still-valuable outcomes. Previous outcome-devaluation studies suggest that activity in this region mediates goal-directed instrumental learning (de Wit et al., 2012; Valentin et al., 2007). A similar contrast, comparing devalued action slips with responses toward upvalued outcomes, showed lower activity in the lateral OFC and ACC/caudate nucleus head, regions that have also been implicated in goal-directed control (e.g., Watson et al., 2018; McNamee et al., 2015; de Wit et al., 2012). Overall, our results suggest that habitual slips of action arise as a consequence of lapses in goal-directed control (as reflected by decreased activity in these regions) rather than by increased activation of S–R habit regions (i.e., the dorsal striatum).
Finally, the informal conjunction analysis of devalued slips (Figure 4B), controlling for differences in expected value and motor response, showed that the anterior insula was commonly activated across all contrasts, implicating it as a key region mediating habitual action slips.
A lack of reliable, positive markers of habits is an important issue in human habit research (Watson, O'Callaghan, Perkes, Bradfield, & Turner, 2022; Kruglanski & Szumowska, 2020; De Houwer, 2019; De Houwer, Tanaka, Moors, & Tibboel, 2018; Watson & de Wit, 2018). In the context of the present study, it raises the question of whether habit strength independently contributes to stimulus-dependent, outcome-insensitive responding (i.e., slips of action). A recently published study with the SORT adds weight to this concern, as we showed there that extensive instrumental training failed to impair test performance (Watson, Gladwin, et al., 2022). The lack of reliable evidence for overtraining effects (see also de Wit et al., 2018) could mean different things, but our current findings may offer an interesting explanation. Specifically, we observed that when the planning manipulation was first introduced during training, not only did implementation intentions improve performance, but goal intentions also significantly impaired performance. This may indicate that participants’ spontaneous strategy up to that point had not been to form goal intentions, but instead to switch as soon as they could to the more efficient strategy of focusing on the S–R mappings. In other words, they may have spontaneously formed implementation intentions (Bieleke & Keller, 2021). Therefore, rather than improving their performance with the explicit implementation intention manipulation, we impaired it in the goal intention condition. Such an early shift to reliance on S–R associations (i.e., within 12 blocks of training) may explain why previous experimental studies failed to find evidence for overtraining effects: their short training conditions may already have been sufficiently long to induce this shift, and beyond it additional training may not have significantly enhanced the strength of those associations.
This idea accords well with results from a study by Pool and colleagues (2022) who found that, following outcome devaluation on a free-operant task, already after moderate training (12 blocks), outcome-insensitive habitual responding was seen in the majority of participants. Our findings further reinforce this interpretation by showing significant changes in neural activity over the course of this relatively short training, with activity of the anterior caudate (implicated in goal-directed learning) decreasing and of the dorsal striatum (implicated in habitual control) increasing. From our study it is unclear, however, how activity in these regions developed in the earliest stages of instrumental training, as that took place outside the scanner. Future research should determine how many behavioral repetitions it takes to permit this shift to an S–R strategy, by assessing the effect of a goal intention manipulation at different time points during training. Our hypothesis is that, at the start of training, this would not yet have a negative impact—relative to implementation intentions—but that it will after a few blocks.
In conclusion, we provide evidence for increased efficiency but preserved flexibility following strategic if–then planning. These behavioral findings were mirrored in our analyses of the underlying brain activity: Implementation intentions did not reduce the engagement of goal-directed control when goals changed, nor increase activity in habit regions. Therefore, our findings suggest that this strategic planning technique supports the implementation of a new target behavior while still allowing for flexible adjustment when goals change.
Reprint requests should be sent to Tim van Timmeren, Department of Social, Health and Organizational Psychology, Utrecht University, Heidelberglaan 1, Utrecht, The Netherlands, 3508 TC, or via e-mail: [email protected].
Data Availability Statement
Data to recreate the main behavioral analyses (with analysis pipeline and output) are available at OSF: https://www.doi.org/10.17605/OSF.IO/642QU. Whole-brain t-maps (without thresholding) of the main fMRI contrasts are available at https://neurovault.org/collections/13191/.
Author Contributions
Tim van Timmeren: Conceptualization; Data curation; Formal Analysis; Investigation; Methodology; Project administration; Visualization; Writing—Original draft; Writing—Review & editing. John O'Doherty: Conceptualization; Writing—Review & editing. Nadza Dzinalija: Investigation; Project administration; Writing—Review & editing. Sanne de Wit: Conceptualization; Funding Acquisition; Resources; Supervision; Writing—Original draft; Writing—Review & editing.
Funding Information
Tim van Timmeren, VIDI grant from Dutch Research Council (Nederlandse Organisatie voor Wetenschappelijk Onderzoek) (https://dx.doi.org/10.13039/501100001722), grant number: Van der Gaag Fund. Sanne de Wit, VIDI grant from Dutch Research Council (Nederlandse Organisatie voor Wetenschappelijk Onderzoek) (https://dx.doi.org/10.13039/501100003246), grant number: 016.145.382.
Diversity in Citation Practices
Retrospective analysis of the citations in every article published in this journal from 2010 to 2021 reveals a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .407, W(oman)/M = .32, M/W = .115, and W/W = .159, the comparable proportions for the articles that these authorship teams cited were M/M = .549, W/M = .257, M/W = .109, and W/W = .085 (Postle and Fulvio, JoCN, 34:1, pp. 1–3). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance. The authors of this article report its proportions of citations by gender category to be as follows: M/M = .512, W/M = .256, M/W = .070, and W/W = .163.