Abstract

This study presents the first direct investigation of the hypothesis that dopamine depletion of the dorsal striatum in mild Parkinson disease leads to impaired stimulus–response habit formation, thereby rendering behavior slow and effortful. Using an instrumental conflict task, however, we show that patients are able to rely on direct stimulus–response associations when a goal-directed strategy causes response conflict, suggesting that habit formation is not impaired. If anything, our results suggest a disease severity–dependent deficit in goal-directed behavior. These results are discussed in the context of Parkinson disease and the neurobiology of habitual and goal-directed behavior.

INTRODUCTION

If an act became no easier after being done several times, if the careful direction of consciousness were necessary to its accomplishment on each occasion, it is evident that the whole activity of a lifetime might be confined to one or two deeds (…). A man might be occupied all day in dressing and undressing himself (Maudsley, 1876, p. 155).

Since Maudsley (1876) wrote these words on the importance of habit formation, it has been proposed by many psychologists that instrumental behavior becomes habitual with extensive practice (Bolles, 1972; Kimble & Perlmutter, 1970; Tolman, 1932; James, 1890). The underlying associative mechanism was first described by Thorndike (1911) as a gradual stamping in of associations between contextual stimuli (S) and responses (R) that lead to rewarding outcomes (O). Via these direct S→R associations, instrumental actions can be activated with minimal cognitive effort, thereby freeing up cognitive resources. The ability to form habits allows for fast selection of appropriate responses in stable contexts and therefore plays a crucial role in much of our everyday decision-making.

Although the notion of habit formation has been around for a long time, a gradual shift from internal to external control over behavior with practice was not demonstrated experimentally until the 1980s. Adams (1982) showed that after extensive training, instrumental behavior of rats loses its direct sensitivity to the incentive value of the outcome, suggesting a transition from goal-directed behavior mediated by O→R associations¹ to S→R habits directly driven by external cues (de Wit, Corlett, Aitken, Dickinson, & Fletcher, 2009). This overtraining paradigm was used only very recently to demonstrate that in humans, as in animals, practice leads to the development of behavioral autonomy (Tricomi, Balleine, & O'Doherty, 2009). Moreover, humans as well as other animals will revert to a habitual strategy even early in training if a goal-directed strategy causes response conflict (de Wit, Corlett, et al., 2009; de Wit, Niry, Wariyar, Aitken, & Dickinson, 2007).

Recently, the neurobiology of these distinct habitual and goal-directed control mechanisms has received an increasing amount of attention (Daw, Niv, & Dayan, 2005; Yin, Knowlton, & Balleine, 2004; Joel & Weiner, 2000). Although the basal ganglia (BG) have long been implicated in habit memory (Packard & Knowlton, 2002; Mishkin, Malamut, & Bachevalier, 1984), behavioral neuroscience studies with rodents have only recently begun to elucidate the specific neural mechanisms of experimentally defined behavioral control processes. This work has shown that the dorsomedial striatum and the prelimbic cortex subserve goal-directed actions (Yin, Knowlton, & Balleine, 2005; Corbit & Balleine, 2003; Killcross & Coutureau, 2003; Balleine & Dickinson, 1998), whereas habit formation is reflected in a shift in control toward the dorsolateral striatum (DLS) (Yin & Knowlton, 2006; Yin et al., 2004, 2005). Dopamine is thought to be crucially involved in this process (Wise, 2004), and the dopaminergic projection from the substantia nigra to the dorsal striatum has been implicated in the reinforcement of habits (Faure, Haberland, Conde, & El Massioui, 2005; see also Reynolds, Hyland, & Wickens, 2001).

Homologue areas (Joel & Weiner, 2000) are thought to be involved in human instrumental behavior. Several fMRI studies implicate the ventromedial pFC (vmPFC; de Wit, Corlett, et al., 2009; Tanaka, Balleine, & O'Doherty, 2008; Valentin, Dickinson, & O'Doherty, 2007) and the anterior caudate nucleus (O'Doherty et al., 2004) in goal-directed control. On the other hand, Tricomi et al. (2009) recently provided the first evidence for progressive recruitment of the human homologue area of the rodent DLS, namely, the dorsal putamen, with prolonged instrumental training. However, this evidence is merely correlational in that Tricomi et al. observed increased fMRI activations in this area as a function of practice. So far, there is no direct evidence that in humans the dorsal striatum plays a critical supporting role in S→R habit formation.

One approach to studying the importance of mesocorticolimbic circuits and dopamine for habitual control is to investigate the effects of Parkinson disease (PD). PD is associated with progressive nigrostriatal and mesocorticolimbic dopamine depletion and is accompanied by subtle cognitive impairments even in the early stages, resembling those seen in frontal lobe patients (Owen et al., 1992, 1995; Taylor, Saint-Cyr, & Lang, 1986). Mild PD is a particularly good model for assessing the hypothesized distinct roles of different parts of the striatum because studies have shown that, in early PD, dopamine depletion is most severe in the dorsal striatum, only later progressing to areas associated with goal-directed action control, including more ventral and medial parts of the striatum and pFC (Agid et al., 1993; Kish, Shannak, & Hornykiewicz, 1988). In keeping with this neurobiological pattern of dopamine depletion, it has long been hypothesized that, already at an early stage of the disease, PD is accompanied by a disruption of habit formation (Knowlton, Mangels, & Squire, 1996), which would render even simple everyday activities or performing more than one action at once effortful for these patients (Brown & Marsden, 1990, 1991). For instance, on the basis of evidence for impaired implicit, incremental associative learning of relationships between stimuli and outcomes in a probabilistic classification task but intact acquisition of declarative knowledge, Knowlton et al. (1996) have argued that PD patients exhibit a habit memory deficit (but see Witt et al., 2006). Since then, similar feedback-based (probabilistic) learning impairments have been observed in a variety of instrumental learning paradigms (Frank, Samanta, Moustafa, & Sherman, 2007; Frank, Seeberger, & O'Reilly, 2004; Shohamy et al., 2004; Shohamy, Myers, Onlaor, & Gluck, 2004).
Accordingly, it is now well accepted that mild PD can be accompanied by instrumental learning impairments (but see Swainson et al., 2006). However, a major problem with many of these instrumental learning studies is that paradigms were used that cannot distinguish between the deployment of habitual versus goal-directed associative structures in instrumental behavior as defined experimentally in the animal literature.

The primary aim of the present study is to address this confound between habit-based and goal-directed behavior in the PD literature. To this end, we employed a behavioral procedure that has been used successfully in previous studies to establish habits in both animals and humans. Specifically, we assessed the concurrent learning of multiple biconditional instrumental discriminations in which fruit pictures and points functioned both as discriminative stimuli and as outcomes for left and right keypresses (de Wit, Corlett, et al., 2009; de Wit et al., 2007; for animal studies with an equivalent task, see de Wit, Ostlund, Balleine, & Dickinson, 2009; de Wit, Kosaki, Balleine, & Dickinson, 2006; Dickinson & de Wit, 2003). The three types of discriminations are illustrated in Figure 1 (for a more elaborate explanation, see Methods). Whereas four different fruits functioned as stimuli and as outcomes in the standard biconditional discrimination, the other two discriminations each involved only two fruit pictures that were either the same in each component of the discrimination (congruent) or opposites (incongruent). In the latter incongruent discrimination, fruit pictures should become associated with opposite responses via S→R versus O→R associations. Critically, whereas performance on congruent and standard discriminations can be supported by both goal-directed associative structures and S→R habit formation, the incongruent discrimination requires predominant reliance on S→R associations to prevent response conflict due to O→R associations. In previous studies, this reliance on the habit system was reflected in overall poor performance on this incongruent discrimination relative to the congruent and standard discriminations that receive additional support from the goal-directed system (de Wit, Corlett, et al., 2009; de Wit et al., 2007).
To assess directly the degree to which subjects adopted habit or goal-directed learning strategies to solve the different discriminations, we employed an “instructed” outcome devaluation test at the end of training (for detailed description, see Methods). If subjects formed O→R associations during training and were successful in using the instructed value to guide their behavior during the test, they should direct their actions toward the still-valuable fruit outcomes at the expense of devalued goals. In line with our theoretical account of incongruent performance, previous studies have shown that outcome devaluation test performance of young healthy volunteers is indeed impaired for the incongruent relative to the other discriminations (de Wit, Corlett, et al., 2009; de Wit et al., 2007).²

Figure 1. 

Grayscale representation of an example of the instrumental contingencies for the congruent, incongruent, and standard discriminations (left panel) and corresponding goal-directed stimulus→outcome→response (S→O→R) associative structures (right panel). The gray arrows represent O→R associations that cause conflict in the incongruent discrimination.

To summarize, with this study we aimed to extend the existing correlational evidence for the role of the striatum in the ability to form habits in humans (Tricomi et al., 2009) by assessing whether mild PD patients, characterized by relatively severe dopamine depletion in the DLS, exhibit a significant S→R habit formation deficit. In keeping with the above-reviewed literature, we predicted that mild PD patients would exhibit disproportionate difficulty with the learning of the incongruent discrimination, which requires the use of S→R habits. Conversely, performance on the subsequent outcome devaluation test should not be negatively affected early in the disease as it relies on goal-directed associative structures, which should at that stage be relatively intact. Importantly, we also investigated whether there was a negative relationship between disease severity and outcome devaluation test performance in our sample because of progressive dopaminergic depletion of areas that support goal-directed action control. Finally, to investigate the role of dopamine in the hypothesized habit formation deficit, we compared performance of groups of patients on versus off their normal regimen of dopaminergic medication.

METHODS

This study was approved by the Peterborough and Fenland Local Research Ethics Committee. All subjects gave written consent.

Patients

Thirty PD patients were recruited from the Brain Repair Centre at Addenbrooke's Hospital, Cambridge, UK. All patients were diagnosed by a neurologist, and all were receiving dopaminergic medication. Fifteen patients were tested taking their medication as usual (On group; 11 men/4 women), whereas the other half was asked to abstain from their medication 18 hr before the test session (Off group; 11 men/4 women). This procedure allowed us to investigate the effect of medication withdrawal using a between-subjects design (to prevent practice effects associated with a within-subject design). Average number of hours since taking the last dose was approximately 4.5 hr for patients in the On group and approximately 20 hr for the Off group. We endeavored to match the patient groups in terms of the type of medication as much as was feasible. As can be seen in Table 1, 23 of the 30 patients tested were receiving l-dopa. The remaining seven patients all received the D3 (and to lesser extent D2/D4) receptor agonist ropinirole. Demographics and clinical characteristics of the PD patients are detailed in Table 2. None of the patients had a significant neurological history unrelated to PD, and all patients were nondemented (Mini-Mental State Examination [MMSE] > 24) and nondepressed (Beck Depression Inventory [BDI] < 30, with a range of 0–25 in the patients and 4–19 in the controls) (Beck, Ward, Mendelson, Mock, & Erbaugh, 1961). The average disease duration was 6.2 years for the On group (SEM = 0.7) and 8.5 years for the Off group (SEM = 1.4). None of the patients showed evidence for a dopamine dysregulation syndrome. The severity of PD symptoms was assessed during the testing session with the Hoehn and Yahr (1967) rating scale and the 44-item Unified Parkinson Disease Rating Scale (UPDRS) (Fahn, Elton, & Committee, 1987). Hoehn and Yahr ratings ranged between I and III. We expected higher UPDRS scores for the Off group than for the On group. 
However, the two groups did not differ significantly in terms of disease severity (as reflected by their UPDRS scores), F < 1. This suggests that, had medication status been matched, disease severity would likely have been greater in the On group than in the Off group. Therefore, any effects of medication might reflect effects of disease severity.

Table 1. 

Medications


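| Medication | PD On group | PD Off group |
| --- | --- | --- |
| D2/D3 receptor agonists | 12 | 11 |
| l-Dopa | 10 | 13 |
| Pergolide (D1/D2) | | |
| Amantadine | | |
| COMT inhibitor | | |
| Antidepressants | | |
| MAO-B inhibitor | | |
| Benzodiazepine | | |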

Table 2. 

Background Details


| Group | Age | Edu | NART | BDI | MMSE | UPDRS | H&Y | Hours Since Last Dose |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| On (n = 15) | 64.8 (6.5) | 12.4 (3.2) | 33.6 (9.1) | 7.7 (4.0) | 28.9 (1.4) | 36.7 (13.2) | 1.7 (0.6) | 4.3 (4.9) |
| Off (n = 15) | 61.1 (2.1) | 12.9 (2.7) | 38.9 (9.0) | 7.8 (6.9) | 29.6 (0.8) | 41.4 (18.9) | 1.8 (0.8) | 19.8 (2.8) |
| CS (n = 14) | 63.0 (8.1) | 13.1 (3.1) | 40.0 (5.3) | 6.3 (4.8) | 29.3 (1.0) | N/A | N/A | N/A |

Values are presented as mean (SD).

Edu = education; NART = National Adult Reading Test; BDI = Beck Depression Inventory; MMSE = Mini-Mental State Examination; UPDRS = Unified Parkinson Disease Rating Scale; H&Y = Hoehn & Yahr; N/A = not applicable.

Controls

Fifteen healthy age- and IQ-matched control volunteers were recruited through local advertisement in the Cambridge community. The data of one subject had to be excluded because of a technical error, leaving 14 subjects in the control (CS) group (9 men and 5 women). The background details of the control subjects are presented in Table 2. Separate one-way ANOVAs established that there were no significant differences between the On, Off, and CS groups in terms of age (F < 1); education (F < 1); premorbid IQ as assessed with the National Adult Reading Test (NART; Nelson, 1982), with estimated verbal IQ scores of 114, 119, and 120 for the On, Off, and CS groups, respectively, F(2, 41) = 2.67, MSE = 64.89; MMSE, F(2, 41) = 1.77, MSE = 1.16; or BDI (F < 1). BDI scores did vary considerably between subjects, perhaps partly because positive scores can reflect the motor symptoms of PD rather than true depression, and this variation could be expected to affect instrumental performance. However, we failed to find evidence for correlations between the BDI score and either discriminative performance during training or accuracy of test performance (Pearson correlations of −0.12 and −0.14, respectively).

Background Neuropsychological Tests

In addition to the instrumental conflict task, all volunteers received several background neuropsychological tests: letter and semantic fluency tasks (Benton, 1968), Stroop (1935) task, pattern recognition memory (PRM), and spatial recognition memory (SRM) (Sahakian et al., 1988). The results are presented in Table 3. Separate one-way ANOVAs showed that performance of the three groups on these tasks was statistically indistinguishable: letter fluency, F < 1; semantic fluency, F < 1; Stroop (in terms of Stroop interference divided by Stroop words), F < 1; PRM, F(2, 41) = 1.30, MSE = 75.07; SRM, F(2, 41) = 1.28, MSE = 89.84. Finally, in PD patients, disease severity in terms of UPDRS score correlated neither with performance on these neuropsychological tests (letter fluency, r = −0.18; semantic fluency, r = −0.18; Stroop, r = −0.22; PRM, r = −0.08; SRM, r = 0.09) nor with age (r = 0.21), education (r = 0.05), NART (r = −0.29), MMSE (r = −0.16), or BDI (r = 0.25).

Table 3. 

Results of the Background Neuropsychological Tests


| Group | FAS | Sem Flu | Str Words | Str Colors | Str Interference | PRM | SRM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| On (n = 15) | 42.7 (13.8) | 33.7 (6.2) | 86.0 (15.7) | 62.1 (12.3) | 33.0 (10.4) | 88.7 (10.2) | 77.9 (12.8) |
| Off (n = 15) | 43.7 (13.7) | 31.8 (11.9) | 93.0 (24.2) | 64.5 (13.0) | 32.6 (12.0) | 92.5 (8.6) | 80.6 (7.1) |
| CS (n = 14) | 41.1 (15.4) | 33.2 (6.2) | 101.6 (17.6) | 69.9 (14.9) | 39.7 (12.5) | 93.8 (6.7) | 83.6 (7.2) |

Values are presented as mean (SD).

FAS = letter fluency; Sem Flu = semantic fluency; Str = Stroop; PRM = pattern recognition memory; SRM = spatial recognition memory.

Procedure

The full experiment took approximately 2 hr. The background and experimental tasks were always administered in the following order: NART, FAS, MMSE, instrumental conflict task, PRM and SRM, Stroop task, BDI, UPDRS, and Hoehn and Yahr (1967) rating scale. The computerized experimental task was adapted from the version used by de Wit et al. (2007). The main changes were that subjects received a demonstration of the task, as in a previous fMRI study with this paradigm (de Wit, Corlett, et al., 2009; de Wit, Ostlund, et al., 2009), and that the instrumental training phase was longer than that in previous studies to ensure that subjects acquired the instrumental discriminations.

Stimuli

The stimuli consisted of colored icons representing eight different fruits: orange, pineapple, pear, apple, banana, cherry, grape, and coconut (see also de Wit et al., 2007). For the demonstration of the task, we used three colored icons, representing beer, wine, and coffee. All pictures were presented on a standard PC monitor, and responses on a left (z) and right (m) key were recorded on a standard keyboard using a program written in Visual Basic 6.0.

Demonstration of Conflict Task and Instructions

All subjects received a demonstration of the conflict task, using the following instructions on the computer screen:

In this game, you will get the chance to earn points by collecting items from inside a box on the screen by opening the box by pressing either the right or the left key. If you press the correct key, the box will open to reveal a drink inside and points will be added to your total score. However, if you press the incorrect key, the box will be empty and no points will be added to your total. Your task is to learn which is the correct key to press. Sometimes it will be the left-hand key and sometimes the right-hand key. The picture on the front of the door should give you a clue about which is the correct response. To give you an impression of the game you will be asked to play later on, we will first give you some demonstration trials. Just follow the instructions on the screen.

Having read these instructions, subjects were shown a picture of a closed box with a picture of a glass of beer on the front door. At the bottom of the screen, we showed them the instructions “Press Left.” Pressing the left key led to a picture of an open empty box. On the following screen, subjects were again shown a picture of a glass of beer on the front door of a box, but this time with the instruction “Press Right.” Pressing the right key was rewarded with another glass of beer and 1 point. Subjects were then shown in the same fashion that a cup of coffee signaled that pressing the right key would not be rewarded, whereas pressing the left key was rewarded with a glass of wine and 1 point. Subjects were then given the following instructions:

You have had a chance to learn which was the correct key to press for two different pictures. In the following demonstration, you will no longer be told which response to make, and your task is to press the correct key. Only the first keypress on each trial will count and the quicker a correct response is made the more points will be added to your total, so try to respond as quickly as possible!

Subsequently, subjects received four practice trials with the beer stimulus and four trials with the coffee stimulus, randomly intermixed. Pressing the correct key for the beer and the coffee was rewarded with points and with either beer or a glass of wine inside the box, respectively. Pressing the incorrect key was always followed by an empty box. As in the real experiment, the faster a response was made, the more points were earned. The number of points awarded for correct responses within the following RT ranges was as follows: 0–1 sec, 5; >1–1.5 sec, 4; >1.5–2 sec, 3; >2–2.5 sec, 2; >2.5 sec, 1. The outcome display showed a picture of the drink outcome and the number of points earned. This display remained present for 1 sec before being replaced by the stimulus display of the next trial (after a 1.5-sec intertrial interval). The total score was always displayed at the top of the screen. At the end of the discrimination training phase, subjects received instructions for the (outcome-cued) outcome devaluation test that was designed to assess the strength of O→R associations:

In the next phase, two open boxes will appear on the screen with different drinks inside them. One drink was earned by a left response in the first stage and the other by a right response. Although both drinks were valuable previously, one of them is now devalued and earns no points, whereas the other is still valuable and gains points. The devalued drink will have a cross on it. You should respond by pressing the key that earns a valued drink. The points you earn now will not be shown on the screen but you will see your final total at the end of the game. As in the training phase, only your first response will count.

The subjects were then shown two open boxes on the screen (one above the other), one containing a beer and one containing a glass of wine. On the first trial, the wine had a red cross superimposed on it, signifying that the left response associated with it no longer earned any points, whereas on the second trial the beer was shown with a cross, signifying that the right response was no longer rewarded. Each keypress marked the end of that trial and was immediately followed by the next test trial. Subjects therefore did not receive feedback about their performance during the test to ensure that their choices were guided by O→R associations acquired during instrumental training, but they were shown their total score at the end, followed by the final instructions:

The actual game will be very similar to this. However, it will be a lot harder, because you will be asked to learn the correct responses to many different fruit pictures. Try to collect as many points as possible. You should pay attention to the types of fruit that are found inside the boxes following each response, because later on you will be asked to gather some types of foods but not others. Remember to respond quickly, as quicker correct responses earn you more points. This is the end of the demonstration. If anything in these instructions is unclear, please ask the experimenter. If not, you're ready to go! Please tell the experimenter when you are ready to play the game. Good luck!
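The RT-dependent point scheme described for the demonstration (and, per the text, used in the real experiment) can be illustrated with a minimal Python sketch. The original task was written in Visual Basic 6.0, so this is not the authors' code; the function names are hypothetical, and the upper bound of each RT range is assumed to be inclusive, as the ">1–1.5 sec" notation suggests.

```python
def points_for_correct_response(rt_sec: float) -> int:
    """Points for a correct keypress, using the RT ranges listed in the text."""
    if rt_sec < 0:
        raise ValueError("RT must be non-negative")
    # (inclusive upper bound in seconds, points awarded); assumed boundaries
    bands = [(1.0, 5), (1.5, 4), (2.0, 3), (2.5, 2)]
    for upper_bound, points in bands:
        if rt_sec <= upper_bound:
            return points
    return 1  # any correct response slower than 2.5 sec


def trial_points(correct: bool, rt_sec: float) -> int:
    """Incorrect responses reveal an empty box and earn 0 points."""
    return points_for_correct_response(rt_sec) if correct else 0
```

Under these assumptions, a correct response at 1.2 sec earns 4 points, and an incorrect response earns nothing regardless of speed.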

Discrimination Training

Once the experimenter had ensured that the instructions and demonstration had been understood (and if necessary, had rerun the demonstration until the instructions were clear), each participant was presented with the first trial. As with the demonstration phase, participants were shown boxes bearing a fruit and were required to use this information to select the left or right keypress. A correct response led to another fruit picture and a gain of a minimum of 1 point and a maximum of 5, depending on RT. Incorrect responses led to an empty box on the screen and 0 points. Three discriminations were trained together: cue-outcome incongruent, cue-outcome congruent, and standard (see Figure 1). Two fruit icons were assigned to the congruent discrimination, two to the incongruent discrimination, and four to the standard discrimination.

Performing the correct response to a fruit stimulus yielded the same fruit icon as the outcome in the congruent discrimination but the other fruit icon in the incongruent discrimination. In the example of a congruent discrimination in Figure 1, a banana signals that pressing the left key will be rewarded with another banana, whereas grapes signal that right keypresses will be rewarded with grapes. In contrast, in the incongruent example, each fruit functions as a stimulus and outcome for opposing responses: An apple signals that pressing the left key will be rewarded with a pineapple, whereas in the other component of the discrimination, the pineapple signals that pressing the opposite, right, key will be rewarded with the apple. Finally, in the standard discrimination, two fruit icons acted as the stimuli and the other two as the outcomes with the assignment of stimulus–outcome pairs remaining constant across training. Hence, in the example, the correct left response to a coconut stimulus consistently yields a cherry outcome, whereas a correct right response to the orange stimulus yields a pear outcome. As can be seen in the right panel of Figure 1, performance on the congruent and standard discriminations can be supported by an S→O→R associative structure. In contrast, O→R associations can cause response conflict in the case of the incongruent discrimination (for a detailed description and theoretical background, see de Wit, Corlett, et al., 2009; de Wit et al., 2007).
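To make the source of response conflict concrete, the example contingencies from Figure 1 can be written out as data. The following Python sketch is purely illustrative (the names `Contingency`, `EXAMPLE`, and `has_response_conflict` are hypothetical, not part of the original task software): it flags a discrimination as conflicting when a fruit is linked to one response as a stimulus (S→R) and to the opposite response as an outcome (O→R).

```python
from typing import List, NamedTuple


class Contingency(NamedTuple):
    stimulus: str
    response: str  # "left" or "right"
    outcome: str


# Example assignment taken from the description of Figure 1 in the text.
EXAMPLE = {
    "congruent": [Contingency("banana", "left", "banana"),
                  Contingency("grapes", "right", "grapes")],
    "incongruent": [Contingency("apple", "left", "pineapple"),
                    Contingency("pineapple", "right", "apple")],
    "standard": [Contingency("coconut", "left", "cherry"),
                 Contingency("orange", "right", "pear")],
}


def has_response_conflict(contingencies: List[Contingency]) -> bool:
    """True if some fruit cues one response as S but the other response as O."""
    s_to_r = {c.stimulus: c.response for c in contingencies}
    o_to_r = {c.outcome: c.response for c in contingencies}
    return any(fruit in o_to_r and o_to_r[fruit] != response
               for fruit, response in s_to_r.items())
```

Applied to the example, only the incongruent discrimination is flagged: the apple calls for a left response as a stimulus but is earned by a right response as an outcome.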

The icons were paired arbitrarily for the three discriminations, and the left response was correct for one icon and the right response for the other icon of a pair. To ensure that the identity of the icons was not confounded with discrimination type, the assignments of these icon and response pairings to the different discriminations were permuted across participants such that there were eight possible combinations, with each fruit icon functioning overall twice as stimulus and twice as outcome for each discrimination, once with a right and once with a left response.

Discrimination training consisted of eight 12-trial blocks. Within each block, there were two trials with each of the component contingencies from each of the three discriminations, which were presented in a random order that varied across participants. Therefore, every participant received a total of 16 trials with each component of the three discriminations (twice as much as in the original experiment of de Wit et al., 2007).
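The block structure described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the original implementation: the component labels and function name are hypothetical, and a per-participant seed is assumed as one way to obtain an order that varies across participants.

```python
import random
from typing import List, Optional

# Six component contingencies: two per discrimination (hypothetical labels).
EXAMPLE_COMPONENTS = [
    "congruent-left", "congruent-right",
    "incongruent-left", "incongruent-right",
    "standard-left", "standard-right",
]


def build_training_schedule(components: List[str],
                            n_blocks: int = 8,
                            reps_per_block: int = 2,
                            seed: Optional[int] = None) -> List[List[str]]:
    """Return n_blocks blocks, each with reps_per_block trials per component."""
    rng = random.Random(seed)  # assumed: one seed per participant
    schedule = []
    for _ in range(n_blocks):
        block = [c for c in components for _ in range(reps_per_block)]
        rng.shuffle(block)  # random trial order within the block
        schedule.append(block)
    return schedule
```

With six components and two repetitions, each block contains 12 trials, and eight blocks give 16 trials per component, matching the totals stated in the text.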

Outcome Devaluation Test

Following discrimination training, the participants were reminded of the instructions for the outcome devaluation test. This test consisted of four trials from each of the three discriminations, two with one of the outcomes devalued and two with the other outcome devalued. These 12 trials were presented in a different random order for each participant.
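A minimal Python sketch of how such a test list could be assembled and scored is given below. It is not the original task code; the outcome assignment reuses the Figure 1 example, and all names (`EXAMPLE_OUTCOMES`, `build_devaluation_test`, `percent_correct`) are hypothetical. The correct response on each trial is the key that earned the outcome that is still valuable.

```python
import random
from typing import Dict, List, Optional

# Outcome earned by each response, per discrimination (Figure 1 example).
EXAMPLE_OUTCOMES = {
    "congruent": {"left": "banana", "right": "grapes"},
    "incongruent": {"left": "pineapple", "right": "apple"},
    "standard": {"left": "cherry", "right": "pear"},
}


def build_devaluation_test(outcomes: Dict[str, Dict[str, str]],
                           reps: int = 2,
                           seed: Optional[int] = None) -> List[dict]:
    """Four trials per discrimination: each outcome is devalued on `reps` trials."""
    rng = random.Random(seed)  # assumed: a different seed per participant
    trials = []
    for discrimination, by_response in outcomes.items():
        for devalued_side in ("left", "right"):
            valued_side = "right" if devalued_side == "left" else "left"
            for _ in range(reps):
                trials.append({
                    "discrimination": discrimination,
                    "devalued": by_response[devalued_side],
                    "correct_response": valued_side,
                })
    rng.shuffle(trials)
    return trials


def percent_correct(trials: List[dict], responses: List[str]) -> float:
    """Percentage of trials on which the still-valuable outcome was chosen."""
    hits = sum(t["correct_response"] == r for t, r in zip(trials, responses))
    return 100.0 * hits / len(trials)
```

Because no feedback is given during the test, accuracy here depends on knowledge of the response–outcome relationships acquired during training.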

Questionnaires

Subjects were asked to indicate on a printed questionnaire for each fruit that had functioned as a discriminative stimulus, whether the right or the left response had been correct, and which fruit was presented inside the box following a correct response for that discriminative stimulus.

Data Analysis

Statistical analysis was performed using SPSS 15.0. We employed repeated measures ANOVA (RM-ANOVA), complemented with two-tailed t tests, to investigate whether PD patients and controls differed in accuracy (percentage of correct responses per block of training) and RT (sec) during discrimination training, in accuracy (average percentage of correct responses) during the outcome devaluation test, and in accuracy (number of correct answers out of two per discrimination) on the questionnaires. In addition, we compared patients on versus off medication on all of the abovementioned dependent measures. We included each patient's sum UPDRS score as a covariate in these RM-ANCOVAs because disease severity varied considerably within our sample (with UPDRS scores ranging from 19 to 82.5). All p values involving repeated measures factors are based on Greenhouse–Geisser sphericity corrections, and all significant (p < .05) first-order interactions involving the factor of interest (discrimination type) are reported.
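As an illustration of the training accuracy measure (percentage of correct responses per block), the following Python sketch aggregates trial-level data by block and discrimination. The analyses themselves were run in SPSS 15.0; the data layout and function name here are assumptions made for the example only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_block(
    trials: Iterable[Tuple[int, str, bool]]
) -> Dict[Tuple[int, str], float]:
    """Map (block, discrimination) to the percentage of correct responses.

    Each trial is a (block number, discrimination name, correct?) tuple.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [n_correct, n_trials]
    for block, discrimination, correct in trials:
        counts[(block, discrimination)][0] += int(correct)
        counts[(block, discrimination)][1] += 1
    return {key: 100.0 * hits / n for key, (hits, n) in counts.items()}
```

Given the design described in the Methods, each block contributes four trials per discrimination, so per-subject block accuracy takes values in steps of 25%.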

RESULTS

Discrimination Training—Accuracy

To investigate the acquisition of instrumental discriminations, we conducted an RM-ANOVA on the percentage of correct responses, with the between-subjects factor Group (controls/PD patients) and within-subject factors Block and Discrimination. As can be seen in Figure 2, the instrumental discriminations were acquired gradually, as supported by a significant effect of Block, F(7, 294) = 17.14, MSE = 541.9, p < .0005. There was no Group × Block interaction, F = 1.02, MSE = 552.2, but discriminative performance was negatively affected overall in the PD patients relative to the CS group, F(1, 42) = 3.99, MSE = 6311, p = .05. We failed, however, to find evidence for a Group × Discrimination interaction, F(2, 84) = 1.70, MSE = 1479, indicating that the congruence effect did not differ between PD patients and controls (see Table 4 for average performance on the three discriminations). A main effect of Discrimination, F(2, 84) = 11.01, MSE = 872.1, p < .0005, prompted pairwise comparisons. Congruent performance was significantly better than both incongruent and standard performance (ps < .01), but, unlike in previous studies, the difference between standard and incongruent performance was only marginally significant (p = .06). Importantly, two-tailed t tests on the percentage of correct responses on the final block of training show that both patients and controls performed significantly above chance on all discriminations (PD, ts ≥ 2.60; CS, ts ≥ 6.01).
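The comparison against chance on the final training block can be illustrated with a one-sample t statistic. This Python sketch rests on two assumptions not stated explicitly in the text: that chance is 50% correct (two response keys) and that the reported tests were one-sample tests against that value. The function name is hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def one_sample_t(values: Sequence[float], mu: float = 50.0) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for a test of mean(values) against mu.

    mu defaults to 50, the assumed chance level for a two-choice task.
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1
```

The resulting t value would then be compared against the two-tailed critical value for n − 1 degrees of freedom.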

Figure 2. 

Average percentage of correct responses (left graph) and response RTs (sec; right graph) across acquisition of congruent, standard, and incongruent discriminations by Parkinson patients on/off medication and the control (CS) group.

Table 4. Average Percentage of Correct Responses during Eight Blocks of Training (B1–B8) on Congruent, Standard, and Incongruent Discriminations, Outcome Devaluation Test Performance (Test), and Questionnaire (Q) Scores for Naming the Correct Response (Q-R) and Outcome (Q-O) by Parkinson Patients On/Off Medication and Control (CS) Group

Group / Discrimination   B1  B2  B3  B4  B5  B6  B7  B8  Test  Q-R  Q-O
PD On
  Congruent              63  72  68  67  75  73  80  85   75   1.7  0.3
  Standard               48  58  65  67  65  75  72  73   65   1.4  0.5
  Incongruent            48  53  55  57  60  72  72  62   38   1.5  0.1
PD Off
  Congruent              62  60  77  70  78  80  87  85   83   1.7  0.5
  Standard               53  55  58  60  67  62  68  62   68   1.4  0.7
  Incongruent            48  57  60  62  73  67  70  67   43   1.6  0.7
CS
  Congruent              55  75  73  84  88  94  86  89   93   1.8  0.8
  Standard               61  75  84  73  82  84  80  88   72   1.5  0.6
  Incongruent            52  55  64  61  83  80  86  88   34   1.6  0.5

To investigate whether medication affected instrumental learning, we conducted a separate RM-ANCOVA on the patients on/off medication. To take into account disease severity, we included each patient's UPDRS score as a covariate. This analysis only yielded a significant effect of Block, F(7, 189) = 2.96, MSE = 630.7, p < .05. There was no main effect of Group (on/off), F < 1, nor a Discrimination × Group interaction, F(2, 54) = 1.23, MSE = 1006. Finally, we failed to find an effect of the UPDRS covariate, F < 1. In the present analysis, there was no significant effect of Discrimination, F < 1, but we do not wish to attribute significance to this null effect in the patient group as we did not find evidence for a Group × Discrimination effect in the prior analysis.

Discrimination Training—Reaction Time

Fast responding was encouraged during discrimination training. In an RM-ANOVA of the RTs (sec) of PD patients versus controls, we only found a significant effect of Block, F(7, 294) = 44.94, MSE = 0.15, p < .0005, reflecting that subjects gradually learned to respond faster, as is depicted in the right panel of Figure 2. PD patients and control subjects responded equally fast overall, F(1, 42) = 1.62, MSE = 6.47, and there were no differences in RT depending on discrimination type, F = 1.03, MSE = 0.237. We also conducted a separate RM-ANCOVA on the patient data to investigate whether RT was affected by medication status and disease severity. There were no significant main effects of medication status, nor of disease severity, Fs < 1, nor any first-order interactions.

Outcome Devaluation Test

As can be seen in Table 4, performance was best on the congruent test trials and worst on the incongruent trials. This was confirmed with an RM-ANOVA with the between-subjects factor Group (PD/CS) and the within-subject factor Discrimination. A significant effect of Discrimination, F(2, 84) = 30.89, MSE = 787.6, p < .0005, was further investigated with pairwise comparisons, which showed that congruent performance was significantly superior to standard performance, which in turn was better than incongruent performance (ps < .05). Separate t tests for the PD/CS groups showed that performance was above chance level only for the congruent and standard discriminations (ps < .05). Therefore, we replicated the congruence effect during test observed in previous studies (de Wit, Corlett, et al., 2009; de Wit et al., 2007). We failed, however, to find an effect of Group on the acquisition/deployment of goal-directed R-O knowledge. The patients and controls did not differ in their level of performance overall, F < 1, nor did Discrimination interact with Group, F = 1.02, MSE = 803.2.

Again, we conducted a separate RM-ANCOVA on test performance of the PD patients, with medication status as a between-subjects variable and UPDRS score as a covariate. In line with our hypothesis, a significant main effect of UPDRS, p < .05, indicated that disease severity negatively affected test performance. A two-tailed Pearson correlational analysis on the average test performance and UPDRS score yielded a significant negative correlation (r = −.37, p < .05), as shown in Figure 3. We also found that patients on medication tended to perform worse overall than patients off medication, but the effect of medication failed to reach significance, F(1, 27) = 3.38, MSE = 519.1, p = .08. Given that the medicated patients were likely to be clinically more severely affected, this apparent effect of medication might reflect an effect of disease severity, in line with the effect of UPDRS reported above.
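
As a rough plausibility check on the reported correlation, the sketch below converts a Pearson r into its t statistic, t = r·√(n − 2)/√(1 − r²), and compares it with the two-tailed critical value. The patient sample size of 30 is not stated in this section; it is an assumption inferred from the reported degrees of freedom (44 participants in the group comparison, of whom 14 were controls). The function name `t_from_r` is illustrative.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


r_reported = -0.37   # correlation between UPDRS score and test performance
n_patients = 30      # ASSUMPTION: inferred from the reported degrees of freedom

t_value = t_from_r(r_reported, n_patients)

# Two-tailed critical t for alpha = .05 with df = 28 (standard t table value).
T_CRIT_DF28 = 2.048
significant = abs(t_value) > T_CRIT_DF28

print(f"t({n_patients - 2}) = {t_value:.2f}, significant at .05: {significant}")
```

Under this assumed sample size, the statistic exceeds the critical value, which agrees with the reported p < .05.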

Figure 3. Plotted is the negative correlation between the average percentage of correct responses on the outcome devaluation test and the UPDRS score of patients on medication (empty circles) and off medication (filled circles).

Outcome Devaluation Test—Reaction Time

An RM-ANOVA with the between-subjects factor Group (PD/CS) and the within-subject factor Discrimination established that RTs during test did not differ between the three discriminations, F(2, 84) = 1.13, MSE = 1.393, with average RTs of 2.5 sec on both congruent and standard trials and 2.8 sec on incongruent trials. RTs of patients and controls were also statistically indistinguishable, F < 1, with average RTs of 2.8, 2.5, and 2.5 sec for the On, Off, and CS groups, respectively. The separate RM-ANCOVA on the PD patients showed that RT also did not depend on medication status, F < 1, and was not related to disease severity, F(1, 27) = 1.61, MSE = 10.06.

Questionnaires

Participants were asked to indicate the correct response and outcome for both discriminative stimuli of each discrimination and were given a point for each correct answer. The scores for response and outcome for each discrimination ranged therefore from a minimum of 0 to a maximum score of 2. These scores (see Table 4) were analyzed separately. One subject in the control group failed to fill out the questionnaire, leaving 13 subjects in that group.

As can be seen in Table 4, there was a high level of explicit S-R knowledge in all three groups, with average scores of 1.5, 1.6, and 1.6 for the On, Off, and CS groups, respectively. An RM-ANOVA comparing scores of PD patients and controls did not yield any significant effects. Explicit knowledge of the S-R relationships was the same for the different discriminations, F(2, 82) = 1.65, MSE = 0.39, and did not differ between the two groups, F < 1. An additional RM-ANCOVA on the performance of the PD patients also failed to yield significant effects. There was no effect of medication status, F < 1, nor of discrimination, F < 1. There was also no evidence for a correlation between disease severity and S-R memory, F(1, 27) = 1.57, MSE = 0.661.

Explicit memory of the instrumental outcomes in each component was uniformly poor, as can be seen in Table 4, with average scores of 0.3, 0.6, and 0.6 for the On, Off, and CS groups, respectively. An RM-ANOVA comparing patients and controls showed that the level of knowledge did not differ between these two groups, F < 1, nor between discriminations, F(2, 82) = 1.33, MSE = 0.35. The RM-ANCOVA on the performance of the patients on versus off medication also did not yield any significant effects. Medication status did not affect explicit knowledge of the outcomes, F(1, 27) = 2.78, MSE = 0.71, nor did disease severity, F < 1.
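
The group averages quoted in the two preceding paragraphs can be recovered from the questionnaire columns of Table 4. The sketch below assumes that each reported average is the unweighted mean of the three per-discrimination means (congruent, standard, incongruent), rounded to one decimal; the dictionary and function names are illustrative.

```python
# Questionnaire means from Table 4, ordered congruent, standard, incongruent.
TABLE4_QUESTIONNAIRE = {
    "PD On":  {"Q-R": [1.7, 1.4, 1.5], "Q-O": [0.3, 0.5, 0.1]},
    "PD Off": {"Q-R": [1.7, 1.4, 1.6], "Q-O": [0.5, 0.7, 0.7]},
    "CS":     {"Q-R": [1.8, 1.5, 1.6], "Q-O": [0.8, 0.6, 0.5]},
}


def group_averages(table):
    """Average each measure over the three discriminations (assumed unweighted)."""
    return {
        group: {
            measure: round(sum(values) / len(values), 1)
            for measure, values in measures.items()
        }
        for group, measures in table.items()
    }


averages = group_averages(TABLE4_QUESTIONNAIRE)
for group, measures in averages.items():
    print(group, measures)
```

Under that assumption, the computed values match the averages reported in the text for both response (Q-R) and outcome (Q-O) knowledge.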

In summary, we report the following findings:

  • PD patients show a general deficit in the acquisition of instrumental discriminations.

  • PD patients and controls perform better on the congruent than on the standard and the incongruent discrimination (whereas performance on the standard discrimination is only marginally significantly better than on the incongruent).

  • PD patients are able to solve the incongruent discrimination, which is thought to rely on direct S→R associations. In support of the latter assumption, patients and controls do not perform above chance level on incongruent trials of the outcome devaluation test.

  • Disease severity correlates negatively with performance on the outcome devaluation test.

  • Explicit memory of the instrumental S:R→O contingencies is not affected in PD patients.

DISCUSSION

This study presents the first direct investigation of a pervasive hypothesis about BG function, according to which early PD is associated with dopaminergic depletion of the dorsal striatum, resulting in impaired S→R habit formation as conceptualized and studied in animal research (e.g., Dickinson, 1985; Thorndike, 1911). This habit account could explain some of the everyday problems that PD patients encounter because behavior may indeed become effortful and be slowed down considerably if one always has to evaluate the outcome of each and every action before undertaking it. A habit deficit has previously been inferred on the basis of impaired performance of PD patients on, for example, the weather prediction task. However, the habitual status of performance on that task has never been assessed by direct sensitivity to current outcome value. Consequently, we do not know whether performance is in fact behaviorally autonomous. In the present study, we tested the habit hypothesis more directly by using an instrumental conflict task (de Wit, Corlett, et al., 2009; de Wit et al., 2007). Disrupted habit formation should have been expressed in an inability to solve the incongruent discrimination, which according to associative theory should rely predominantly on the formation of S→R habits. However, we found that PD patients solved the incongruent discrimination above chance by the end of training, suggesting that they adequately acquired habits.

At the same time we did find evidence for a general impairment during the feedback-based learning phase across discriminations in PD patients compared with controls. We therefore replicated earlier observations of a feedback-based learning deficit (Shohamy, Myers, Kalanithi, & Gluck, 2008). This learning deficit may well be due to impaired S→R formation, and this would certainly be consistent with an earlier demonstration that the dorsal striatum is engaged across discriminations during the acquisition phase of this task (de Wit, Corlett, et al., 2009). However, goal-directed support may similarly rely on the gradual building up of associations, and impaired goal-directed control may therefore also contribute to the present learning deficit. In line with the latter possibility, we found evidence for a disease severity–dependent impairment of performance on the subsequent outcome devaluation test. Performance on this test should not be negatively affected by impaired habit formation as it is mediated by goal-directed knowledge, so this result suggests that progressive PD leads to impaired goal-directed control. In the remainder of this article, we will discuss the implications of as well as potential issues with the presented evidence.

Our evidence for disrupted goal-directed control is consistent with other lines of evidence suggesting that progressive PD leads to a cognitive profile characteristic of more ventral corticostriatal circuits (Agid et al., 1993; Kish et al., 1988). In a previous fMRI study with the conflict task, we showed that the vmPFC was engaged during performance on the outcome devaluation test (de Wit, Corlett, et al., 2009). Moreover, there is evidence to suggest that activations in ventral corticostriatal circuits are modulated by contingency, with vmPFC tracking local changes in correlations between action and outcome rates (Tanaka et al., 2008). Therefore, pFC and anterior caudate nucleus may work together to support goal-directed learning. Interestingly, a goal-directed deficit may concur with a separate literature that highlights a shift from internal to external control in PD (van Spaendonck, Berger, Horstink, Borm, & Cools, 1995; Brown & Marsden, 1988; Cools, van den Bercken, Horstink, van Spaendonck, & Berger, 1984). Observations that PD patients have no problem initiating actions when presented with an unambiguous external stimulus (e.g., Rahman, Griffin, Quinn, & Jahanshahi, 2008; Praamstra, Stegeman, Cools, & Horstink, 1998) suggest that if anything remains intact in PD, it is the ability to act on direct S→R associations. Difficulties with the internal generation of actions and cognitive plans as well as enhanced cue-reliance and stimulus-driven behavior are more in line with a goal-directed impairment.

On the basis of work by, for example, Frank et al. (2004, 2007), showing that instrumental learning is dopamine dependent, we would expect a goal-directed deficit during learning and test to be remediated by dopaminergic medication. However, in the present study, we did not find evidence for superior goal-directed action in patients on medication. In fact, performance of the On group was marginally worse than that of the Off group. It is important to point out that a caveat of this study is that the On and Off groups were not matched well in terms of disease severity, with patients on medication being clinically more strongly affected. The absence of an effect of medication status should be replicated in future studies with more carefully matched patient groups.

One might argue that the instrumental conflict task is not sufficiently sensitive to detect habit formation deficits in PD. However, our results strongly suggest that incongruent performance relies on habit formation. When subjects were asked to select responses on the basis of the instructed value of the instrumental outcomes, they were able to do this only on congruent and standard trials. In contrast, performance on incongruent trials of the outcome devaluation test did not differ from chance level (see also de Wit, Corlett, et al., 2009; de Wit et al., 2007), indicating a lack of goal-directed control over incongruent performance. Further support for this possibility comes from the recent fMRI study with the conflict task, which showed not only that vmPFC is recruited during the outcome devaluation test but also that this area is preferentially engaged during congruent and standard training relative to incongruent (de Wit, Corlett, et al., 2009; de Wit, Ostlund, et al., 2009). We argue that in the absence of goal-directed control over incongruent performance, response selection was guided by direct S→R associations (see also de Wit, Corlett, et al., 2009; de Wit, Ostlund, et al., 2009; de Wit et al., 2007). Thus, the intact acquisition of the incongruent discrimination by PD patients suggests that the ability to form habits in the presence of conflicting O→R associations is not affected.

It could still be argued that a relative S→R habit deficit in PD patients was masked by their reliance on a more declarative rule formation strategy to solve the incongruent discrimination. Moody, Bookheimer, Vanek, and Knowlton (2004) showed that mild PD patients are able to perform normally on a probabilistic task, but at the same time these patients showed activations in temporal brain areas rather than the striatal areas that were activated in the control participants, raising the possibility that the PD patients adopted an alternative, more declarative strategy. Because we have not yet directly investigated propositional encoding of the incongruent discrimination, we cannot fully exclude the possibility that it represents a viable alternative approach to the instrumental conflict task. However, our current study provides two lines of evidence against this possibility. First of all, PD patients performed at chance level during the incongruent trials of the outcome devaluation test. Successful encoding of the incongruent rule should have allowed them to select the appropriate response for each valuable outcome. Second, a questionnaire at the end of training failed to produce evidence for superior declarative knowledge of the instrumental contingencies in PD patients relative to controls.

However, it remains possible that conflict-induced habit formation relies on processes that are different from those underlying habit formation in the context of extensive training, and our data, therefore, do not exclude the possibility that PD is accompanied by deficits in training-induced habit formation. Nevertheless, we should stress that according to dual-system accounts of instrumental action, S→R associations are strengthened even in the early stages of acquisition, and indeed Tricomi et al. (2009) showed that the dorsal putamen was engaged from the outset of instrumental training. Of course goal-directed associations should usually dominate behavior control early on, but conflict due to R-O associations can cause a reliance on S→R associations from the outset (de Wit, Corlett, et al., 2009; de Wit, Ostlund, et al., 2009; de Wit et al., 2007). The possibility that the formation of strong S→R associations with extensive practice is affected in PD could be investigated directly with the kind of paradigm recently employed by Tricomi et al. If S→R habit formation through extensive practice is impaired in PD patients, they should paradoxically outperform control subjects on a subsequent outcome devaluation test.

Finally, it seems important to note that although it is often assumed that the dorsal striatum is crucially involved in S→R habit formation in humans, direct evidence is so far not overwhelming. Human studies have shown that this area is involved in procedural learning, but caution is warranted in equating this with the acquisition of S→R habits as it has been studied in the animal studies that implicate the dorsal striatum. An exception is the recent study of Tricomi et al. (2009), which produced correlational evidence for a role of this area in habit learning. Moreover, in the present study, we addressed for the first time the question whether intact functioning of the human dorsal striatum is a prerequisite for S→R habit formation. Although we failed to produce favorable evidence, this question clearly deserves further scrutiny in human studies employing experimental tasks that are analogous to the carefully constructed paradigms used in animal studies.

In summary, we investigated in PD patients and age-matched controls the ability to form habits by assessing trial-and-error learning of S-R mappings using an instrumental discrimination task but failed to find evidence for a relative impairment in the formation of S→R associations in PD patients. In fact, impaired performance with progressive disease severity on a subsequent outcome devaluation test suggests there may be a deficit in goal-directed control. This goal-directed deficit may be due to progressive depletion of ventral corticostriatal circuits. Therefore, our research does not lend support to the hypothesis that habit formation is disrupted in mild PD patients and consequently highlights the need for caution in accepting the habit account of effortful action in PD. The present findings represent an important initial step toward understanding the effects of PD on goal-directed versus habitual behavior and will hopefully inspire further investigations of instrumental dysfunction in PD with behavioral models that capture this crucial distinction.

Acknowledgments

This research was supported by a fast-track grant of the Parkinson Disease Society to Roshan Cools (RG47229). The study was carried out within the Cambridge Centre for Brain Repair (BRC) and the Behavioural and Clinical Neuroscience Institute (BCNI), Cambridge, UK. The authors thank Mike Aitken for programming support. Furthermore, they thank the BRC staff for their helpful assistance, in particular Sarah Mason and Kate Fisher.

Reprint requests should be sent to Sanne de Wit, Department of Psychology, Amsterdam Center for the Study of Adaptive Control in Brain and Behavior (Acacia), University of Amsterdam, Roeterstraat 15, 1018 WB, The Netherlands, or via e-mail: s.dewit@uva.nl.

Notes

1. Competing associative accounts of goal-directed action stress the importance of either the forward R→O association or the backward O→R association (for a review, see de Wit & Dickinson, 2009). As our research does not aim to distinguish between these accounts, we will for simplicity's sake refer to O→R associations.

2. It could be argued that the ultimate goal in the conflict task was to earn points (rather than the more specific fruit picture outcomes) and that subjects could rely on response-“correct” outcome learning to the same degree in all discrimination learning conditions. However, such a general goal should not allow for the activation of appropriate goal-directed actions via S→O→R associations as the general outcome (of points) should become associated both with right and left responses (we refer the interested reader to the “differential outcomes” literature; see, e.g., Urcuioli, 2005). Furthermore, the outcome devaluation effect provides evidence for reduced reliance on the fruit picture outcome in the incongruent condition.

REFERENCES
Adams, C. D. (1982). Variations in the sensitivity of instrumental responding to reinforcer devaluation. Quarterly Journal of Experimental Psychology, 34B, 77–98.
Agid, Y., Ruberg, M., Javoy-Agid, F., Hirsch, E., Raisman-Vozari, R., Vyas, S., et al. (1993). Are dopaminergic neurons selectively vulnerable to Parkinson's disease? Advances in Neurology, 60, 148–164.
Balleine, B. W., & Dickinson, A. (1998). Goal-directed instrumental action: Contingency and incentive learning and their cortical substrates. Neuropharmacology, 37, 407–419.
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for measuring depression. Archives of General Psychiatry, 4, 561–571.
Benton, A. L. (1968). Differential behavioral effects in frontal lobe disease. Neuropsychologia, 6, 53–60.
Bolles, R. C. (1972). Reinforcement, expectancy, and learning. Psychological Review, 79, 394–409.
Brown, R. G., & Marsden, C. D. (1988). Internal versus external cues and the control of attention in Parkinson's disease. Brain, 111, 323–345.
Brown, R. G., & Marsden, C. D. (1990). Cognitive function in Parkinson's disease: From description to theory. Trends in Neurosciences, 13, 21–29.
Brown, R. G., & Marsden, C. D. (1991). Dual task performance and processing resources in normal subjects and patients with Parkinson's disease. Brain, 114, 215–231.
Cools, A. R., van den Bercken, J. H., Horstink, M. W., van Spaendonck, K. P., & Berger, H. J. (1984). Cognitive and motor shifting aptitude disorder in Parkinson's disease. Journal of Neurology, Neurosurgery and Psychiatry, 47, 443–453.
Corbit, L. H., & Balleine, B. W. (2003). The role of prelimbic cortex in instrumental conditioning. Behavioural Brain Research, 146, 145–157.
Daw, N. D., Niv, Y., & Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience, 8, 1704–1711.
de Wit, S., Corlett, P. R., Aitken, M. R., Dickinson, A., & Fletcher, P. C. (2009). Differential engagement of the ventromedial prefrontal cortex by goal-directed and habitual behavior toward food pictures in humans. Journal of Neuroscience, 29, 11330–11338.
de Wit, S., & Dickinson, A. (2009). Associative theories of goal-directed behaviour: A case for animal-human translational models. Psychological Research, 73, 463–476.
de Wit, S., Kosaki, Y., Balleine, B., & Dickinson, A. (2006). Dorsomedial prefrontal cortex resolves response conflict in rats. Journal of Neuroscience, 26, 5224–5229.
de Wit, S., Niry, D., Wariyar, R., Aitken, M. R. F., & Dickinson, A. (2007). Stimulus–outcome interactions during conditional discrimination learning by rats and humans. Journal of Experimental Psychology: Animal Behavior Processes, 33, 1–11.
de Wit, S., Ostlund, S. B., Balleine, B. W., & Dickinson, A. (2009). Resolution of conflict between goal-directed actions: Outcome encoding and neural control processes. Journal of Experimental Psychology: Animal Behavior Processes, 35, 382–393.
Dickinson, A. (1985). Actions and habits: The development of behavioural autonomy. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 308, 67–78.
Dickinson, A., & de Wit, S. (2003). The interaction between discriminative stimuli and outcomes during instrumental learning. Quarterly Journal of Experimental Psychology, 56B, 127–139.
Fahn, S., Elton, R. L., & Members of the UPDRS Development Committee. (1987). Unified Parkinson's Disease Rating Scale. In S. Fahn, C. D. Marsden, D. Calne, & M. Goldstein (Eds.), Recent developments in Parkinson's disease. Florham Park, NJ: Macmillan Health Care Information.
Faure, A., Haberland, U., Conde, F., & El Massioui, N. (2005). Lesion to the nigrostriatal dopamine system disrupts stimulus-response habit formation. Journal of Neuroscience, 25, 2771–2780.
Frank, M. J., Samanta, J., Moustafa, A. A., & Sherman, S. J. (2007). Hold your horses: Impulsivity, deep brain stimulation, and medication in Parkinsonism. Science, 318, 1309–1312.
Frank, M. J., Seeberger, L. C., & O'Reilly, R. C. (2004). By carrot or by stick: Cognitive reinforcement learning in parkinsonism. Science, 306, 1940–1943.
Hoehn, M. M., & Yahr, M. D. (1967). Parkinsonism: Onset, progression and mortality. Neurology, 17, 427–442.
James, W. (1890). The principles of psychology. New York: Dover Publications.
Joel, D., & Weiner, I. (2000). The connections of the dopaminergic system with the striatum in rats and primates: An analysis with respect to the functional and compartmental organization of the striatum. Neuroscience, 96, 451–474.
Killcross, S., & Coutureau, E. (2003). Coordination of actions and habits in the medial prefrontal cortex of rats. Cerebral Cortex, 13, 400–408.
Kimble, G. A., & Perlmutter, L. C. (1970). The problem of volition. Psychological Review, 77, 361–384.
Kish, S. J., Shannak, K., & Hornykiewicz, O. (1988). Uneven pattern of dopamine loss in the striatum of patients with idiopathic Parkinson's disease. Pathophysiologic and clinical implications. New England Journal of Medicine, 318, 876–880.
Knowlton, B. J., Mangels, J. A., & Squire, L. R. (1996). A neostriatal habit learning system in humans. Science, 273, 1399–1402.
Maudsley, H. (1876). The physiology of mind. London: Kessinger Publishing.
Mishkin, M., Malamut, B., & Bachevalier, J. (1984). Memories and habits: Two neural systems. In G. Lynch, J. L. McGaugh, & N. M. Weinberger (Eds.), Neurobiology of learning and memory (pp. 65–77). New York: Guilford.
Moody, T. D., Bookheimer, S. Y., Vanek, Z., & Knowlton, B. J. (2004). An implicit learning task activates medial temporal lobe in patients with Parkinson's disease. Behavioral Neuroscience, 118, 438–442.
Nelson, H. E. (1982). National Adult Reading Test (NART) test manual. Windsor, UK: NFER-Nelson.
O'Doherty, J., Dayan, P., Schultz, J., Deichmann, R., Friston, K., & Dolan, R. J. (2004). Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science, 304, 452–454.
Owen, A. M., James, M., Leigh, P. N., Summers, B. A., Marsden, C. D., Quinn, N. P., et al. (1992). Fronto-striatal cognitive deficits at different stages of Parkinson's disease. Brain, 115, 1727–1751.
Owen, A. M., Sahakian, B. J., Hodges, J. R., Summers, B. A., Polkey, C. E., & Robbins, T. W. (1995). Dopamine-dependent frontostriatal planning deficits in early Parkinson's disease. Neuropsychology, 9, 126–140.
Packard, M. G., & Knowlton, B. J. (2002). Learning and memory functions of the basal ganglia. Annual Review of Neuroscience, 25, 563–593.
Praamstra, P., Stegeman, D. F., Cools, A. R., & Horstink, M. W. (1998). Reliance on external cues for movement initiation in Parkinson's disease. Evidence from movement-related potentials. Brain, 121, 167–177.
Rahman, S., Griffin, H. J., Quinn, N. P., & Jahanshahi, M. (2008). The factors that induce or overcome freezing of gait in Parkinson's disease. Behavioural Neurology, 19, 127–136.
Reynolds, J. N., Hyland, B. I., & Wickens, J. R. (2001). A cellular mechanism of reward-related learning. Nature, 413, 67–70.
Sahakian, B. J., Morris, R. G., Evenden, J. L., Heald, A., Levy, R., Philpot, M., et al. (1988). A comparative study of visuospatial memory and learning in Alzheimer-type dementia and Parkinson's disease. Brain, 111, 695–718.
Shohamy, D., Myers, C. E., Grossman, S., Sage, J., Gluck, M. A., & Poldrack, R. A. (2004). Cortico-striatal contributions to feedback-based learning: Converging data from neuroimaging and neuropsychology. Brain, 127, 851–859.
Shohamy, D., Myers, C. E., Kalanithi, J., & Gluck, M. A. (2008). Basal ganglia and dopamine contributions to probabilistic category learning. Neuroscience and Biobehavioral Reviews, 32, 219–236.
Shohamy, D., Myers, C. E., Onlaor, S., & Gluck, M. A. (2004). Role of the basal ganglia in category learning: How do patients with Parkinson's disease learn? Behavioral Neuroscience, 118, 676–686.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643–662.
Swainson, R., SenGupta, D., Shetty, T., Watkins, L. H., Summers, B. A., Sahakian, B. J., et al. (2006). Impaired dimensional selection but intact use of reward feedback during visual discrimination learning in Parkinson's disease. Neuropsychologia, 44, 1290–1304.
Tanaka, S. C., Balleine, B. W., & O'Doherty, J. P. (2008). Calculating consequences: Brain systems that encode the causal effects of actions. Journal of Neuroscience, 28, 6750–6755.
Taylor, A. E., Saint-Cyr, J. A., & Lang, A. E. (1986). Frontal lobe dysfunction in Parkinson's disease. The cortical focus of neostriatal outflow. Brain, 109, 845–883.
Thorndike, E. L. (1911). Animal intelligence: Experimental studies. New York: Macmillan.
Tolman, E. C. (1932). Purposive behavior in animals and man. New York: Century.
Tricomi, E., Balleine, B. W., & O'Doherty, J. P. (2009). A specific role for posterior dorsolateral striatum in human habit learning. European Journal of Neuroscience, 29, 2225–2232.
Urcuioli, P. J. (2005). Behavioral and associative effects of differential outcomes in discrimination learning. Learning & Behavior, 33, 1–21.
Valentin, V. V., Dickinson, A., & O'Doherty, J. P. (2007). Determining the neural substrates of goal-directed learning in the human brain. Journal of Neuroscience, 27, 4019–4026.
van Spaendonck, K. P., Berger, H. J., Horstink, M. W., Borm, G. F., & Cools, A. R. (1995). Card sorting performance in Parkinson's Disease: A comparison between acquisition and shifting performance. Journal of Clinical and Experimental Neuropsychology, 17, 918–925.
Wise, R. A. (2004). Dopamine, learning and motivation. Nature Reviews Neuroscience, 5, 483–494.
Witt, K., Daniels, C., Daniel, V., Schmitt-Eliassen, J., Volkmann, J., & Deuschl, G. (2006). Patients with Parkinson's disease learn to control complex systems - an indication for intact implicit cognitive skill learning. Neuropsychologia, 44, 2445–2451.
Yin, H. H., & Knowlton, B. J. (2006). The role of the basal ganglia in habit formation. Nature Reviews Neuroscience, 7, 464–476.
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2004). Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. European Journal of Neuroscience, 19, 181–189.
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2005). Blockade of NMDA receptors in the dorsomedial striatum prevents action-outcome learning in instrumental conditioning. European Journal of Neuroscience, 22, 505–512.