## Abstract

Humans and animals must evaluate the costs and expected benefits of their actions to make adaptive choices. Prior studies have demonstrated the involvement of the basal ganglia in this evaluation. However, little is known about the role in this process of the external part of the globus pallidus (GPe), which is well positioned to integrate motor and reward-related information. To investigate this role, the activity of 126 neurons was recorded in the associative and limbic parts of the GPe of two monkeys performing a behavioral task in which different levels of force were required to obtain different amounts of liquid reward. The results first revealed that the activity of associative and limbic GPe neurons could be modulated not only by cognitive and limbic information but also by motor information, either within a single period or across different periods of the trial, and mainly in an independent way. Moreover, as a population, GPe neurons encoded these types of information dynamically throughout the trial, at the times when each piece of information was most relevant for the achievement of the action. Taken together, these results suggest that GPe neurons could be dedicated to the parallel monitoring of task parameters essential to adjusting and maintaining goal-directed behavior.

## INTRODUCTION

The motivational value of any action takes into account both the cost and the benefit gained once the action is completed. Costs can include the effort required to perform a movement, the number of actions required, or the temporal delay before reward delivery. The benefits can depend on reward preference and/or magnitude. There is a large body of evidence that the basal ganglia (BG) are involved in the encoding of reward-related information and, more generally, motivational processes. Neuropsychological (Bhatia & Marsden, 1994; Laplane, Widlocher, & Pillon, 1981), human imaging (Schmidt et al., 2008; Pessiglione et al., 2007), lesion (Baunez, Dias, Cador, & Amalric, 2005; Berridge & Cromwell, 1990), and pharmacological (Grabli et al., 2004) studies have shown the involvement of several BG structures in these processes. In monkeys, the reward-related activities observed in BG input structures, such as the striatum and the subthalamic nucleus (STN; Nougaret & Ravel, 2015; Lau & Glimcher, 2008; Darbaky, Baunez, Arecchi, Legallet, & Apicella, 2005; Samejima, Ueda, Doya, & Kimura, 2005; Ravel, Legallet, & Apicella, 2003; Apicella, Ljungberg, Scarnati, & Schultz, 1991; Hikosaka, Sakamoto, & Usui, 1989), as well as in output structures, such as the internal part of the globus pallidus (GPi) and the substantia nigra pars reticulata (SNr; Joshua, Adler, Rosin, Vaadia, & Bergman, 2009; Pasquereau et al., 2007), are thought to be partly supported by dopaminergic neurons' activity (Morris, Arkadir, Nevet, Vaadia, & Bergman, 2004; Satoh, Nakai, Sato, & Kimura, 2003; Hollerman & Schultz, 1998).

Few studies have addressed the role of the external part of the globus pallidus (GPe) in motivational processes. Indeed, the GPe has been considered primarily as a motor relay station in the indirect cortico-GPe-STN-thalamo-cortical pathway (Vaillancourt, Yu, Mayka, & Corcos, 2007; Turner & Anderson, 1997; Mink & Thach, 1991a, 1991b; DeLong, 1971). However, the view of the GPe within the BG has changed in the past decades. Shin and Sommer (2010) have described the role of GPe neurons in oculomotor behavior, showing neuronal activities related to visual stimuli triggering saccades and reward occurrence. Arkadir, Morris, Vaadia, and Bergman (2004) have shown that GPe activity is driven not only by cues predicting future reward probability but also by cues predictive of aversive outcomes (Joshua et al., 2009). It has recently been shown that GPe neurons are important for cognitive functions such as learning (Schechtman, Noblejas, Mizrahi, Dauber, & Bergman, 2016) and for encoding the stable reward value of an object (Kim, Amita, & Hikosaka, 2017). Recent studies have highlighted the role of the GPe as a central player in the orchestration of neuronal activity within the BG network (Deffains et al., 2016; Mallet et al., 2016; Bolam, Hanley, Booth, & Bevan, 2000). It provides a massive GABAergic input to the striatum, STN, and BG output structures and can modulate cortical afferents. Those results raised questions about the role of the GPe, which is mainly thought to be involved in BG circuitry as a motor relay, and shed light on the need to reevaluate the GPe's function in linking actions to their rewarding values.

Using an operant task, we investigated whether and how two factors influencing motivated behavior, namely reward size and the effort required to obtain it, are represented in the GPe. GPe neurons were recorded in monkeys performing a task allowing the modulation of both the size of the reward and the level of force required to obtain it. In parallel, given the design of the task, the influence of the applied force itself on neuronal activity was also studied. The aim of the study was to determine whether GPe neurons could integrate force and reward information as a motivational index. Although this question has been a topic of interest in the BG, it has been overlooked in studies of the GPe. Given the position of the GPe in the circuitry, we could expect neuronal activities in this structure to reflect an integration of motivational information, such as activities showing the same trend as the acceptance level of the monkeys or postreward activities modulated in proportion to the effort produced to obtain the reward. However, no population of neurons specifically encoded the cost–benefit ratio of the action or the attractiveness of the cues, as had been found in the striatum (Nougaret & Ravel, 2015). Instead, we show that neurons of the associative and limbic GPe display activities closely related to task parameters (i.e., motor execution and/or significance of the cues) at each step of the behavior. GPe neurons seemed more likely to process the force and reward information carried by the stimuli in an integrative way right after the stimuli's occurrence, whereas they encoded the two types of information independently later in the trial, so as to maintain the integrity of each piece of information. These neurons are also highly sensitive to reward size, mostly independently of the effort required to obtain the reward. Thus, the GPe is not directly involved in the encoding of motivational processes but plays a role in processing motor or cognitive information essential to performing appropriate goal-directed behavior and could send its valuation to the output structures once the action is performed and the reward is obtained.

## METHODS

### Animal and Apparatus

Two male rhesus monkeys (Macaca mulatta), weighing 8 and 7 kg at the beginning of the experiments (Monkeys M and Y, respectively), were trained to apply and maintain a force on a lever in response to visual cues to receive a liquid reward. All experimental procedures were in compliance with the National Institutes of Health's Guide for the Care and Use of Laboratory Animals, the French laws on animal experimentation, and the European Directive on the protection of animals used for scientific purposes.

### Behavioral Procedures

The monkeys were seated in a Plexiglas primate chair and faced a panel supporting a 17-in. screen on which visual cues could be presented. The screen was positioned 18 cm from the monkey; a lever equipped with strain gauges in the lower part of the panel was positioned at waist level. A sliding door at the front of the primate chair could be opened to allow the animal to position its hand on the lever. The liquid reward (water) was delivered via a metal spout positioned directly in front of the monkey's mouth. The liquid was delivered through a solenoid valve located outside the recording room.

As illustrated in Figure 1A, at the beginning of each trial, the monkeys had to develop a basal pressing force on the lever, between 0% and 20% of the maximal force, defined experimentally at 900 g based on the capabilities of the animals, during a 1-sec preparatory period. After this period, two visual cues, a green one and a red one, each being either a filled circle or a filled square, were presented vertically in the center of the screen. The shape of the green stimulus indicated the level of force the animals had to produce on the lever, and that of the red stimulus indicated the size of the upcoming reward. A green circle indicated that the animals had to produce a force between 20% and 55% of the maximal force (180–495 g; low force: f); a green square, a force between 55% and 90% of the maximal force (495–810 g; high force: F). In the same way, a red circle indicated to the animals that the reward delivered would be small (0.3 mL of water; small reward: r), whereas a red square indicated that it would be large (1.2 mL of water; large reward: R). The four possible combinations of cues (fR, FR, fr, and Fr) set the four different conditions of the task. In response to these stimuli, monkeys had to increase their pressing force on the lever to reach the required force in a period shorter than 1 sec (maximal reaction time [RT]) and hold this force for 1 sec (holding time) to obtain the reward. For each correct trial, monkeys were rewarded with the small or large reward according to the shape of the red stimulus. Both cues were extinguished as soon as the reward was delivered. To achieve the required force, monkeys were helped by visual feedback: A vertical rectangle representing the range of the required force was located just below the cues. In this rectangle, a white cursor indicated in real time the force developed on the lever whenever it was within the required force range. To keep cues constant across trial conditions, the animals saw the same rectangle for both the low and high force ranges. After reward delivery, the monkeys returned to a basal pressing force in preparation for the next trial. The next trial did not begin until the total duration of the current trial, which lasted 4.5 sec regardless of the animal's behavior, had elapsed.
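
The force windows and cue combinations described above can be written out as a small lookup, which also makes the gram values easy to check against the percentages. This is only an illustrative sketch: the identifiers (`MAX_FORCE_G`, `FORCE_WINDOWS`, `window_in_grams`, and so on) are hypothetical names and are not taken from the task-control software.

```python
# Task parameters taken from the text; all identifiers are illustrative.
MAX_FORCE_G = 900  # maximal force, in grams

# Force windows expressed as fractions of the maximal force.
FORCE_WINDOWS = {
    "basal": (0.00, 0.20),  # preparatory period
    "f": (0.20, 0.55),      # low force, green circle
    "F": (0.55, 0.90),      # high force, green square
}

# Reward sizes in mL of water.
REWARD_ML = {"r": 0.3, "R": 1.2}  # red circle, red square


def window_in_grams(label):
    """Return the (lower, upper) bounds of a force window in grams."""
    low, high = FORCE_WINDOWS[label]
    return round(low * MAX_FORCE_G), round(high * MAX_FORCE_G)


# The four cue combinations, in the order used in the text.
CONDITIONS = [force + reward for reward in ("R", "r") for force in ("f", "F")]

for condition in CONDITIONS:
    force, reward = condition  # e.g. "fR" -> "f", "R"
    print(condition, window_in_grams(force), "g,", REWARD_ML[reward], "mL")
```

Note that 495 g is both the upper bound of the low range and the lower bound of the high range, as in the text.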

Figure 1.

Task design and reconstruction of recording locations. (A) Task design. A pair of visual stimuli appeared after the maintenance of a basal pressing force by the animal for 1 sec. In response to these stimuli, the monkey increased its pressing force, reached the required force range, and held this force for 1 sec to obtain the reward. Four possible combinations of visual stimuli indicated to the animal the force he had to develop and the size of the upcoming reward. Green represented the force, red represented the reward, a filled circle represented a small size, and a filled square represented a large one. (B, C) Reconstruction of recording locations. All electrode recording sites were obtained from MR images. (B) 3-D reconstruction of the GPe of Monkey Y (left) and Monkey M (right) (transparent gray). Electrode artifacts are visible in each MR slice in the background (AC + 2 for Monkey Y, AC + 1 for Monkey M). The location of each recorded neuron was extrapolated from the tip of these electrodes. (C) Recording sites for Monkey Y from AC + 2 to AC − 2. GPe borders (black lines) have been drawn from the 3-D reconstruction images. Each dot represents the location of one neuron recorded during the task. The black ones represent the locations of associative neurons; and the white ones, the locations of limbic neurons. Arrows provide the orientation of the slice (D = dorsal; L = lateral).

There were three different cases in which a trial was considered as failed and no reward was given. First, trials in which the required force was not reached within the 1-sec force development period were considered “omission errors.” Second, trials in which the required force was not held for at least 1 sec (holding time) were considered “holding errors.” Last, trials in which the force developed was greater than the upper limit of the required force (495 and 810 g for the low and high forces, respectively) were considered “threshold errors.” Both “holding” and “threshold” errors were considered to be execution errors. In case of an error, the same combination of cues was presented again to the monkeys until they performed the trial correctly. Moreover, trials in which the monkeys began to increase their pressing force within 100 msec after the occurrence of the cues were considered to be anticipations and were not included in the database.
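
The trial outcomes defined above can be summarized as a simple decision rule. The sketch below is one possible reading, not the authors' implementation: the function and argument names are hypothetical, and the order of the checks and the handling of boundary values (for example, an onset at exactly 100 msec) are assumptions that the text does not specify.

```python
ANTICIPATION_MS = 100  # responses starting earlier than this are discarded
MAX_RT_MS = 1000       # time allowed to reach the required force range
HOLD_MS = 1000         # time the force must be held in the range


def classify_trial(onset_ms, reach_ms, held_ms, peak_force_g, upper_limit_g):
    """Label one trial (hypothetical helper, argument names are assumptions).

    onset_ms      -- time from cue onset to the start of the force increase
    reach_ms      -- time from cue onset to entering the required range,
                     or None if the range was never reached
    held_ms       -- time spent in the range after entering it
    peak_force_g  -- largest force developed during the trial
    upper_limit_g -- upper bound of the required range (495 or 810 g)
    """
    if onset_ms < ANTICIPATION_MS:
        return "anticipation"      # excluded from the database
    if reach_ms is None or reach_ms > MAX_RT_MS:
        return "omission"
    if peak_force_g > upper_limit_g:
        return "threshold"
    if held_ms < HOLD_MS:
        return "holding"
    return "correct"


def is_execution_error(label):
    """Holding and threshold errors are both execution errors."""
    return label in ("holding", "threshold")


print(classify_trial(250, 600, 1000, 450, 495))  # a well-performed low-force trial
```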

Before the electrophysiological recordings began, the monkeys were extensively trained (4–6 months) until they reached a performance threshold of 80% correct trials with the preparatory period, the maximal RT, and the holding time all set to 1 sec. In each recording session, the four different combinations of cues were presented pseudorandomly from trial to trial. The first trial of a session was randomly chosen from a list of trials in which each condition was present in the same proportion. The same cues were not presented more than three times sequentially if trials were performed correctly.
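
One way to obtain a pseudorandom order in which no combination of cues occurs more than three times in a row is sketched below. The sampling scheme is an assumption made for illustration (uniform draws with the repeated condition excluded after a run of three); the text does not describe the actual procedure, and the repetition of cues after error trials is deliberately left out.

```python
import random

CONDITIONS = ("fR", "FR", "fr", "Fr")
MAX_RUN = 3  # the same cues are shown at most three times in a row


def next_condition(history, rng):
    """Draw the next condition, excluding one that just filled a run of MAX_RUN."""
    recent = history[-MAX_RUN:]
    if len(recent) == MAX_RUN and len(set(recent)) == 1:
        allowed = [c for c in CONDITIONS if c != recent[0]]
    else:
        allowed = list(CONDITIONS)
    return rng.choice(allowed)


def make_sequence(n_trials, seed=0):
    """Build a sequence of n_trials conditions (correct trials only)."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        history.append(next_condition(history, rng))
    return history


def longest_run(sequence):
    """Length of the longest stretch of identical consecutive items."""
    best = run = 0
    previous = None
    for item in sequence:
        run = run + 1 if item == previous else 1
        best = max(best, run)
        previous = item
    return best


sequence = make_sequence(40, seed=1)
print(sequence[:12], "longest run:", longest_run(sequence))
```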

### Surgery

Initial anesthesia was administered by an intramuscular injection of ketamine (10 mg/kg) and xylazine (0.5 mg/kg), followed by deep anesthesia induced by isoflurane. A polyether-ether-ketone recording chamber (19-mm inner diameter) was implanted over the left hemisphere. Recording chambers in both monkeys were positioned with a 20° angle laterally in the coronal plane. The targeted stereotaxic coordinates, relative to ear bars, were as follows: Monkey M: anterior = 18 mm, lateral = 16 mm; Monkey Y: anterior = 14 mm, lateral = 16 mm, based on the atlas of Saleem and Logothetis (2007). During the same surgery, two titanium cylinders were embedded in the orthopedic cement (Palacos with gentamycin) and fixed to the skull with titanium orthopedic bone screws for subsequent head restraint during neuronal recordings. After surgery, monkeys were given antibiotics (Marbocyl, 2 mg/kg) and analgesics (Tolfedine, 4 mg/kg) on the day of the surgery and for the 4 following days. The recording chamber was filled with an antibiotic solution (Marbocyl, 2 mg/mL) and sealed with a removable cap.

### Electrophysiological Recordings

While the monkeys were performing the task with head immobilization, extracellular activity of single neurons was recorded with custom-made glass-insulated tungsten microelectrodes based on the technique of Merrill and Ainsworth (1972). To record from the BG structures, a stainless steel guide tube (diameter = 0.6 mm) was lowered below the surface of the dura, and the microelectrode was passed inside the guide and advanced using a manual hydraulic microdrive (M096; Narishige). The microelectrode was connected to a preamplifier located in close proximity to the microdrive. The neuronal signal was then amplified 5,000 times, filtered at 0.3–1.5 kHz, and converted to digital pulses through a window discriminator (Neurolog; Digitimer). The presentation of the cues, the force developed by the animal, the delivery of the reward, and digital pulses from neuronal activity were controlled by a computer using custom-designed software written in LabVIEW (National Instruments).

The recording electrode was lowered to isolate neurons while the monkey performed the task. We isolated single neurons by continuously monitoring the waveform of the recorded neuronal impulses on an oscilloscope. Neurons were localized in the GPe using expected stereotaxic coordinates and the characteristic firing patterns associated with neurons in regions dorsal to the GPe. Along the electrode trajectory, striatal tissue dorsal to the GPe could be identified by the presence of both tonically active neurons (tonic firing rates ranging from 3 to 10 spikes/sec; Apicella, Legallet, & Trouche, 1997) and the very low-frequency activity of phasically active neurons (Apicella, 2002). As the electrode continued to be lowered, the dorsolateral border of the GPe was identified by an increase in background noise immediately after a short silence (DeLong, 1971).

Within the GPe, most neurons exhibited high-frequency activity, in many cases interrupted by pauses. A minority of neurons had low-frequency discharge rates interrupted by high-frequency bursts. The typical electrophysiological activity of individual GPe neurons was characterized by a narrow and high-amplitude waveform (Elias et al., 2007; DeLong, 1971). The detection of the ventrolateral border of the GPe was based on the dorsoventral length of the GPe for each coronal plane, as described in the atlas of Saleem and Logothetis (2007), and assured by the presence, when entering the GPi, of most neurons displaying a high-frequency discharge with no pauses. The activity of the first well-isolated and stable pallidal unit in a trajectory was recorded for at least 10 trials per condition. After recording from a GPe neuron, the electrode was moved forward until another GPe neuron was encountered. Data from all GPe neurons recorded were included in analyses.

### Localization of Recordings

To assess the localization of our recordings, we used a high-resolution MRI scan for each monkey with electrodes positioned (five for Monkey M and six for Monkey Y) in trajectories from which we recorded GPe neurons. MR images were collected using a T1-weighted sequence (recovery time = 1700 msec, echo time = 4.414 msec, flip angle = 30°, in-plane resolution = 0.6 × 0.6 mm, thickness = 0.6 mm). On the basis of the localization of these electrode tips, we extrapolated the inferior/superior, anterior/posterior, and medial/lateral positions of each recorded neuron to generate a 3-D reconstruction using Brainsight software (Rogue Research; Figure 1B and C). The GPe position was determined based on the anterior commissure (AC) visualization as well as the shape of the striatum and surrounding cortical areas. The slices of the MR images of each monkey were matched with atlas sections.

The delimitations of the sensorimotor, associative, and limbic territories of the GPe were then determined based on previous studies (Grabli et al., 2004; François, Yelnik, Percheron, & Fénelon, 1994; Haber, Lynd-Balta, & Mitchell, 1993) and localized on a map of the GPe based on the MRI slices of each monkey.

### Data Analyses

All data analyses were performed using conventional statistical procedures with the R statistical computing environment (R Development Core Team, 2011). Data were analyzed from 13,046 trials performed during 126 recording sessions: 3,293 were performed in the fr condition; 2,656, in the fR condition; 3,975, in the Fr condition; and 3,122, in the FR condition.

#### Behavioral Analyses

RT, which was the duration between the onset of the cue and the time at which the monkey started to increase its pressing force on the lever, was measured only for correct trials. RTs were converted to z scores for normalization purposes, and a two-way ANOVA was performed with Required force and Expected reward as the two factors. Error rates (ERs; i.e., the total number of errors performed in a condition divided by the total number of trials [both correct and error trials] performed in this condition) were calculated and compared with a Pearson's chi-squared test. Each p value was subjected to Bonferroni correction, and differences were considered to be significant when p < .0083 (0.05/6, six possible pairwise comparisons among the four conditions). In each condition, the proportion of omission and execution errors was determined by dividing the number of one type of error (execution or omission) by the total number of errors in this condition. Acceptance level was computed by dividing the total number of trials accepted by the animal in a given condition (correct trials + holding and threshold errors) by the total number of trials performed in this condition. This acceptance level reflects whether the animal chose to perform the task or not, depending on the level of force and the reward size. The force developed by the animals in each trial at each time of the task was collected and averaged by condition to highlight possible differences within the same range of force between two different reward conditions.
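
The behavioral measures defined above reduce to a few ratios of trial counts. The sketch below is an illustration in Python rather than the R code used for the analyses; the function names are hypothetical, the counts in the usage lines are made up, and the chi-squared helper assumes a 2 × 2 table (errors vs. correct trials in two conditions) without continuity correction, which the text does not specify.

```python
import math

# Four conditions give six pairwise comparisons, hence the corrected threshold.
N_COMPARISONS = math.comb(4, 2)
ALPHA_CORRECTED = 0.05 / N_COMPARISONS  # about .0083


def error_rate(n_correct, n_omission, n_execution):
    """Errors divided by all trials (correct and error) of a condition."""
    total = n_correct + n_omission + n_execution
    return (n_omission + n_execution) / total


def acceptance_level(n_correct, n_omission, n_execution):
    """Accepted trials (correct + execution errors) divided by all trials."""
    total = n_correct + n_omission + n_execution
    return (n_correct + n_execution) / total


def omission_share(n_omission, n_execution):
    """Proportion of omission errors among all errors of a condition."""
    return n_omission / (n_omission + n_execution)


def chi2_2x2(errors_a, correct_a, errors_b, correct_b):
    """Pearson chi-squared statistic and p value for a 2 x 2 table (1 df)."""
    a, b, c, d = errors_a, correct_a, errors_b, correct_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Made-up counts, for illustration only.
print(error_rate(900, 40, 60), acceptance_level(900, 40, 60))
chi2, p = chi2_2x2(100, 900, 40, 960)
print(chi2, p, p < ALPHA_CORRECTED)
```

Because omission errors are the only trials counted as not accepted, the acceptance level equals one minus the omission rate.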

#### Electrophysiological Analyses

Electrophysiological data were analyzed only for correct trials performed during the recording sessions. The average baseline firing rate and coefficient of variation (CV) of the interspike interval (ISI) were calculated for each neuron and for each monkey across all conditions. The average firing rate was calculated during a baseline period of 1 sec preceding the occurrence of the cues, corresponding to the preparatory period (Figure 1A). The CV was calculated during the same baseline period and corresponded to the standard deviation of the ISI divided by the mean of the ISI. Trials were divided into three periods for neuronal activity analyses (Figure 1A). The “cue threshold period” started with the occurrence of the cues and ended when the force developed on the lever exceeded the lower threshold of the force range. As a consequence, the duration of this period varied across trials depending on the animal's behavior. The “holding period” corresponded to the 1-sec period during which the monkey had to hold the required force, from the end of the cue threshold period to the reward delivery. Finally, the “postreward” period was a 1-sec period after reward delivery. In our task, the force applied by the animals and the force required, based on the shape of the stimuli, covaried strongly. As a consequence, they could not be entered as factors in the same model. To disentangle the “motor” modulation, that is, modulation by the force applied by the animals, from the “factors” modulation, that is, modulation by the force required, the expected/received reward, and the interaction between both, we performed a three-step iterative generalized linear model (GLM) analysis for each of the three periods. For the first iteration, we considered that the force applied can be modeled as:
$\text{F.app} = \text{F.app}_{F} + \text{F.app}_{R} + \text{F.app}_{F,R} + \text{F.appRes}$
In this model, we assumed that the force applied by the animals (F.app) can be explained by the amount of force required ($\text{F.app}_{F}$), the size of the expected/received reward ($\text{F.app}_{R}$), the interaction between both ($\text{F.app}_{F,R}$), and a residual part not explained by those factors (F.appRes). From this first iteration, we extracted the residual part F.appRes, the part of the force applied that was not explained by the factors. F.appRes was used in the second iteration to evaluate the modulation of the spike count by the force applied, after the modulation by the factors had been extracted from it:
$\text{Spikecount} = \text{Spikecount}_{\text{F.appRes}} + \text{SpikecountRes}$
SpikecountRes, the part of the spike count not explained by the force applied, was extracted and used in the last iteration. We thus considered that SpikecountRes represented the spike count cleared of modulations elicited by the force applied and measured the effect of the information about force and reward carried by the cues on this spike count as follows:
$\text{SpikecountRes} = \text{SpikecountRes}_{F} + \text{SpikecountRes}_{R} + \text{SpikecountRes}_{F,R} + \text{SpikecountRes}_{Res}$
In this third step, we measured the modulation of the spike count not explained by the force applied (SpikecountRes) by the amount of force required ($\text{SpikecountRes}_{F}$), the size of the expected/received reward ($\text{SpikecountRes}_{R}$), and the interaction between both ($\text{SpikecountRes}_{F,R}$), leaving a part not explained by those factors ($\text{SpikecountRes}_{Res}$). In summary, this iteration returned information about the modulation of the spike count by the force and reward factors. To control the family-wise error (FWE) rate arising from multiple comparisons (three periods and 126 neurons) and to minimize the probability of making Type I errors under the null hypothesis, we performed bootstrap analyses for the second and third iterations (Lindquist & Mejia, 2015; Maris & Oostenveld, 2007). Moreover, this approach allowed us to compute p values without making distributional assumptions about the data. Bootstrap consisted of randomly resampling neuronal data to obtain replications of the same size as the original data set. We repeated the analysis 999 times for each period and each neuron, each time with a different resampling. We extracted the likelihood ratio for each resampled data set and compared the values with the one obtained from the original data set. If the original likelihood ratio fell in the highest ventile (equivalent p value of .05), the neuron was considered to be significantly modulated by the factor considered during the analyzed period. The number of neurons modulated by the force applied after the second iteration and the number of neurons modulated by the force and reward factors and their interaction after the third iteration were collected. For each neuron and each period, we estimated a force selectivity index (FSI) and a reward selectivity index (RSI). The selectivity indices (SI) were defined as follows:
$SI = \dfrac{\mu_1 - \mu_2}{\sqrt{(SS_1 + SS_2) / (df_1 + df_2)}}$
where $\mu_x$ was the mean of the SpikecountRes during a given period, $SS_x$ was the sum of squares, and $df_x$ was the degrees of freedom (number of trials − 1) for each condition of the pairs described below (Peck, Lau, & Salzman, 2013). To calculate the FSI, we compared the neuronal activity during trials in the high-force conditions (Fr and FR) with the neuronal activity during trials in the low-force conditions (fr and fR). To calculate the RSI, we compared, in the same periods, the neuronal activity during trials in the large reward conditions (fR and FR) with the neuronal activity during trials in the small reward conditions (fr and Fr). An index above zero indicated a stronger modulation in the high conditions, whereas an index below zero indicated a stronger modulation in the low conditions.
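
The logic of the residualization steps and of the selectivity indices can be illustrated with a simplified sketch. It is not the authors' GLM code: it assumes ordinary least squares with a Gaussian error (the family of the GLM is not stated in the text), it exploits the fact that a full force × reward model with two binary factors is saturated, so that its fitted values are the four condition means, and it omits the likelihood-ratio bootstrap used to assess significance. All function names and the example data are hypothetical.

```python
import math
from statistics import mean


def factorial_residuals(values, force, reward):
    """Residuals after removing force, reward, and their interaction.

    With two binary factors the full factorial model is saturated,
    so the fitted values are the four condition means.
    """
    cells = {}
    for v, f, r in zip(values, force, reward):
        cells.setdefault((f, r), []).append(v)
    cell_mean = {key: mean(vals) for key, vals in cells.items()}
    return [v - cell_mean[(f, r)] for v, f, r in zip(values, force, reward)]


def remove_linear_effect(y, x):
    """Subtract the least-squares linear contribution of x from y."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return [yi - slope * (xi - mx) for xi, yi in zip(x, y)]


def selectivity_index(group_1, group_2):
    """SI = (mu1 - mu2) / sqrt((SS1 + SS2) / (df1 + df2))."""
    m1, m2 = mean(group_1), mean(group_2)
    ss1 = sum((v - m1) ** 2 for v in group_1)
    ss2 = sum((v - m2) ** 2 for v in group_2)
    df = (len(group_1) - 1) + (len(group_2) - 1)
    return (m1 - m2) / math.sqrt((ss1 + ss2) / df)


def residualize_and_index(force_applied, spike_count, force, reward):
    """Steps 1 and 2, followed by FSI and RSI computed on SpikecountRes."""
    f_app_res = factorial_residuals(force_applied, force, reward)  # step 1
    spike_res = remove_linear_effect(spike_count, f_app_res)       # step 2
    fsi = selectivity_index(
        [s for s, f in zip(spike_res, force) if f == "F"],
        [s for s, f in zip(spike_res, force) if f == "f"],
    )
    rsi = selectivity_index(
        [s for s, r in zip(spike_res, reward) if r == "R"],
        [s for s, r in zip(spike_res, reward) if r == "r"],
    )
    return f_app_res, spike_res, fsi, rsi


# Made-up trials for one neuron and one period (illustration only).
force = ["f", "f", "F", "F", "f", "f", "F", "F"]
reward = ["r", "r", "r", "r", "R", "R", "R", "R"]
force_applied = [300, 320, 610, 640, 310, 330, 620, 660]  # grams
spike_count = [12, 14, 11, 15, 18, 17, 19, 21]

f_app_res, spike_res, fsi, rsi = residualize_and_index(
    force_applied, spike_count, force, reward
)
print("FSI:", fsi, "RSI:", rsi)
```

Because F.appRes has zero mean within each condition, removing its linear contribution leaves the condition means of the spike count unchanged, which is why the force and reward factors can still be evaluated on SpikecountRes.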

To examine the dynamics of the encoding of the force applied and the force and reward factors, a sliding window analysis was used, with windows of 200 msec shifted in increments of 10 msec. We performed the GLM analyses as described previously during periods covering the entire trial duration: a peristimuli period (from −1000 to 500 msec after the cues' occurrence; 131 bins) and a holding–postreward period beginning 500 msec before the beginning of the holding period, including the holding period of 1 sec and the postreward period of 1 sec (from −500 to 2500 msec after crossing of the low threshold of the required force; 281 bins). We considered the beginning of each window as the reference for each time measured (i.e., if the modulation was observed in the window between 50 and 250 msec, we considered that it occurred at 50 msec). For each factor, a modulation was considered to be significant if the percentage of modulated neurons was higher than the percentage of neurons modulated by chance (computed on the mean of the percentage of neurons modulated by this factor during the baseline period) plus 2 SDs for at least five consecutive steps.
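
The bin counts quoted above follow directly from the window width and step, and the significance criterion amounts to finding runs of at least five consecutive windows above a threshold. The sketch below illustrates both; the function names are hypothetical, and the use of the sample standard deviation for the baseline is an assumption.

```python
from statistics import mean, stdev


def window_starts(t_start_ms, t_end_ms, width_ms=200, step_ms=10):
    """Start times of all windows that fit entirely within the period."""
    return list(range(t_start_ms, t_end_ms - width_ms + 1, step_ms))


def significant_steps(pct_modulated, baseline_pct, min_run=5, n_sd=2):
    """Flag windows belonging to runs of >= min_run steps above chance + n_sd SDs."""
    threshold = mean(baseline_pct) + n_sd * stdev(baseline_pct)
    above = [p > threshold for p in pct_modulated]
    flagged = [False] * len(above)
    run_start = None
    # A trailing False closes any run that reaches the end of the series.
    for i, is_above in enumerate(above + [False]):
        if is_above and run_start is None:
            run_start = i
        elif not is_above and run_start is not None:
            if i - run_start >= min_run:
                for j in range(run_start, i):
                    flagged[j] = True
            run_start = None
    return flagged


peristimulus = window_starts(-1000, 500)
holding_postreward = window_starts(-500, 2500)
print(len(peristimulus), len(holding_postreward))
```

With 200-msec windows shifted by 10 msec, the last window of the peristimulus period starts at 300 msec and the last window of the holding–postreward period starts at 2300 msec, which gives the 131 and 281 bins reported in the text.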

## RESULTS

### Modulation of the Behavioral Responses by the Required Force Level and the Expected Reward Size

Behavioral analyses were performed on 126 sessions (56 from Monkey M and 70 from Monkey Y) during which we recorded GPe neurons.

#### RTs

Average RTs to reach the required force threshold after the occurrence of cues were computed from the correct trials only (4,461 from Monkey M and 5,425 from Monkey Y; Figure 2A and E). RTs were significantly shorter for the large reward trials than for the small reward ones in Monkey M (two-way ANOVA on RT z score, $p_{\text{reward}} < .001$, F(1, 4457) = 48.48). RTs were also significantly shorter during the high-force trials than during the low-force ones in this monkey (two-way ANOVA on RT z score, $p_{\text{force}} < .05$, F(1, 4457) = 6.55). There was no significant difference among the RTs of Monkey Y, although there was a slight decrease for the most favorable condition: low force/large reward. In both monkeys, there was no interaction effect between the required force level and the size of the expected reward on the RTs.

Figure 2.

Behavioral results. (A–D) Behavioral results for Monkey M. (E–H) Behavioral results for Monkey Y. (A, E) RTs of the animals in the four conditions of the task. r = small reward; R = large reward. Solid black line: high force; dashed gray line: low force. The error bars represent the standard errors of the mean. (B, F) ERs of the animals in the four conditions of the task. Same conventions as A and E. (C, G) Acceptance level of the animals in the four conditions of the task (fR: low force/large reward; FR: high force/large reward; fr: low force/small reward; Fr: high force/small reward). (D, H) Mean of the force developed by the animals in the four conditions of the task. Dark gray lines: large reward; light gray lines: small reward; thick lines: high force; thin lines: low force. The stars indicate, for the RTs, the influence of the force and reward on the animal's behavior (two-way ANOVA, *p < .05, ***p < .001); for the ERs, the differences among conditions (Pearson's chi-squared test, ***corrected p < .001); and for the force applied, the influence of the expected/received reward size (Wilcoxon rank sum test on the average force, *p < .05, ***p < .001).

#### ERs

ERs were computed from the total number of trials performed by the animals (5,413 from Monkey M and 7,633 from Monkey Y), including correct and error trials (Figure 2B and F). The ERs were significantly higher in the small reward conditions than in the large reward ones for the same required force (low force: p < .01 [$p = 8.99 \times 10^{-14}$ and $3.81 \times 10^{-28}$], $\chi^2$ = 55.6 and 121.0 for Monkeys M and Y, respectively; high force: p < .01 [$p = 1.08 \times 10^{-13}$ and $2.87 \times 10^{-63}$], $\chi^2$ = 55.2 and 281.9 for Monkeys M and Y, respectively; Figure 2B and F). Moreover, for the same expected reward, the ERs were significantly higher in the high-force conditions than in the low-force ones (small reward: p < .01 [$p = 3.43 \times 10^{-11}$ and $1.44 \times 10^{-23}$], $\chi^2$ = 43.9 and 100.1 for Monkeys M and Y, respectively; large reward: p < .01 [$p = 1.05 \times 10^{-11}$ and $1.97 \times 10^{-5}$], $\chi^2$ = 46.2 and 18.2 for Monkeys M and Y, respectively).

#### Acceptance Level

The level of acceptance allowed us to rank the four conditions in the same preference order for the two animals, from the highest to the lowest acceptance level: low force/large reward (fR), high force/large reward (FR), low force/small reward (fr), and high force/small reward (Fr; Figure 2C and G). For both monkeys, the size of the expected reward seemed to weigh more than the level of effort in their decision of whether to perform the task. In the fR conditions, monkeys decided to perform the action in 99.7% (Monkey M) and 98.5% (Monkey Y) of the trials. In contrast, in the Fr conditions, they only performed the action in 88.2% (Monkey M) and 71.2% (Monkey Y) of the trials. FR trials were accepted more frequently (96.7% for Monkey M and 96% for Monkey Y) than fr trials (93.3% for Monkey M and 90.1% for Monkey Y). These results show that the monkeys understood the task and integrated the cost of each condition (particularly the least favorable one). Indeed, both the effort to be made and, to a larger extent, the size of the expected reward contributed to the subjective value of the action.
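
The ranking and the relative weight of reward and force can be illustrated directly from the acceptance levels quoted above. The sketch below is only a descriptive summary under stated assumptions: the "marginal effect" (difference between the mean acceptance of two pairs of conditions) is an illustrative measure introduced here, not an analysis reported by the authors, and the function names are hypothetical.

```python
# Acceptance levels (% of trials performed) as quoted in the text.
acceptance = {
    "Monkey M": {"fR": 99.7, "FR": 96.7, "fr": 93.3, "Fr": 88.2},
    "Monkey Y": {"fR": 98.5, "FR": 96.0, "fr": 90.1, "Fr": 71.2},
}

def rank_conditions(levels):
    """Return the condition labels sorted from highest to lowest acceptance."""
    return sorted(levels, key=levels.get, reverse=True)

def marginal_effects(levels):
    """Illustrative summary (an assumption of this sketch, not the authors' method):
    mean drop in acceptance when the reward is small rather than large,
    and when the required force is high rather than low."""
    reward = (levels["fR"] + levels["FR"]) / 2 - (levels["fr"] + levels["Fr"]) / 2
    force = (levels["fR"] + levels["fr"]) / 2 - (levels["FR"] + levels["Fr"]) / 2
    return reward, force

for monkey, levels in acceptance.items():
    reward, force = marginal_effects(levels)
    print(monkey, rank_conditions(levels),
          f"reward effect = {reward:.2f} points, force effect = {force:.2f} points")
```

With these numbers, both animals share the order fR, FR, fr, Fr, and the drop associated with a small reward exceeds the drop associated with a high force, in line with the interpretation given in the text.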

As depicted in Figure 2D and H, for the same amount of force required, the average force applied by the animals was different depending on the expected/received reward in some periods. This result led us to consider the force applied as a factor in our analyses of the neuronal activity to isolate a reward effect from any mechanical variation.

### Electrophysiological Results

#### Localization of the Recordings

One hundred twenty-six neurons recorded (56 and 70 from Monkeys M and Y, respectively) were located inside the GPe. In the antero/posterior plane from the AC, neurons were recorded from AC + 2 to AC − 2 and AC + 2 to AC − 3, for Monkeys M and Y, respectively. All recorded neurons were located in the anterior part of the GPe; the most posterior part, described to be essentially sensorimotor (Worbe et al., 2013; Grabli et al., 2004; François et al., 1994; Haber et al., 1993), was not investigated. Considering the medial/lateral and inferior/superior positions of our recordings, we determined that 75 neurons (34 for Monkey M and 41 for Monkey Y) were located in the dorsal “associative part” of the GPe and 51 (22 for Monkey M and 29 for Monkey Y) were in the ventral and rostral “limbic part” of the structure (Figure 1B and C). No correlation could be found between the localization of the neurons and their encoding of either the force or the reward information.

#### Electrophysiological Properties of GPe Neurons

Most of the 126 GPe neurons recorded (41/56 and 51/70 from Monkeys M and Y, respectively; i.e., 92/126 [73%]) were neurons exhibiting high-frequency activity (Figure 3A), with a mean firing rate of 58.08 ± 2.37 spk · s⁻¹ (min = 20.09, max = 132.70) and a CV of their ISIs of 1.06 ± 0.04 (min = 0.49, max = 1.9). Most of these neurons showed pauses in their firing rate and were assumed to be the high-frequency discharge interrupted by pauses neurons previously described (Elias et al., 2007; Arkadir et al., 2004; DeLong, 1971). Twenty-one neurons (7/56 and 14/70 from Monkeys M and Y, respectively; 21/126 [17%]) with a lower frequency of discharge and occasional brief high-frequency bursts were recorded (Figure 3B). The mean firing rate of these neurons was 12.42 ± 1.7 spk · s⁻¹ (min = 1.10, max = 33.12), and the mean CV was 1.50 ± 0.06 (min = 1.1, max = 2.4). The pioneering study of DeLong (1971) classified the GPe neurons into two categories: 85% of neurons exhibiting a high-frequency discharge and 15% exhibiting a low-frequency discharge and bursts. In addition to these two types of neurons, we recorded 13 neurons (8/56 and 5/70 from Monkeys M and Y, respectively; 13/126 [10%]) with regular patterns of discharge (Figure 3C). These cells showed a lower firing frequency than the high-frequency discharge interrupted by pauses neurons, with a regular pattern and a mean firing rate of 31.47 ± 3.61 spk · s⁻¹ (min = 8.6, max = 56.66). Their CVs were very low (below 0.5) when the neurons showed a firing rate > 20 spk · s⁻¹ and slightly higher (between 0.5 and 0.7) for neurons with a firing rate < 20 spk · s⁻¹. Although, on the basis of their properties, they could correspond to the “border cells” described by Mitchell, Richardson, Baker, and DeLong (1987), they were found throughout the GPe and not only at the border of the structure. The properties of these three subpopulations of neurons, based on their firing rate and CV, are summarized in Figure 3D.
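
The two descriptors used in this classification, mean firing rate and CV of the ISIs, can be computed as in the following minimal sketch. The spike trains are hypothetical and serve only to illustrate the two measures; the use of the population standard deviation is an assumption, and no classification thresholds are implied beyond those quoted in the text.

```python
from statistics import mean, pstdev

def firing_rate(spike_times, duration):
    """Mean firing rate (spikes per second) over a recording lasting `duration` seconds."""
    return len(spike_times) / duration

def isi_cv(spike_times):
    """Coefficient of variation of the interspike intervals (SD of ISIs / mean ISI).

    Uses the population SD; whether the original analysis used the sample
    or the population SD is not stated, so this is an assumption.
    """
    isis = [later - earlier for earlier, later in zip(spike_times, spike_times[1:])]
    return pstdev(isis) / mean(isis)

# Hypothetical spike times in seconds, for illustration only.
regular_train = [i * 0.02 for i in range(51)]                 # one spike every 20 msec
bursty_train = [0.0, 0.005, 0.010, 0.5, 0.505, 0.510, 1.0]    # brief bursts, long pauses

print("regular:", firing_rate(regular_train, 1.0), "spk/s, CV =", round(isi_cv(regular_train), 3))
print("bursty: ", firing_rate(bursty_train, 1.0), "spk/s, CV =", round(isi_cv(bursty_train), 3))
```

A perfectly regular train gives a CV close to 0, whereas a train alternating short bursts and long pauses gives a CV above 1, which is the contrast summarized in Figure 3D.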

Figure 3.

Different types of neurons recorded in the GPe. (A–C) Raster plots of three neurons illustrating the three types of GPe neurons. Each line represents a trial, and each point represents a spike; the smoothed lines on the raster represent the spike density. (A) Example of an irregular neuron, showing a high-frequency activity and a high CV. (B) Example of a bursty neuron, showing a low frequency of discharge and occasional brief high-frequency bursts. (C) Example of a regular neuron, showing a low frequency of discharge and a low CV. (D) Scatterplot of the distribution of GPe neurons as a function of their CV (ordinate) and firing rate (abscissa; circle: irregular neurons, triangle: bursty neurons, square: regular neurons). Filled forms represent the average of the ISI CV and the firing rate for each type of neuron.

The number of neurons modulated by the force applied or the task factors (force, reward, and/or interaction) did not differ (corrected p > .05) across these three types of neurons, between the two monkeys, or between the GPe territories (associative vs. limbic) during any trial period (comparison among categories: χ² < 8.09, corrected p > .05; comparison between monkeys: χ² < 0.73, corrected p > .05; comparison between territories: χ² < 2.14, corrected p > .05). Consequently, these three groups of neurons and the data from the two monkeys across territories were pooled and considered as a single population for the subsequent analyses.

#### Modulation of the Neuronal Activity by the Force Applied and Task Factors during the Three Periods of the Task

This visuomotor task was designed to explore the responses of GPe neurons to stimuli carrying effort- and reward-related information. However, given the presence of the motor response, it also allowed us to study their modulations by the motor feature of the task, that is, the force applied. After the occurrence of the cues, the average activity increased in all four conditions and remained above chance throughout the periods of the task (Figure 4A). Around 30% of the neurons (37/126, 29%) showed a modulation in their activity by the force applied during at least one period of the task. No continuity in the modulation by the force applied was seen along the trial. Indeed, most of these neurons (30/37, 81%) were modulated during only one period. The number of neurons modulated by the force applied differed among the task periods (χ² = 10.1, p = .006), with 10% (13/126) of modulated neurons during the cue threshold period, 6% (8/126) during the holding period, and 19% (24/126) during the postreward period. The barplot in Figure 4B summarizes the modulation of the neuronal activity by the force applied and by the force and reward factors during the three periods of the task. During the cue threshold period, 56 of 126 (44%) neurons were modulated by one of the task factors; 71 of 126 (56%), during the holding period; and 62 of 126 (49%), during the postreward period. Only 13 of 126 neurons (10%) showed no modulation of their activity by the force applied and by the task factors. We then compared the proportions of neurons responding to the force applied and to the force and reward factors during each analyzed period. During the cue threshold period, we found that more neurons were modulated by the force factor than by the force applied (χ² = 4.37, p < .05), but there was no difference in the number of neurons modulated by the force factor, the reward factor, or the interaction between both. During the holding period, significantly more neurons were modulated by the force or reward factor than by an interaction or the force applied (χ² > 14.731, p < .05). Finally, during the postreward period, more neurons were modulated by the amount of reward than by any other variable (χ² > 14.662, p < .05).
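
The comparison of the number of force-applied neurons across the three periods (13, 8, and 24 of 126) can be reproduced with a Pearson chi-squared test on a 3 × 2 table (modulated vs. not modulated). The sketch below assumes that this is the table underlying the reported statistic, which the text does not spell out; the function names are hypothetical.

```python
import math

def pearson_chi2(modulated, n_per_group):
    """Pearson chi-squared for k groups x 2 outcomes (modulated / not modulated).

    `modulated[i]` is the number of modulated neurons in group i, each group
    containing `n_per_group` neurons. Equal group sizes are assumed, so the
    expected count is the same in every group.
    """
    k = len(modulated)
    expected_mod = sum(modulated) / k              # expected modulated per group
    expected_not = n_per_group - expected_mod      # expected non-modulated per group
    chi2 = 0.0
    for observed in modulated:
        chi2 += (observed - expected_mod) ** 2 / expected_mod
        chi2 += ((n_per_group - observed) - expected_not) ** 2 / expected_not
    return chi2, k - 1                             # statistic, degrees of freedom

def p_value_df2(chi2):
    """Upper-tail p value of a chi-squared statistic with exactly 2 degrees of freedom."""
    return math.exp(-chi2 / 2)

# Cue threshold, holding, and postreward periods, as quoted in the text.
chi2, df = pearson_chi2([13, 8, 24], 126)
print(f"chi2 = {chi2:.1f}, df = {df}, p = {p_value_df2(chi2):.3f}")
```

Under this assumption, the computation yields χ² = 10.1 with 2 degrees of freedom and p = .006, matching the values given in the text.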

Figure 4.

Average neuronal activity and percentage of responsive neurons during the different periods of the task. (A) Average spike density (σ = 50) of the whole population of recorded GPe neurons (n = 126) at the occurrence of the visual stimuli (left) and during the holding and postreward periods (right). The four lines represent the four conditions of the task. (B) Percentage of neurons responding to the force applied, the force factor, the reward factor, and the interaction between both during the three periods of the task (Pearson's chi-squared test; *corrected p < .05, ***corrected p < .001).

To further study the dynamics of these modulations throughout the task, we examined, for each neuron, which effect in one period followed or was followed by an effect in another period. As previously mentioned, the neurons modulated by the force applied during one period were usually not involved in the encoding of the force applied during other periods. Moreover, most of these neurons were comodulated by the task factors during the same period or during other periods of the task. Indeed, only 10 neurons were modulated by the force applied without any effect of the force or the reward factors along the task, emphasizing the lack of selectivity of these neurons for the motor aspect of the action. Furthermore, the neurons showing a force effect during the cue threshold period were mostly not the same as those showing a force effect during the holding period: Only 9 of 30 (30%) of these neurons shared this modulation during both periods. On the other hand, more than two thirds (28/41, 68%) of the neurons showing a reward effect during the holding period still encoded only this factor during the postreward period. Consequently, the population of neurons encoding only the reward size during the postreward period was composed of neurons already involved in this process during the holding period, plus neurons recruited to encode this factor when the reward was received.

#### Weight and Direction of Activity Modulation by Task Factors

We estimated an FSI and an RSI for each neuron in each period (see Methods) to quantify the modulation by the task factors force and reward, respectively. An index above zero indicates a stronger activity in the high-force/reward conditions. Conversely, an index below zero indicates a stronger activity in the low-force/reward conditions. During the three defined periods, a comparable number of neurons were FSI-positive and FSI-negative (cue threshold: 12 vs. 18; holding: 21 vs. 19; postreward: 7 vs. 6; binomial test, p > .05). Similarly, a comparable number of neurons were RSI-positive and RSI-negative (cue threshold: 12 vs. 14; holding: 17 vs. 24; postreward: 25 vs. 28; binomial test, p > .05). Figure 5A–C (left) illustrates the distribution of FSI and RSI positive and negative neurons. The distributions of the FSI and RSI of the modulated neurons were centered around 0 (Wilcoxon test, p > .05). Consequently, the examples to the right of the figure illustrate the type of modulation that can be found in the activity of GPe neurons and not the activity of the entire population (Figure 5A–C, right).
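
The binomial comparisons of positive versus negative indices can be illustrated with an exact two-sided test against an equal split. This is a minimal sketch assuming a null probability of 0.5 and the doubled-tail definition of the two-sided p value; the exact variant used by the authors is not specified, and the function name is hypothetical.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Exact two-sided binomial p value for k successes in n trials under p = 0.5.

    Because the null distribution is symmetric, the two-sided p value is
    twice the smaller tail probability, capped at 1.
    """
    tail = min(k, n - k)
    lower = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * lower)

# (positive, negative) counts as quoted in the text.
counts = {
    "FSI cue threshold": (12, 18),
    "FSI holding": (21, 19),
    "FSI postreward": (7, 6),
    "RSI cue threshold": (12, 14),
    "RSI holding": (17, 24),
    "RSI postreward": (25, 28),
}

for label, (pos, neg) in counts.items():
    p = binomial_two_sided_p(pos, pos + neg)
    print(f"{label}: {pos} vs. {neg}, p = {p:.3f}")
```

For all six comparisons, the resulting p values are above .05, consistent with the statement that positive and negative modulations were equally represented.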

Figure 5.

FSI and RSI of GPe neurons during the three periods of the task. (A–C, left) Scatter plots of FSI versus RSI for each neuron in each period (n = 126). FSIs > 0 indicate higher modulation in the high-force conditions, and indices < 0 indicate higher modulation in the low-force conditions. RSIs > 0 indicate higher modulation in the large reward conditions, and indices < 0 indicate higher modulation in the small reward conditions. Symbol style indicates the significance of the modulation for each neuron when we performed the GLM (square for force factor, filled circle for reward factor, cross for interaction between both, and little filled circle for none of these three significant modulations). Symbols in gray and arrows indicate the neurons illustrated on the right part of the figure for each period. (A–C, right) Same representation as Figure 3A–C. Trials are ranked according to the four conditions of the task (fR: low force/large reward; FR: high force/large reward; fr: low force/small reward; Fr: high force/small reward). (A, right) Neuron showing an interaction effect during the cue threshold period. (B, right) Neuron showing a positive force effect during the holding period. (C, right) Neuron showing a positive reward effect during the postreward period. Spk = spike.

#### Dynamics of the Force Applied and Task Factors Effects throughout the Task

To precisely define the temporal profile of neuronal activity modulation by the force applied and the task factors throughout the whole trial, we performed a sliding window analysis (see Methods; Figure 6). After the presentation of the cues, the reward effect emerged first (60 msec; Figure 6C, left), before the force applied effect (190 msec; Figure 6A, left) and the force effect (240 msec; Figure 6B, left).

Figure 6.

Dynamics of modulation by the force applied and the task factors throughout the task. From left to right are the percentage of neurons in each 200-msec sliding window for a peristimuli period (from −1000 to 500 msec after the visual stimuli occurrence; 131 bins), a perionset of the change in force period (from −500 to 500 msec after the onset of the change in force; 81 bins), a holding–postreward period beginning 500 msec before the holding period and including the holding period of 1 sec, and the postreward period of 1 sec and 500 msec after it (from −500 to 2500 msec after the crossing of the low threshold of the required force; 281 bins). Each bar represents the percentage of significant neurons in a 200-msec window. From top to bottom are the results for the modulation of activity by the force applied, the level of required force, the size of the expected reward, and the interaction between these two last factors. The horizontal dotted lines represent the level of significance, which corresponded, in each case, to the mean percentage of significant neurons during the baseline period (81 bins) plus 2 SDs computed during the same period.

At the beginning of the holding period, 49 neurons (38.9%) showed a force effect (Figure 6B, right). Gradually, as the reward approached, GPe neurons were less driven by the amount of required force: Only 13 neurons (10.3%) still showed a force effect at the end of the holding period. During this period, the modulation by the amount of expected reward also decreased. However, compared with the required force influence, it was quite steady, with 17.5–28.6% of neurons showing a reward effect (Figure 6C, right). Consequently, during the first 500 msec of the holding period, the activity of GPe neurons tended to be more driven by the amount of required force, and during the last 500 msec, neurons were driven equally by both factors. Interestingly, during the holding period, supposedly characterized by motor activity, GPe neurons were scarcely modulated by the force applied, with only 2.4–11.9% of the neurons showing this effect across this period (Figure 6A, right).

As expected, during the postreward period, the number of neurons showing a force effect remained low (between 4.0% and 15.9%; Figure 6B, right), whereas the number of neurons showing a reward effect increased significantly in comparison with the holding period (average number plus 2 SDs), starting 240 msec after the reward occurrence and continuing to increase until 670 msec after the reward delivery, when 46% of the neurons were modulated by the amount of reward (Figure 6C, right). There was also an important increase in the number of neurons modulated by the force applied in this period, starting 440 msec after the reward and decreasing slightly toward the end of the period (Figure 6A, right). Finally, in comparison with the strong force and reward effects observed throughout the task, the proportion of neurons modulated by an interaction of these two factors remained low, except in the cue threshold period. Conversely, during the holding and postreward periods, force and reward information appeared to be mostly encoded in an independent manner.
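
The sliding-window procedure summarized in the Figure 6 legend (200-msec windows, a significance level set at the baseline mean plus 2 SDs) can be sketched as follows. The 10-msec step is an assumption of this sketch: it is not stated in this section but is the step that reproduces the 131, 81, and 281 bins given in the legend. The function names, the use of the population SD, and the toy percentages are likewise hypothetical.

```python
from statistics import mean, pstdev

def window_starts(t_start, t_end, width=200, step=10):
    """Start times (msec) of every window of `width` msec fitting in [t_start, t_end].

    The 10-msec step is an assumption inferred from the bin counts in the legend.
    """
    return list(range(t_start, t_end - width + 1, step))

def significance_level(baseline_pct):
    """Baseline mean plus 2 SDs of the percentage of significant neurons.

    Population SD is used here; the original choice is not stated.
    """
    return mean(baseline_pct) + 2 * pstdev(baseline_pct)

def first_crossing(pct, starts, level):
    """Start time of the first window whose percentage exceeds `level`, or None."""
    for t, p in zip(starts, pct):
        if p > level:
            return t
    return None

# Bin counts for the three analysis epochs described in the legend.
print(len(window_starts(-1000, 500)))   # peristimulus epoch
print(len(window_starts(-500, 500)))    # epoch around the onset of the change in force
print(len(window_starts(-500, 2500)))   # holding and postreward epoch

# Toy example with made-up percentages, for illustration only.
level = significance_level([2, 4, 4, 4, 5, 5, 7, 9])
print(level, first_crossing([5, 8, 10, 12], [0, 10, 20, 30], level))
```

With a 10-msec step, the three epochs contain 131, 81, and 281 windows, as in the legend. The latency of an effect would then correspond to the first window in which the percentage of significant neurons exceeds the baseline-derived level.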

## DISCUSSION

We designed our task to modulate the motivation of the animals and study how this motivation can be influenced by motor effort and reward processes. Our behavioral data confirmed that there was modulation of the motivational level of the animals, in terms of whether they chose to perform the given action leading to a specific reward at a certain cost. Our results revealed three main properties of associative and limbic GPe neurons. First, at the single neuron level as well as at the population level, GPe neurons can be modulated by motor, cognitive, and/or limbic information at the same time during a single period or during different periods throughout the trial course, mostly in an independent way. Second, at the population level, GPe neurons encoded the force applied and both force and reward level information dynamically along the task, at the time at which each piece of information was the most relevant for the correct execution of the action. Finally, and following the second point, after the reward delivery, GPe neurons' activity was strongly modulated by the reward size.

### Convergence of Motor and Cognitive Information in the GPe

The task used in our study required motor action and involved cognitive and limbic processes during the various periods within a trial. For example, cognitive and limbic processes are required after the occurrence of the cues for their evaluation, and motor processes are required during the action initiation and execution. This study shows that few GPe neurons were dedicated only to the encoding of the force applied, the motor parameter of our task. They usually encoded both the force applied and the task factors or were modulated only by one or both task factors, the force required and the expected reward. Indeed, a single GPe neuron could be modulated by motor, cognitive, and/or limbic information during a single period or other periods of the trial. These results are consistent with the theory of convergence of information at the GPe level. Previous electrophysiological studies have suggested that the GPe is a site of convergence of information processed separately in different cortical areas and striatal territories (Arkadir et al., 2004). This hypothesis is supported by anatomical data such as the convergence of inputs from the striatum and the STN onto single pallidal neurons (Parent & Hazrati, 1995), the reduction of the number of neurons between the striatum and the GPe (Oorschot, 1996), and the organization of the dendritic field of GPe neurons (Kita & Kita, 2001). Information regarding the movement, as well as force and reward levels converging at the GPe level, could be conveyed in several cases by a single pallidal neuron but also at the population level, through neurons modulated by the force applied, and force and/or reward information. Interactions between factors were mainly observed during the cue threshold period, when the information should be computed to set the correct action to execute. 
Interactions were only sparsely found later in the trials, suggesting that the GPe is more likely to play a role in the linear summation of these different types of information from the input structures of the BG without a real integration as such. However, GPe neurons could have an integrative role by making convergent synaptic contacts in their target structures, namely, STN, striatum, substantia nigra pars compacta, and output nuclei (Bolam et al., 2000).

### GPe Neurons Encoded the Force Applied and Both Force and Reward Levels Information Dynamically along the Task

In this study, monkeys performed a deterministic task in which cues regarding the level of effort required and the size of the upcoming reward were given to the monkey simultaneously, and these same cues served as a go signal to initiate the production of the required force. Modulation of GPe neurons by reward size was observed throughout the trial until reward delivery, which is the ultimate goal of the action, as an uninterrupted rewarding message sent during the execution of the action. During the execution period, most neurons encoded the force level information; but very few, the actual force applied. This highlights the ability of GPe neurons to integrate the representation of the force information in their activity and not only its execution. This is in line with the role of the dorsal GPe in representing behavioral goals to be referred to at the time of motor target decision (Saga, Hashimoto, Tremblay, Tanji, & Hoshi, 2013) and confirms the role of the GPe in nonmotor processes (Kim et al., 2017; Schechtman et al., 2016). On the contrary, very few neurons were sensitive to the reward probability during the movement execution in a probabilistic task (Arkadir et al., 2004), suggesting that the GPe might not be a part of the BG circuitry involved in the reward prediction error processing or reward uncertainty. The continuous influence of reward information on neuronal activity supports previous findings showing that it is rather involved in maintaining an already well-established goal-directed behavior with a certain and stable reward value (Kim et al., 2017). The influence of the expected reward on neuronal activity related to goal-directed actions has been demonstrated in the striatum (Hassani, Cromwell, & Schultz, 2001; Hollerman, Tremblay, & Schultz, 1998). The striatum could play a major role in the establishment of goal-oriented behaviors, whereas the GPe could be involved in maintaining stable behavior. 
The independent processing of information found during the action execution, contrasting with the less independent processing at the time of action planning, might be a way to avoid interference with new information. The GPe could operate as a receiver of several types of information from the input structures and could process each in an independent manner, allowing new information to update the behavior while maintaining already established ones. The encoding of the expected reward during the execution of the action could inform the animal during its effort about the future reward that will be delivered at the end of its action, possibly allowing it to maintain a certain level of motivation to overcome the effort required. These features could lead to the hypothesis of a role of the associative and limbic GPe in monitoring important information when it is essential for the action execution. After reward delivery, when the cognitive load decreases, the neurons could shift to the encoding of movement, as shown by the larger number of neurons modulated by the force applied during this period.

### GPe Neurons Are Strongly Modulated by Reward Size, but only Sparsely Integrate Force and Reward Information

The increase in neuronal activity at the population level after cues occur is consistent with previous findings (Deffains et al., 2016; Noblejas et al., 2015). Noblejas and colleagues have shown that the average GPe activity increases in response to relevant behavioral events. In our task, reacting to the cues is crucial to succeeding in receiving the expected reward. Thus, the increase in the average neuronal activity at this time would allow a disinhibition of the output structures via the indirect pathway, to facilitate the future action. Interestingly, it is also around the time of action planning that we observed the highest number of neurons whose activity was modulated by the interaction between the force and reward factors. These features support the idea that the GPe could be an important player in motivational processes occurring at the BG level.

The large number of reward-sensitive neurons at the reward delivery suggests a role for the GPe in the encoding of the behavioral outcome. The message sent by these neurons to their targets (the output structures, the GPi, the SNr, and/or the STN) has to be powerful to signal the achievement of the action, thus allowing the information regarding the gain resulting from the performed action to be maintained in the system. This message was found to be driven equally by two groups of neurons, providing feedback about the benefit (small or large reward) obtained whatever the circumstances. Neurons encoding different aspects of rewarding and punishing outcome have been described in several subcortical areas in monkeys, including the striatum (Cromwell & Schultz, 2003; Ravel et al., 2003) and the STN (Espinosa-Parrilla, Baunez, & Apicella, 2015; Darbaky et al., 2005). In our study, the encoding of the reward appeared more as a feedback of the action performed (a cost–benefit balance feedback) than as a pure inhibition of the movement, unlike what has been classically seen in the indirect pathway (Mink, 1996; Albin, Young, & Penney, 1989). The long latency to reach the maximal mobilization of GPe neurons in this process suggests that these neurons might encode the consequences of the action once the reward has been consumed rather than the amount of reward.

The present findings suggest that the GPe could have a large influence on motivated behaviors, before sending information about effort and reward to the GPi and the SNr. Some clinical studies highlight the involvement of the pallidum in disorders of diminished motivation (Bhatia & Marsden, 1994), athymhormia or autoactivation deficit (Habib & Poncet, 1988; Laplane et al., 1981), depression (Bielau et al., 2005), or Gilles de la Tourette syndrome (Piedimonte et al., 2013) and bring out the importance of GPe neurons in balancing the activation and inhibition of cortical areas. The present results showed that the GPe could be involved in updating the consequences of an action, based on motivational information. In patients experiencing the disorders mentioned above, this permanent revision of the consequences of the action could be prevented and therefore lead to inappropriate behaviors. To better understand the neural bases of motivational processes, it will be very interesting to study, in the same experimental paradigm, how this information is encoded within the input structures of the BG, the striatum and the STN, which are also the two major sources of afferents to the GPe.

## Acknowledgments

This work was supported by Centre National de la Recherche Scientifique, the Aix-Marseille Université, and the Fondation de France (Parkinson Disease Program grant 2008 005902 to S. R.). We thank Drs. Christelle Baunez, Guillaume Masson, Mathias Pessiglione, Janine M. Simmons, Romain Trachel, and Dakota Smith for helpful comments and discussions.

Reprint requests should be sent to Sabrina Ravel, Institut de Neurosciences de la Timone, UMR7289 Centre National de la Recherche Scientifique and Aix-Marseille Université, 13005 Marseille, France, or via e-mail: sabrina.ravel@univ-amu.fr.

## REFERENCES

Albin
,
R. L.
,
Young
,
A. B.
, &
Penney
,
J. B.
(
1989
).
The functional anatomy of basal ganglia disorders
.
Trends in Neurosciences
,
12
,
366
375
.
Apicella
,
P.
(
2002
).
Tonically active neurons in the primate striatum and their role in the processing of information about motivationally relevant events
.
European Journal of Neuroscience
,
16
,
2017
2026
.
Apicella
,
P.
,
Legallet
,
E.
, &
Trouche
,
E.
(
1997
).
Responses of tonically discharging neurons in the monkey striatum to primary rewards delivered during different behavioral states
.
Experimental Brain Research
,
116
,
456
466
.
Apicella, P., Ljungberg, T., Scarnati, E., & Schultz, W. (1991). Responses to reward in monkey dorsal and ventral striatum. Experimental Brain Research, 85, 491–500.

Arkadir, D., Morris, G., Vaadia, E., & Bergman, H. (2004). Independent coding of movement direction and reward prediction by single pallidal neurons. Journal of Neuroscience, 24, 10047–10056.

Baunez, C., Dias, C., Cador, M., & Amalric, M. (2005). The subthalamic nucleus exerts opposite control on cocaine and “natural” rewards. Nature Neuroscience, 8, 484–489.

Berridge, K. C., & Cromwell, H. C. (1990). Motivational–sensorimotor interaction controls aphagia and exaggerated treading after striatopallidal lesions. Behavioral Neuroscience, 104, 778–795.

Bhatia, K. P., & Marsden, C. D. (1994). The behavioural and motor consequences of focal lesions of the basal ganglia in man. Brain, 117, 859–876.

Bielau, H., Trübner, K., Krell, D., Agelink, M. W., Bernstein, H. G., Stauch, R., et al. (2005). Volume deficits of subcortical nuclei in mood disorders: A postmortem study. European Archives of Psychiatry and Clinical Neuroscience, 255, 401–412.

Bolam, J. P., Hanley, J. J., Booth, P. A. C., & Bevan, M. D. (2000). Synaptic organisation of the basal ganglia. Journal of Anatomy, 196, 527–542.

Cromwell, H. C., & Schultz, W. (2003). Effects of expectations for different reward magnitudes on neuronal activity in primate striatum. Journal of Neurophysiology, 89, 2823–2838.

Darbaky, Y., Baunez, C., Arecchi, P., Legallet, E., & Apicella, P. (2005). Reward-related neuronal activity in the subthalamic nucleus of the monkey. NeuroReport, 16, 1241–1244.

Deffains, M., Iskhakova, L., Katabi, S., Haber, S. N., Israel, Z., & Bergman, H. (2016). Subthalamic, not striatal, activity correlates with basal ganglia downstream activity in normal and parkinsonian monkeys. eLife, 5, e16443.

DeLong, M. R. (1971). Activity of pallidal neurons during movement. Journal of Neurophysiology, 34, 414–427.

Elias, S., Joshua, M., Goldberg, J. A., Heimer, G., Arkadir, D., Morris, G., et al. (2007). Statistical properties of pauses of the high-frequency discharge neurons in the external segment of the globus pallidus. Journal of Neuroscience, 27, 2525–2538.

Espinosa-Parrilla, J. F., Baunez, C., & Apicella, P. (2015). Modulation of neuronal activity by reward identity in the monkey subthalamic nucleus. European Journal of Neuroscience, 42, 1705–1717.

François, C., Yelnik, J., Percheron, G., & Fénelon, G. (1994). Topographic distribution of the axonal endings from the sensorimotor and associative striatum in the macaque pallidum and substantia nigra. Experimental Brain Research, 102, 305–318.

Grabli, D., McCairn, K., Hirsch, E. C., Agid, Y., Féger, J., François, C., et al. (2004). Behavioural disorders induced by external globus pallidus dysfunction in primates: I. Behavioural study. Brain, 127, 2039–2054.

Haber, S. N., Lynd-Balta, E., & Mitchell, S. J. (1993). The organization of the descending ventral pallidal projections in the monkey. Journal of Comparative Neurology, 329, 111–128.

Habib, M., & Poncet, M. (1988). Loss of vitality, of interest and of the affect (athymhormia syndrome) in lacunar lesions of the corpus striatum. Revue Neurologique, 144, 571–577.

Hassani, O. K., Cromwell, H. C., & Schultz, W. (2001). Influence of expectation of different rewards on behavior-related neuronal activity in the striatum. Journal of Neurophysiology, 85, 2477–2489.

Hikosaka, O., Sakamoto, M., & Usui, S. (1989). Functional properties of monkey caudate neurons: III. Activities related to expectation of target and reward. Journal of Neurophysiology, 61, 814–832.

Hollerman, J. R., & Schultz, W. (1998). Dopamine neurons report an error in the temporal prediction of reward during learning. Nature Neuroscience, 1, 304–309.

Hollerman, J. R., Tremblay, L., & Schultz, W. (1998). Influence of reward expectation on behavior-related neuronal activity in primate striatum. Journal of Neurophysiology, 80, 947–963.

Joshua, M., Adler, A., Rosin, B., Vaadia, E., & Bergman, H. (2009). Encoding of probabilistic rewarding and aversive events by pallidal and nigral neurons. Journal of Neurophysiology, 101, 758–772.

Kim, H. F., Amita, H., & Hikosaka, O. (2017). Indirect pathway of caudal basal ganglia for rejection of valueless visual objects. Neuron, 94, 920–930.

Kita, H., & Kita, T. (2001). Number, origins, and chemical types of rat pallidostriatal projection neurons. Journal of Comparative Neurology, 437, 438–448.

Laplane, D., Widlocher, D., & Pillon, B. (1981). Compulsive behaviour of the obsessional type with bilateral circumscribed pallidostriatal necrosis (encephalopathy following a wasp sting). Revue Neurologique, 137, 269–276.

Lau, B., & Glimcher, P. W. (2008). Value representations in the primate striatum during matching behavior. Neuron, 58, 451–463.

Lindquist, M. A., & Mejia, A. (2015). Zen and the art of multiple comparisons. Psychosomatic Medicine, 77, 114–125.

Mallet, N., Schmidt, R., Leventhal, D., Chen, F., Amer, N., Boraud, T., et al. (2016). Arkypallidal cells send a stop signal to striatum. Neuron, 89, 308–316.

Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods, 164, 177–190.

Merrill, E. G., & Ainsworth, A. (1972). Glass-coated platinum-plated tungsten microelectrodes. Medical & Biological Engineering, 10, 662–672.

Mink, J. W. (1996). The basal ganglia: Focused selection and inhibition of competing motor programs. Progress in Neurobiology, 50, 381–425.

Mink, J. W., & Thach, W. T. (1991a). Basal ganglia motor control: I. Nonexclusive relation of pallidal discharge to five movement modes. Journal of Neurophysiology, 65, 273–300.

Mink, J. W., & Thach, W. T. (1991b). Basal ganglia motor control: II. Late pallidal timing relative to movement onset and inconsistent pallidal coding of movement parameters. Journal of Neurophysiology, 65, 301–329.

Mitchell, S. J., Richardson, R. T., Baker, F. H., & DeLong, M. R. (1987). The primate nucleus basalis of Meynert: Neuronal activity related to a visuomotor tracking task. Experimental Brain Research, 68, 506–515.

Morris, G., Arkadir, D., Nevet, A., Vaadia, E., & Bergman, H. (2004). Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons. Neuron, 43, 133–143.

Noblejas, M. I., Schechtman, E., Adler, A., Joshua, M., Katabi, S., & Bergman, H. (2015). Hold your pauses: External globus pallidus neurons respond to behavioural events by decreasing pause activity. European Journal of Neuroscience, 42, 2415–2425.

Nougaret, S., & Ravel, S. (2015). Modulation of tonically active neurons of the monkey striatum by events carrying different force and reward information. Journal of Neuroscience, 35, 15214–15226.

Oorschot, D. E. (1996). Total number of neurons in the neostriatal, pallidal, subthalamic, and substantia nigral nuclei of the rat basal ganglia: A stereological study using the Cavalieri and optical disector methods. Journal of Comparative Neurology, 366, 580–599.

Parent, A., & Hazrati, L. N. (1995). Functional anatomy of the basal ganglia: II. The place of subthalamic nucleus and external pallidum in basal ganglia circuitry. Brain Research Reviews, 20, 128–154.

Pasquereau, B., Nadjar, A., Arkadir, D., Bezard, E., Goillandeau, M., Bioulac, B., et al. (2007). Shaping of motor responses by incentive values through the basal ganglia. Journal of Neuroscience, 27, 1176–1183.

Peck, C. J., Lau, B., & Salzman, C. D. (2013). The primate amygdala combines information about space and value. Nature Neuroscience, 16, 340–348.

Pessiglione, M., Schmidt, L., Draganski, B., Kalisch, R., Lau, H., Dolan, R. J., et al. (2007). How the brain translates money into force: A neuroimaging study of subliminal motivation. Science, 316, 904–906.

Piedimonte, F., Andreani, J. C. M., Piedimonte, L., Graff, P., Bacaro, V., Micheli, F., et al. (2013). Behavioral and motor improvement after deep brain stimulation of the globus pallidus externus in a case of Tourette's syndrome. Neuromodulation, 16, 55–58.

R Development Core Team. (2011). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.

Ravel, S., Legallet, E., & Apicella, P. (2003). Responses of tonically active neurons in the monkey striatum discriminate between motivationally opposing stimuli. Journal of Neuroscience, 23, 8489–8497.

Saga, Y., Hashimoto, M., Tremblay, L., Tanji, J., & Hoshi, E. (2013). Representation of spatial- and object-specific behavioral goals in the dorsal globus pallidus of monkeys during reaching movement. Journal of Neuroscience, 33, 16360–16371.

Saleem, K., & Logothetis, N. (2007). A combined MRI and histology atlas of the rhesus monkey brain in stereotaxic coordinates. San Diego, CA: Academic Press.

Samejima, K., Ueda, Y., Doya, K., & Kimura, M. (2005). Representation of action-specific reward values in the striatum. Science, 310, 1337–1340.

Satoh, T., Nakai, S., Sato, T., & Kimura, M. (2003). Correlated coding of motivation and outcome of decision by dopamine neurons. Journal of Neuroscience, 23, 9913–9923.

Schechtman, E., Noblejas, M. I., Mizrahi, A. D., Dauber, O., & Bergman, H. (2016). Pallidal spiking activity reflects learning dynamics and predicts performance. Proceedings of the National Academy of Sciences, U.S.A., 113, E6281–E6289.

Schmidt, L., D'Arc, B. F., Lafargue, G., Galanaud, D., Czernecki, V., Grabli, D., et al. (2008). Disconnecting force from money: Effects of basal ganglia damage on incentive motivation. Brain, 131, 1303–1310.

Shin, S., & Sommer, M. A. (2010). Activity of neurons in monkey globus pallidus during oculomotor behavior compared with that in substantia nigra pars reticulata. Journal of Neurophysiology, 103, 1874–1887.

Turner, R. S., & Anderson, M. E. (1997). Pallidal discharge related to the kinematics of reaching movements in two dimensions. Journal of Neurophysiology, 77, 1051–1074.

Vaillancourt, D. E., Yu, H., Mayka, M. A., & Corcos, D. M. (2007). Role of the basal ganglia and frontal cortex in selecting and producing internally guided force pulses. NeuroImage, 36, 793–803.

Worbe, Y., Sgambato-Faure, V., Epinat, J., Chaigneau, M., Tandé, D., François, C., et al. (2013). Towards a primate model of Gilles de la Tourette syndrome: Anatomo-behavioural correlation of disorders induced by striatal dysfunction. Cortex, 49, 1126–1140.