Humans and animals must evaluate the costs and expected benefits of their actions to make adaptive choices. Prior studies have demonstrated the involvement of the basal ganglia in this evaluation. However, little is known about the role of the external part of the globus pallidus (GPe), which is well positioned to integrate motor and reward-related information, in this process. To investigate this role, the activity of 126 neurons was recorded in the associative and limbic parts of the GPe of two monkeys performing a behavioral task in which different levels of force were required to obtain different amounts of liquid reward. The results first revealed that the activity of associative and limbic GPe neurons could be modulated not only by cognitive and limbic information but also by motor information, either within a single period or across different periods of the trial, and mainly in an independent way. Moreover, as a population, GPe neurons encoded these types of information dynamically throughout the trial, when each piece of information was the most relevant for the achievement of the action. Taken together, these results suggest that GPe neurons could be dedicated to the parallel monitoring of task parameters essential to adjusting and maintaining goal-directed behavior.
The motivational value of any action takes into account both the cost and the benefit gained once the action is completed. Costs can include the effort required to perform a movement, the number of actions required, or the temporal delay before reward delivery. The benefits can depend on reward preference and/or magnitude. There is a large body of evidence that the basal ganglia (BG) are involved in the encoding of reward-related information and, more generally, motivational processes. Neuropsychological (Bhatia & Marsden, 1994; Laplane, Widlocher, & Pillon, 1981), human imaging (Schmidt et al., 2008; Pessiglione et al., 2007), lesion (Baunez, Dias, Cador, & Amalric, 2005; Berridge & Cromwell, 1990), and pharmacological (Grabli et al., 2004) studies have shown the involvement of several BG structures in these processes. In monkeys, the reward-related activities observed in BG input structures, such as the striatum and the subthalamic nucleus (STN; Nougaret & Ravel, 2015; Lau & Glimcher, 2008; Darbaky, Baunez, Arecchi, Legallet, & Apicella, 2005; Samejima, Ueda, Doya, & Kimura, 2005; Ravel, Legallet, & Apicella, 2003; Apicella, Ljungberg, Scarnati, & Schultz, 1991; Hikosaka, Sakamoto, & Usui, 1989), as well as in output structures, such as the internal part of the globus pallidus (GPi) and the substantia nigra pars reticulata (SNr; Joshua, Adler, Rosin, Vaadia, & Bergman, 2009; Pasquereau et al., 2007), are thought to be partly supported by dopaminergic neurons' activity (Morris, Arkadir, Nevet, Vaadia, & Bergman, 2004; Satoh, Nakai, Sato, & Kimura, 2003; Hollerman & Schultz, 1998).
Few studies have addressed the role of the external part of the globus pallidus (GPe) in motivational processes. Indeed, the GPe has been considered primarily as a motor relay station in the indirect cortico-GPe-STN-thalamo-cortical pathway (Vaillancourt, Yu, Mayka, & Corcos, 2007; Turner & Anderson, 1997; Mink & Thach, 1991a, 1991b; DeLong, 1971). However, the view of the GPe within the BG has changed in the past decades. Shin and Sommer (2010) have described the role of GPe neurons in oculomotor behavior, showing neuronal activities related to visual stimuli triggering saccades and reward occurrence. Arkadir, Morris, Vaadia, and Bergman (2004) have shown that GPe activity is driven not only by cues predicting future reward probability but also by cues predictive of aversive outcomes (Joshua et al., 2009). It has recently been shown that GPe neurons are important for cognitive functions such as learning (Schechtman, Noblejas, Mizrahi, Dauber, & Bergman, 2016) and for encoding the stable reward value of an object (Kim, Amita, & Hikosaka, 2017). Recent studies have highlighted the role of the GPe as a central player in the orchestration of neuronal activity within the BG network (Deffains et al., 2016; Mallet et al., 2016; Bolam, Hanley, Booth, & Bevan, 2000). It provides a massive GABAergic input to the striatum, STN, and BG output structures and can modulate cortical afferents. Those results raised questions about the role of the GPe, which is mainly thought to be involved in BG circuitry as a motor relay, and shed light on the need to reevaluate the GPe's function in linking actions to their rewarding values.
Here, using an operant task, we investigated whether and how two factors influencing motivated behavior, namely reward size and the effort required to obtain it, are represented in the GPe. GPe neurons were recorded in monkeys performing a task in which both the size of the reward and the level of force required to obtain it were varied. In parallel, given the design of the task, the influence of the applied force itself on neuronal activity was studied. The aim of the study was to determine whether GPe neurons could integrate force and reward information as a motivational index. Although this question has been a topic of interest in the BG, it has been overlooked in studies of the GPe. However, given its position in the circuitry, we could expect neuronal activities in this structure to reflect an integration of motivational information, such as neuronal activities showing the same trend as the acceptance level of the monkeys or postreward activities proportionally modulated by the effort produced to obtain the reward. No population of neurons specifically encoded the cost–benefit ratio of the action or the attractiveness of the cues, as found in the striatum (Nougaret & Ravel, 2015). Instead, we show that neurons of the associative and limbic GPe display activities closely related to task parameters (i.e., motor execution and/or significance of the cues) at each step of the behavior. GPe neurons seemed more likely to process the force and reward information carried by the stimuli in an integrative way right after the stimuli's occurrence, whereas they encoded the information independently later in the trial, so as to maintain the integrity of each piece of information. These neurons are also highly sensitive to reward size, mostly independent of the effort required to obtain the reward.
Thus, the GPe is not directly involved in the encoding of motivational processes but plays a role in processing motor or cognitive information essential to performing appropriate goal-directed behavior and could send its valuation to the output structures once the action is performed and the reward is obtained.
Animal and Apparatus
Two male rhesus monkeys (Macaca mulatta), weighing 8 and 7 kg at the beginning of the experiments (Monkeys M and Y, respectively), were trained to apply and maintain a force on a lever in response to visual cues to receive a liquid reward. All experimental procedures were in compliance with the National Institutes of Health's Guide for the Care and Use of Laboratory Animals, the French laws on animal experimentation, and the European Directive on the protection of animals used for scientific purposes.
The monkeys were seated in a Plexiglas primate chair and faced a panel supporting a 17-in. screen on which visual cues could be presented. The screen was positioned 18 cm from the monkey; a lever equipped with strain gauges in the lower part of the panel was positioned at waist level. A sliding door at the front of the primate chair could be opened to allow the animal to position its hand on the lever. The liquid reward (water) was delivered via a metal spout positioned directly in front of the monkey's mouth. The liquid was delivered through a solenoid valve located outside the recording room.
As illustrated in Figure 1A, at the beginning of each trial, the monkeys had to develop a basal pressing force on the lever, between 0% and 20% of the maximal force, defined experimentally at 900 g based on the capabilities of the animals, during a 1-sec preparatory period. After this period, two visual cues, a green one and a red one, each being either a filled circle or a filled square, were presented vertically in the center of the screen. The shape of the green stimulus indicated the level of force the animals had to produce on the lever, and that of the red stimulus indicated the size of the upcoming reward. A green circle indicated that the animals had to produce a force between 20% and 55% of the maximal force (180–495 g; low force: f); a green square, a force between 55% and 90% of the maximal force (495–810 g; high force: F). In the same way, a red circle indicated to the animals that the reward delivered would be small (0.3 mL of water; small reward: r), whereas it would be large (1.2 mL of water; large reward: R) if a red square were displayed. The four possible combinations of cues (fR, FR, fr, and Fr) set the four different conditions of the task. In response to these stimuli, monkeys had to increase their pressing force on the lever to reach the required force in a period shorter than 1 sec (maximal RT) and hold this force for 1 sec (holding time) to obtain the reward. For each correct trial, monkeys were rewarded with the small or large reward according to the shape of the red stimuli. Both cues were extinguished as soon as the reward was delivered. To achieve the required force, monkeys were helped by visual feedback: A vertical rectangle representing the range of the required force was located just below the cues. In this rectangle, a white cursor indicated in real time the force developed on the lever when they were in the required force range. 
To keep cues constant across trial conditions, the animals saw the same rectangle for both the low and high force ranges. After reward delivery, the monkeys returned to a basal pressing force in preparation for the next trial. The next trial did not begin until the total duration of the current trial, fixed at 4.5 sec regardless of the animal's behavior, had elapsed.
There were three different cases in which a trial was considered as failed and no reward was given. First, trials in which the required force was not reached within the 1-sec force development period were considered “omission errors.” Second, trials in which the required force was not held for at least 1 sec (holding time) were considered “holding errors.” Last, trials in which the force developed was greater than the upper limit of the required force (495 and 810 g for the low and high forces, respectively) were considered “threshold errors.” Both “holding” and “threshold” errors were considered to be execution errors. In case of an error, the same combination of cues was presented again to the monkeys until they performed the trial correctly. Moreover, trials in which the monkeys began to increase their pressing force within 100 msec after the occurrence of the cues were considered to be anticipations and were not included in the database.
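The trial outcomes described above can be summarized in a short sketch. This is only an illustration of the stated rules, not the authors' acquisition software: the function name, its summary inputs (movement onset, time at which the required range was entered, peak force, and time held in range), and the order in which the error types are checked are all assumptions made for the example. The force ranges, the 1-sec limits, and the 100-msec anticipation criterion come from the text.

```python
MAX_FORCE_G = 900  # maximal force stated in the text

# Required force ranges in grams: low force (f) 20-55%, high force (F) 55-90%.
FORCE_RANGES_G = {
    "f": (MAX_FORCE_G * 20 / 100, MAX_FORCE_G * 55 / 100),
    "F": (MAX_FORCE_G * 55 / 100, MAX_FORCE_G * 90 / 100),
}
MAX_RT_S = 1.0          # time allowed to reach the required force
HOLDING_TIME_S = 1.0    # time the force must be held
ANTICIPATION_S = 0.100  # movement onsets earlier than this are discarded


def classify_trial(force_level, onset_s, reach_s, peak_force_g, held_s):
    """Classify one trial from hypothetical per-trial summary values.

    onset_s: time from cue onset to the start of the force increase
    reach_s: time from cue onset to entering the required range (None if never)
    peak_force_g: largest force produced during the trial
    held_s: time spent inside the required range after entering it
    """
    lower_g, upper_g = FORCE_RANGES_G[force_level]
    if onset_s is not None and onset_s < ANTICIPATION_S:
        return "anticipation"      # excluded from the database
    if reach_s is None or reach_s > MAX_RT_S:
        return "omission error"    # required force never reached in time
    if peak_force_g > upper_g:
        return "threshold error"   # execution error: upper limit exceeded
    if held_s < HOLDING_TIME_S:
        return "holding error"     # execution error: force released too early
    return "correct"


print(FORCE_RANGES_G)
print(classify_trial("F", 0.35, 0.70, 700.0, 1.0))
```

Checking the threshold error before the holding error is an arbitrary choice here; the text does not say how a trial that met both criteria was labeled.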
Before the electrophysiological recordings began, the monkeys were extensively trained (4–6 months) until they reached a performance threshold of 80% correct trials with the preparatory period, the maximal RT, and the holding time each set to 1 sec. In each recording session, the four different combinations of cues were presented pseudorandomly from trial to trial. The first trial of a session was randomly chosen from a list of trials in which each condition was present in the same proportion. The same cues were not presented more than three times sequentially if trials were performed correctly.
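One way to generate such a pseudorandom sequence is sketched below. It assumes a uniform draw among the allowed conditions and covers only correctly performed trials (the repetition of a condition after an error is not modeled); the function names and the seed are hypothetical, and the original software may have proceeded differently.

```python
import random

CONDITIONS = ("fR", "FR", "fr", "Fr")
MAX_RUN = 3  # the same cues are shown at most three times in a row


def next_condition(history, rng):
    """Draw the next condition, forbidding a fourth identical trial in a row."""
    last = history[-MAX_RUN:]
    if len(last) == MAX_RUN and len(set(last)) == 1:
        candidates = [c for c in CONDITIONS if c != last[0]]
    else:
        candidates = list(CONDITIONS)
    return rng.choice(candidates)


def make_sequence(n_trials, seed=0):
    """Build a sequence of conditions for correctly performed trials."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        history.append(next_condition(history, rng))
    return history


def longest_run(sequence):
    """Length of the longest stretch of identical consecutive items."""
    best = run = 0
    previous = None
    for item in sequence:
        run = run + 1 if item == previous else 1
        best = max(best, run)
        previous = item
    return best


sequence = make_sequence(500)
print(sequence[:12], longest_run(sequence))
```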
Initial anesthesia was administered by an intramuscular injection of ketamine (10 mg/kg) and xylazine (0.5 mg/kg), followed by deep anesthesia induced by isoflurane. A polyether-ether-ketone recording chamber (19-mm inner diameter) was implanted over the left hemisphere. Recording chambers in both monkeys were positioned with a 20° angle laterally in the coronal plane. The targeted stereotaxic coordinates, relative to ear bars, were as follows: Monkey M: anterior = 18 mm, lateral = 16 mm; Monkey Y: anterior = 14 mm, lateral = 16 mm, based on the atlas of Saleem and Logothetis (2007). During the same surgery, two titanium cylinders were embedded in the orthopedic cement (Palacos with gentamycin) and fixed to the skull with titanium orthopedic bone screws for subsequent head restraint during neuronal recordings. After surgery, monkeys were given antibiotics (Marbocyl, 2 mg/kg) and analgesics (Tolfedine, 4 mg/kg) on the day of the surgery and for the 4 following days. The recording chamber was filled with an antibiotic solution (Marbocyl, 2 mg/mL) and sealed with a removable cap.
While the monkeys were performing the task with head immobilization, extracellular activity of single neurons was recorded with custom-made glass-insulated tungsten microelectrodes based on the technique of Merrill and Ainsworth (1972). To record from the BG structures, a stainless steel guide tube (diameter = 0.6 mm) was lowered below the surface of the dura, and the microelectrode was passed inside the guide and advanced using a manual hydraulic microdrive (M096; Narishige). The microelectrode was connected to a preamplifier located in close proximity to the microdrive. The neuronal signal was then amplified 5,000 times, filtered at 0.3–1.5 kHz, and converted to digital pulses through a window discriminator (Neurolog; Digitimer). The presentation of the cues, the force developed by the animal, the delivery of the reward, and digital pulses from neuronal activity were controlled by a computer using custom-designed software written in LabVIEW (National Instruments).
The recording electrode was lowered to isolate neurons while the monkey performed the task. We isolated single neurons by continuously monitoring the waveform of the recorded neuronal impulses on an oscilloscope. Neurons were localized in the GPe using expected stereotaxic coordinates and the characteristic firing patterns associated with neurons in regions dorsal to the GPe. Along the electrode trajectory, striatal tissue dorsal to the GPe could be identified by the presence of both tonically active neurons (tonic firing rates ranging from 3 to 10 spk · s−1; Apicella, Legallet, & Trouche, 1997) and the very low-frequency activity of phasically active neurons (Apicella, 2002). As the electrode continued to be lowered, the dorsolateral border of the GPe was identified by an increase in background noise immediately after a short silence (DeLong, 1971).
Within the GPe, most neurons exhibited high-frequency activity, in many cases interrupted by pauses. A minority of neurons had low-frequency discharge rates interrupted by high-frequency bursts. The typical electrophysiological activity of individual GPe neurons was characterized by a narrow and high-amplitude waveform (Elias et al., 2007; DeLong, 1971). The detection of the ventrolateral border of the GPe was based on the dorsoventral length of the GPe for each coronal plane, as described in the atlas of Saleem and Logothetis (2007), and assured by the presence, when entering the GPi, of most neurons displaying a high-frequency discharge with no pauses. The activity of the first well-isolated and stable pallidal unit in a trajectory was recorded for at least 10 trials per condition. After recording from a GPe neuron, the electrode was moved forward until another GPe neuron was encountered. Data from all GPe neurons recorded were included in analyses.
Localization of Recordings
To assess the localization of our recordings, we used a high-resolution MRI scan for each monkey with electrodes positioned (five for Monkey M and six for Monkey Y) in trajectories from which we recorded GPe neurons. MR images were collected using a T1-weighted sequence (recovery time = 1700 msec, echo time = 4.414 msec, flip angle = 30°, in-plane resolution = 0.6 × 0.6 mm, thickness = 0.6 mm). On the basis of the localization of these electrode tips, we extrapolated the inferior/superior, anterior/posterior, and medial/lateral positions of each recorded neuron to generate a 3-D reconstruction using Brainsight software (Brainsight; Rogue Research; Figure 1B and C). The GPe position was determined based on the anterior commissure (AC) visualization as well as the shape of the striatum and surrounding cortical areas. The slices of the MR images of each monkey were matched with atlas sections.
The delimitations of the sensorimotor, associative, and limbic territories of the GPe were then determined based on previous studies (Grabli et al., 2004; François, Yelnik, Percheron, & Fénelon, 1994; Haber, Lynd-Balta, & Mitchell, 1993) and localized on a map of the GPe based on the MRI slices of each monkey.
All data analyses were performed using conventional statistical procedures with the R statistical computing environment (R Development Core Team, 2011). Data were analyzed from 13,046 trials performed during 126 recording sessions: 3,293 were performed in the fr condition; 2,656, in the fR condition; 3,975, in the Fr condition; and 3,122, in the FR condition.
RT, which was the duration between the onset of the cue and the time at which the monkey started to increase its pressing force on the lever, was measured only for correct trials. RTs were changed into z scores for normalization purposes, and a two-way ANOVA was performed with Required force and Expected reward as the two factors. Error rates (ERs; i.e., the total number of errors performed in a condition divided by the total number of trials [both correct and error trials] performed in this condition) were calculated and compared with a Pearson's chi-squared test. Each p value was corrected by Bonferroni correction, and differences were considered to be significant when p < .0083 (0.05/6, six possible comparisons). In each condition, the proportion of omission and execution errors was determined by dividing the number of one type of error (execution or omission) by the total number of errors in this condition. Acceptance level was computed by dividing the total number of trials accepted by the animal in a given condition (correct trials + holding and threshold errors) by the total number of trials performed in this condition. This acceptance level reflects whether the animal chose to perform the task or not, depending on the level of force and the reward size. The force developed by the animals in each trial at each time of the task was collected and averaged by condition to highlight possible differences within a same range of force between two different reward conditions.
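The behavioral measures defined above reduce to simple ratios, sketched here with made-up counts (the per-condition counts are not given in the text, so the numbers in the example are purely illustrative). The function names are hypothetical, and reading the "six possible comparisons" as the pairwise comparisons among the four conditions is an assumption.

```python
from math import comb


def error_rate(n_correct, n_omission, n_execution):
    """Errors of any type divided by all trials performed in the condition."""
    total = n_correct + n_omission + n_execution
    return (n_omission + n_execution) / total


def acceptance_level(n_correct, n_omission, n_execution):
    """Accepted trials (correct + holding and threshold errors) over all trials."""
    total = n_correct + n_omission + n_execution
    return (n_correct + n_execution) / total


def error_type_proportions(n_omission, n_execution):
    """Share of omission and execution errors among the errors of a condition."""
    n_errors = n_omission + n_execution
    return n_omission / n_errors, n_execution / n_errors


# Bonferroni-corrected threshold; comb(4, 2) counts the pairs of conditions.
N_COMPARISONS = comb(4, 2)
ALPHA_CORRECTED = 0.05 / N_COMPARISONS

# Illustrative counts only: 80 correct trials, 5 omissions, 15 execution errors.
print(error_rate(80, 5, 15), acceptance_level(80, 5, 15))
print(round(ALPHA_CORRECTED, 4))
```

With these definitions, the acceptance level is one minus the proportion of omission errors among all trials, which is why it can stay high in a condition where execution errors are frequent.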
To examine the dynamics of the encoding of the force applied and the force and reward factors, a sliding window analysis was used, with windows of 200 msec shifted in increments of 10 msec. We performed the GLM analyses as described previously during periods covering the entire trial duration: a peristimuli period (from −1000 to 500 msec after the cues' occurrence; 131 bins) and a holding–postreward period beginning 500 msec before the beginning of the holding period, including the holding period of 1 sec and the postreward period of 1 sec (from −500 to 2500 msec after crossing of the low threshold of the required force; 281 bins). We considered the beginning of each window as the reference for each time measured (i.e., if the modulation was observed in the window between 50 and 250 msec, we considered that it occurred at 50 msec). For each factor, a modulation was considered to be significant if the percentage of modulated neurons was higher than the percentage of neurons modulated by chance (computed on the mean of the percentage of neurons modulated by this factor during the baseline period) plus 2 SDs for at least five consecutive steps.
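The window layout and the significance criterion can be sketched as follows. The bin counts (131 and 281) follow directly from the stated window width, step, and period limits. The use of the sample standard deviation and the function names are assumptions, and the per-window percentages of modulated neurons are taken as given inputs (the GLM itself is not reproduced here).

```python
import statistics


def window_starts(period_start_ms, period_end_ms, width_ms=200, step_ms=10):
    """Start times of all windows that fit entirely inside the period."""
    return list(range(period_start_ms, period_end_ms - width_ms + 1, step_ms))


def chance_threshold(baseline_pct):
    """Mean baseline percentage of modulated neurons plus 2 SDs."""
    return statistics.mean(baseline_pct) + 2 * statistics.stdev(baseline_pct)


def first_significant_onset(pct_modulated, starts_ms, threshold, min_run=5):
    """Start time of the first run of at least min_run consecutive windows
    whose percentage of modulated neurons exceeds the threshold."""
    run = 0
    for i, pct in enumerate(pct_modulated):
        run = run + 1 if pct > threshold else 0
        if run == min_run:
            return starts_ms[i - min_run + 1]
    return None


peristimulus = window_starts(-1000, 500)
holding_postreward = window_starts(-500, 2500)
print(len(peristimulus), len(holding_postreward))
```

Reporting the start of the window as the time of the effect, as the text specifies, means that an effect dated at 50 msec reflects activity between 50 and 250 msec.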
Modulation of the Behavioral Responses by the Required Force Level and the Expected Reward Size
Behavioral analyses were performed on 126 sessions (56 from Monkey M and 70 from Monkey Y) during which we recorded GPe neurons.
Average RTs to reach the required force threshold after the occurrence of cues were computed from the correct trials only (4,461 from Monkey M and 5,425 from Monkey Y; Figure 2A and E). RTs were significantly shorter for the large reward trials than for the small reward ones in Monkey M (two-way ANOVA on RT z score, reward factor: F(1, 4457) = 48.48, p < .001). RTs were also significantly shorter during the high-force trials than during the low-force ones in this monkey (two-way ANOVA on RT z score, force factor: F(1, 4457) = 6.55, p < .05). There was no significant difference among the RTs of Monkey Y, although there was a slight decrease for the most favorable condition: low force/large reward. In both monkeys, there was no interaction effect between the required force level and the size of the expected reward on the RTs.
ERs were computed from the total number of trials performed by the animals (5,413 from Monkey M and 7,633 from Monkey Y), including correct and error trials (Figure 2B and F). The ERs were significantly higher in the small reward conditions than in the large reward ones for the same required force (low force: p < .01 [p = 8.99 × 10^−14 and 3.81 × 10^−28], χ2 = 55.6 and 121.0 for Monkeys M and Y, respectively; high force: p < .01 [p = 1.08 × 10^−13 and 2.87 × 10^−63], χ2 = 55.2 and 281.9 for Monkeys M and Y, respectively; Figure 2B and F). Moreover, for the same expected reward, the ERs were significantly higher in the high-force conditions than in the low-force ones (small reward: p < .01 [p = 3.43 × 10^−11 and 1.44 × 10^−23], χ2 = 43.9 and 100.1 for Monkeys M and Y, respectively; large reward: p < .01 [p = 1.05 × 10^−11 and 1.97 × 10^−5], χ2 = 46.2 and 18.2 for Monkeys M and Y, respectively).
The level of acceptance allowed us to rank the four conditions in the same preference order for the two animals, from the highest to the lowest acceptance level: low force/large reward (fR), high force/large reward (FR), low force/small reward (fr), and high force/small reward (Fr; Figure 2C and G). For both monkeys, the size of the expected reward seemed to be more relevant than the level of effort for their decision of whether to perform the task. In the fR conditions, monkeys decided to perform the action in 99.7% (Monkey M) and 98.5% (Monkey Y) of the trials. In contrast, in the Fr conditions, they only performed the action in 88.2% (Monkey M) and 71.2% (Monkey Y) of the trials. FR trials were accepted more frequently (96.7% for Monkey M and 96% for Monkey Y) than fr trials (93.3% for Monkey M and 90.1% for Monkey Y). These results show that the monkeys understood the task and integrated the cost of each condition (particularly the less favorable one). Indeed, not only the effort to be made but more so the size of the expected reward contributed to the subjective value of the action.
As depicted in Figure 2D and H, for the same amount of force required, the average force applied by the animals was different depending on the expected/received reward in some periods. This result led us to consider the force applied as a factor in our analyses of the neuronal activity to isolate a reward effect from any mechanical variation.
Localization of the Recordings
One hundred twenty-six recorded neurons (56 and 70 from Monkeys M and Y, respectively) were located inside the GPe. In the antero/posterior plane from the AC, neurons were recorded from AC + 2 to AC − 2 and AC + 2 to AC − 3, for Monkeys M and Y, respectively. All recorded neurons were located in the anterior part of the GPe; the most posterior part, described to be essentially sensorimotor (Worbe et al., 2013; Grabli et al., 2004; François et al., 1994; Haber et al., 1993), was not investigated. Considering the medial/lateral and inferior/superior positions of our recordings, we determined that 75 neurons (34 for Monkey M and 41 for Monkey Y) were located in the dorsal “associative part” of the GPe and 51 (22 for Monkey M and 29 for Monkey Y) were in the ventral and rostral “limbic part” of the structure (Figure 1B and C). No correlation could be found between the recording location and the way neurons encoded either the force or the reward information.
Electrophysiological Properties of GPe Neurons
Most of the 126 GPe neurons recorded (41/56 and 51/70 from Monkeys M and Y, respectively; i.e., 92/126 [73%]) exhibited high-frequency activity (Figure 3A), with a mean firing rate of 58.08 ± 2.37 spk · s−1 (min = 20.09, max = 132.70) and a CV of their ISIs of 1.06 ± 0.04 (min = 0.49, max = 1.9). Most of these neurons showed pauses in their firing and were assumed to be the previously described neurons with high-frequency discharge interrupted by pauses (Elias et al., 2007; Arkadir et al., 2004; DeLong, 1971). Twenty-one neurons (7/56 and 14/70 from Monkeys M and Y, respectively; 21/126 [17%]) with a lower frequency of discharge and occasional brief high-frequency bursts were recorded (Figure 3B). The mean firing rate of these neurons was 12.42 ± 1.7 spk · s−1 (min = 1.10, max = 33.12), and the mean CV was 1.50 ± 0.06 (min = 1.1, max = 2.4). The pioneering study of DeLong (1971) classified the GPe neurons into two categories: 85% of neurons exhibiting a high-frequency discharge and 15% exhibiting a low-frequency discharge and bursts. In addition to these two types of neurons, we recorded 13 neurons (8/56 and 5/70 from Monkeys M and Y, respectively; 13/126 [10%]) with regular patterns of discharge (Figure 3C). These cells fired at a lower frequency than the neurons with high-frequency discharge interrupted by pauses, with a regular pattern and a mean firing rate of 31.47 ± 3.61 spk · s−1 (min = 8.6, max = 56.66). Their CVs were very low (below 0.5) when the neurons showed a firing rate > 20 spk · s−1 and slightly higher (between 0.5 and 0.7) for neurons with a firing rate < 20 spk · s−1. Although their properties suggest that they could correspond to the “border cells” described by Mitchell, Richardson, Baker, and DeLong (1987), they were found throughout the GPe and not only at the border of the structure. The properties of these three subpopulations of neurons, based on their firing rate and CV, are summarized in Figure 3D.
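For readers unfamiliar with the two descriptors used above, the following sketch computes a firing rate and the CV of the ISIs from a list of spike times. It assumes the standard definition of the CV (standard deviation of the ISIs divided by their mean) and estimates the rate as the inverse of the mean ISI; whether the original analysis used these exact estimators (for example, population versus sample SD) is not stated, so both choices are assumptions, as is the function name.

```python
import statistics


def rate_and_cv(spike_times_s):
    """Return (firing rate in spk/s, CV of the ISIs) for sorted spike times."""
    isis = [later - earlier
            for earlier, later in zip(spike_times_s, spike_times_s[1:])]
    mean_isi = statistics.mean(isis)
    cv = statistics.pstdev(isis) / mean_isi  # population SD: an assumption
    return 1.0 / mean_isi, cv


# A perfectly regular 50 spk/s train has a CV close to 0.
regular_train = [i * 0.02 for i in range(100)]
# Alternating short and long intervals raise the CV without changing the rate.
irregular_train = [0.0]
for i in range(99):
    irregular_train.append(irregular_train[-1] + (0.005 if i % 2 == 0 else 0.035))

print(rate_and_cv(regular_train))
print(rate_and_cv(irregular_train))
```

A CV near 0 indicates clock-like firing, whereas values around or above 1 indicate irregular firing such as pauses or bursts, which is consistent with how the three subpopulations are contrasted in the text.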
The number of neurons modulated by the force applied or the task factors (force, reward, and/or interaction) did not differ across these three types of neurons, between the two monkeys, or between the GPe territories (associative vs. limbic) during any trial period (comparison among categories: χ2 < 8.09, corrected p > .05; comparison between monkeys: χ2 < 0.73, corrected p > .05; comparison between territories: χ2 < 2.14, corrected p > .05). Consequently, these three groups of neurons and the data from the two monkeys across territories were pooled and considered as a single population for the subsequent analyses.
Modulation of the Neuronal Activity by the Force Applied and Task Factors during the Three Periods of the Task
This visuomotor task was designed to explore the responses of GPe neurons to stimuli carrying effort- and reward-related information. However, given the presence of the motor response, it also allowed us to study their modulation by the motor feature of the task, that is, the force applied. After the occurrence of the cues, the average activity increased in all four conditions and remained above chance throughout the periods of the task (Figure 4A). Around 30% of the neurons (37/126, 29%) showed a modulation of their activity by the force applied during at least one period of the task. No continuity in the modulation by the force applied was seen along the trial. Indeed, most of these neurons (30/37, 81%) were modulated during only one period. The number of neurons modulated by the force applied differed among the task periods (χ2 = 10.1, p = .006), with 10% (13/126) of modulated neurons during the cue threshold period, 6% (8/126) during the holding period, and 19% (24/126) during the postreward period. The barplot in Figure 4B summarizes the modulation of the neuronal activity by the force applied and by the force and reward factors during the three periods of the task. During the cue threshold period, 56 of 126 (44%) neurons were modulated by one of the task factors; 71 of 126 (56%), during the holding period; and 62 of 126 (49%), during the postreward period. Only 13 of 126 neurons (10%) showed no modulation of their activity by the force applied or by the task factors. We then compared the proportions of neurons responding to the force applied and to the force and reward factors during each analyzed period. During the cue threshold period, we found that more neurons were modulated by the force factor than by the force applied (χ2 = 4.37, p < .05), but there was no difference among the numbers of neurons modulated by the force factor, the reward factor, or the interaction between both.
During the holding period, significantly more neurons were modulated by the force or reward factor than by their interaction or by the force applied (χ2 > 14.731, p < .05). Finally, during the postreward period, more neurons were modulated by the amount of reward than by any other variable (χ2 > 14.662, p < .05).
To further study the dynamics of these modulations throughout the task, we examined, for each neuron, which effect in one period followed or was followed by an effect in another period. As previously mentioned, the neurons modulated by the force applied during one period were usually not involved in the encoding of the force applied during other periods. Moreover, most of these neurons were comodulated by the task factors during the same period or during other periods of the task. Indeed, only 10 neurons were modulated by the force applied without any effect of the force or the reward factors along the task, emphasizing the lack of selectivity of these neurons for the motor aspect of the action. Furthermore, the neurons showing a force effect during the cue threshold period were mostly not the same as those showing a force effect during the holding period: Only 9 of 30 (30%) of these neurons shared this modulation during both periods. On the other hand, more than two thirds (28/41, 68%) of the neurons showing a reward effect during the holding period still encoded only this factor during the postreward period. Consequently, the population of neurons encoding only the reward size during the postreward period was composed of neurons already involved in this process during the holding period, plus neurons recruited to encode this factor when the reward was received.
Weight and Direction of Activity Modulation by Task Factors
We estimated an FSI and an RSI for each neuron in each period (see Methods) to quantify the modulation by the task factors force and reward, respectively. An index above zero indicates a stronger activity in the high-force/large-reward conditions. Conversely, an index below zero indicates a stronger activity in the low-force/small-reward conditions. During the three defined periods, a comparable number of neurons were FSI-positive and FSI-negative (cue threshold: 12 vs. 18; holding: 21 vs. 19; postreward: 7 vs. 6; binomial test, p > .05). Similarly, a comparable number of neurons were RSI-positive and RSI-negative (cue threshold: 12 vs. 14; holding: 17 vs. 24; postreward: 25 vs. 28; binomial test, p > .05). Figure 5A–C (left) illustrates the distribution of FSI- and RSI-positive and -negative neurons. The distributions of the FSI and RSI of the modulated neurons were centered around 0 (Wilcoxon test, p > .05). Consequently, the examples to the right of the figure illustrate the type of modulation that can be found in the activity of GPe neurons and not the activity of the entire population (Figure 5A–C, right).
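The balance between positive and negative indices can be checked with an exact binomial test against an expected proportion of 0.5, sketched below with the counts reported above. The text only says "binomial test", so the two-sided form and the null proportion of 0.5 are assumptions, and the function name is hypothetical.

```python
from math import comb


def binomial_two_sided_p(n_positive, n_negative):
    """Exact two-sided binomial test of n_positive vs. n_negative against 0.5.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one (a small tolerance absorbs floating-point ties).
    """
    n = n_positive + n_negative
    probs = [comb(n, k) / 2 ** n for k in range(n + 1)]
    observed = probs[n_positive]
    p_value = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# (positive, negative) counts as reported in the text.
FSI_COUNTS = {"cue threshold": (12, 18), "holding": (21, 19), "postreward": (7, 6)}
RSI_COUNTS = {"cue threshold": (12, 14), "holding": (17, 24), "postreward": (25, 28)}

for label, counts in (("FSI", FSI_COUNTS), ("RSI", RSI_COUNTS)):
    for period, (pos, neg) in counts.items():
        print(label, period, round(binomial_two_sided_p(pos, neg), 3))
```

Under these assumptions, every comparison yields p > .05, in line with the reported result.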
Dynamics of the Force Applied and Task Factors Effects throughout the Task
To precisely define the temporal profile of neuronal activity modulation by the force applied and the task factors throughout the whole trial, we performed a sliding window analysis (see Methods; Figure 6). After the presentation of the cues, the reward effect (Figure 6C, left) occurred earlier (60 msec) than the force applied (Figure 6A, left) or force (Figure 6B, left) effects (190 and 240 msec, respectively).
At the beginning of the holding period, 49 neurons (38.9%) showed a force effect (Figure 6B, right). Gradually, as the reward approached, GPe neurons were less driven by the amount of required force, because only 13 neurons (10.3%) showed a force effect at the end of the holding period. During this period, the modulation by the amount of expected reward also decreased. However, compared with the required force influence, it was quite steady, with 17.5–28.6% of neurons showing a reward effect (Figure 6C, right). Consequently, during the first 500 msec of the holding period, the activity of GPe neurons tended to be more driven by the amount of required force, and during the last 500 msec, neurons were driven equally by both factors. Interestingly, during the holding period, supposedly characterized by motor activity, GPe neurons were scarcely modulated by the force applied, with only 2.4–11.9% of the neurons showing this effect across this period (Figure 6A, right).
As expected, during the postreward period, the number of neurons showing a force effect remained low (between 4.0% and 15.9%; Figure 6B, right), whereas the number of neurons showing a reward effect increased significantly in comparison with the holding period (average number plus 2 SDs), starting 240 msec after the reward occurrence and continuing to increase until 670 msec after the reward delivery, when 46% of the neurons were modulated by the amount of reward (Figure 6C, right). There was also a marked increase in the number of neurons modulated by the force applied in this period, starting 440 msec after the reward and decreasing slightly toward the end of the period (Figure 6A, right). Finally, in comparison with the strong force and reward effects observed throughout the task, the proportion of neurons modulated by an interaction of these two factors remained low, except in the cue threshold period. In contrast, during the holding and postreward periods, force and reward information appeared to be mostly encoded in an independent manner.
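The criterion used above for a significant increase (a count exceeding the holding-period average plus 2 SDs) can be sketched as follows. This is an illustration only: the per-window counts are hypothetical values chosen to lie within the reported ranges, not the recorded data, and the use of the sample SD as well as the names `first_exceedance`, `holding_counts`, and `postreward_counts` are assumptions.

```python
from statistics import mean, stdev

N_NEURONS = 126  # number of recorded GPe neurons, as reported


def first_exceedance(baseline_counts, test_counts, n_sd=2.0):
    """Find the first test window exceeding mean + n_sd * SD of the baseline.

    Returns (index, threshold); index is None if no window exceeds it.
    The sample SD is used here, which is an assumption.
    """
    threshold = mean(baseline_counts) + n_sd * stdev(baseline_counts)
    for index, count in enumerate(test_counts):
        if count > threshold:
            return index, threshold
    return None, threshold


# HYPOTHETICAL numbers of reward-modulated neurons per sliding window.
holding_counts = [36, 30, 28, 25, 22, 24, 23]
postreward_counts = [24, 26, 30, 38, 45, 52, 58]

index, threshold = first_exceedance(holding_counts, postreward_counts)
print(f"threshold = {threshold:.1f} neurons "
      f"({100 * threshold / N_NEURONS:.1f}% of the population)")
print(f"first postreward window above threshold: {index}")
```

With real data, the index of the first window above threshold would be converted to a latency using the window step defined in the Methods.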
We designed our task to modulate the motivation of the animals and study how this motivation can be influenced by motor effort and reward processes. Our behavioral data confirmed that there was modulation of the motivational level of the animals, in terms of whether they chose to perform the given action leading to a specific reward at a certain cost. Our results revealed three main properties of associative and limbic GPe neurons. First, at the single neuron level as well as at the population level, GPe neurons can be modulated by motor, cognitive, and/or limbic information at the same time during a single period or during different periods throughout the trial course, mostly in an independent way. Second, at the population level, GPe neurons encoded the force applied and both force and reward level information dynamically along the task, at the time at which each piece of information was the most relevant for the correct execution of the action. Finally, and following the second point, after the reward delivery, GPe neurons' activity was strongly modulated by the reward size.
Convergence of Motor and Cognitive Information in the GPe
The task used in our study required motor action and involved cognitive and limbic processes during the various periods within a trial. For example, cognitive and limbic processes are required after the occurrence of the cues for their evaluation, and motor processes are required during the action initiation and execution. This study shows that few GPe neurons were dedicated only to the encoding of the force applied, the motor parameter of our task. They usually encoded both the force applied and the task factors or were modulated only by one or both task factors, the force required and the expected reward. Indeed, a single GPe neuron could be modulated by motor, cognitive, and/or limbic information during a single period or other periods of the trial. These results are consistent with the theory of convergence of information at the GPe level. Previous electrophysiological studies have suggested that the GPe is a site of convergence of information processed separately in different cortical areas and striatal territories (Arkadir et al., 2004). This hypothesis is supported by anatomical data such as the convergence of inputs from the striatum and the STN onto single pallidal neurons (Parent & Hazrati, 1995), the reduction of the number of neurons between the striatum and the GPe (Oorschot, 1996), and the organization of the dendritic field of GPe neurons (Kita & Kita, 2001). Information regarding the movement, as well as force and reward levels converging at the GPe level, could be conveyed in several cases by a single pallidal neuron but also at the population level, through neurons modulated by the force applied, and force and/or reward information. Interactions between factors were mainly observed during the cue threshold period, when the information should be computed to set the correct action to execute. 
Interactions were only sparsely found later in the trials, suggesting that the GPe is more likely to play a role in the linear summation of these different types of information from the input structures of the BG without a real integration as such. However, GPe neurons could have an integrative role by making convergent synaptic contacts in their target structures, namely, STN, striatum, substantia nigra pars compacta, and output nuclei (Bolam et al., 2000).
GPe Neurons Encoded the Force Applied and Both Force and Reward Levels Information Dynamically along the Task
In this study, monkeys performed a deterministic task in which cues regarding the level of effort required and the size of the upcoming reward were given to the monkey simultaneously, and these same cues served as a go signal to initiate the production of the required force. Modulation of GPe neurons by reward size was observed throughout the trial until reward delivery, which is the ultimate goal of the action, as an uninterrupted rewarding message sent during the execution of the action. During the execution period, most neurons encoded the force level information, but very few encoded the actual force applied. This highlights the ability of GPe neurons to integrate the representation of the force information in their activity and not only its execution. This is in line with the role of the dorsal GPe in representing behavioral goals to be referred to at the time of motor target decision (Saga, Hashimoto, Tremblay, Tanji, & Hoshi, 2013) and confirms the role of the GPe in nonmotor processes (Kim et al., 2017; Schechtman et al., 2016). In contrast, very few neurons were sensitive to the reward probability during the movement execution in a probabilistic task (Arkadir et al., 2004), suggesting that the GPe might not be a part of the BG circuitry involved in the reward prediction error processing or reward uncertainty. The continuous influence of reward information on neuronal activity supports previous findings showing that the GPe is rather involved in maintaining an already well-established goal-directed behavior with a certain and stable reward value (Kim et al., 2017). The influence of the expected reward on neuronal activity related to goal-directed actions has been demonstrated in the striatum (Hassani, Cromwell, & Schultz, 2001; Hollerman, Tremblay, & Schultz, 1998). The striatum could play a major role in the establishment of goal-oriented behaviors, whereas the GPe could be involved in maintaining stable behavior.
The independent processing of information found during the action execution, contrasting with the less independent processing at the time of action planning, might be a way to avoid interference with new information. The GPe could operate as a receiver of several types of information from the input structures and could process each in an independent manner, allowing new information to update the behavior while maintaining already established ones. The encoding of the expected reward during the execution of the action could inform the animal during its effort about the future reward that will be delivered at the end of its action, possibly allowing it to maintain a certain level of motivation to overcome the effort required. These features could lead to the hypothesis of a role of the associative and limbic GPe in monitoring important information when it is essential for the action execution. After reward delivery, when the cognitive load decreases, the neurons could shift to the encoding of movement, as shown by the larger number of neurons modulated by the force applied during this period.
GPe Neurons Are Strongly Modulated by Reward Size, but Only Sparsely Integrate Force and Reward Information
The increase in neuronal activity at the population level after cues occur is consistent with previous findings (Deffains et al., 2016; Noblejas et al., 2015). Noblejas and colleagues have shown that the average GPe activity increases in response to relevant behavioral events. In our task, reacting to the cues is crucial to succeeding in receiving the expected reward. Thus, the increase in the average neuronal activity at this time would allow a disinhibition of the output structures via the indirect pathway, to facilitate the future action. Interestingly, it is also around the time of action planning that we observed the highest number of neurons whose activity was modulated by the interaction between the force and reward factors. These features support the idea that the GPe could be an important player in motivational processes occurring at the BG level.
The large number of reward-sensitive neurons at the reward delivery suggests a role for the GPe in the encoding of the behavioral outcome. The message sent by these neurons to their targets (the output structures, the GPi, the SNr, and/or the STN) has to be powerful to signal the achievement of the action, thus allowing the information regarding the gain resulting from the performed action to be maintained in the system. This message was found to be driven equally by two groups of neurons, providing feedback about the benefit (small or large reward) obtained whatever the circumstances. Neurons encoding different aspects of rewarding and punishing outcome have been described in several subcortical areas in monkeys, including the striatum (Cromwell & Schultz, 2003; Ravel et al., 2003) and the STN (Espinosa-Parrilla, Baunez, & Apicella, 2015; Darbaky et al., 2005). In our study, the encoding of the reward appeared more as a feedback of the action performed (a cost–benefit balance feedback) than as a pure inhibition of the movement, unlike what has been classically seen in the indirect pathway (Mink, 1996; Albin, Young, & Penney, 1989). The long latency to reach the maximal mobilization of GPe neurons in this process suggests that these neurons might encode the consequences of the action once the reward has been consumed rather than the amount of reward.
The present findings suggest that the GPe could have a large influence on motivated behaviors, before sending information about effort and reward to the GPi and the SNr. Some clinical studies highlight the involvement of the pallidum in disorders of diminished motivation (Bhatia & Marsden, 1994), athymhormia or autoactivation deficit (Habib & Poncet, 1988; Laplane et al., 1981), depression (Bielau et al., 2005), or Gilles de la Tourette syndrome (Piedimonte et al., 2013) and bring out the importance of GPe neurons in balancing the activation and inhibition of cortical areas. The present results showed that the GPe could be involved in updating the consequences of an action, based on motivational information. In patients experiencing the disorders mentioned above, this permanent revision of the consequences of the action could be prevented and therefore lead to inappropriate behaviors. To better understand the neural bases of motivational processes, it will be important to study, in the same experimental paradigm, how this information is encoded within the input structures of the upper stage of the BG, the striatum and the STN, which are also the two major afferents of the GPe.
This work was supported by Centre National de la Recherche Scientifique, the Aix-Marseille Université, and the Fondation de France (Parkinson Disease Program grant 2008 005902 to S. R.). We thank Drs. Christelle Baunez, Guillaume Masson, Mathias Pessiglione, Janine M. Simmons, Romain Trachel, and Dakota Smith for helpful comments and discussions.
Reprint requests should be sent to Sabrina Ravel, Institut de Neurosciences de la Timone, UMR7289 Centre National de la Recherche Scientifique and Aix-Marseille Université, 13005 Marseille, France, or via e-mail: firstname.lastname@example.org.