Abstract

Knowing whether core reward regions carry information about the positions of relevant objects is crucial for adjudicating between choice models. One limitation of previous studies, including our own, is that spatial positions can be consistently differentially associated with rewards, and thus position can be confounded with attention, motor plans, or target identity. We circumvented these problems by using a task in which value—and thus choices—was determined solely by a frequently changing rule, which was randomized relative to spatial position on each trial. We presented offers asynchronously, which allowed us to control for reward expectation, spatial attention, and motor plans in our analyses. We find robust encoding of the spatial position of both offers and choices in two core reward regions, orbitofrontal Area 13 and ventral striatum, as well as in dorsal striatum of macaques. The trial-by-trial correlation in noise in encoding of position was associated with variation in choice, an effect known as choice probability correlation, suggesting that the spatial encoding is associated with choice and is not incidental to it. Spatial information and reward information are not carried by separate sets of neurons, although the two forms of information are temporally dissociable. These results highlight the ubiquity of multiplexed information in association cortex and argue against the idea that these ostensible reward regions serve as part of a pure value domain.

INTRODUCTION

Many models of economic choice posit the existence of a pure value domain in which values of options can be compared in the absence of factors that are not directly associated with value. Among these factors, space is particularly interesting because of its close connections to input and output and thus its strong conceptual difference from value. The pure value domain is a central element of serial discrete stage models, in which comparison is conceptually distinct from object localization and action planning (Padoa-Schioppa & Conen, 2017; Levy & Glimcher, 2012; Padoa-Schioppa, 2011; Rangel & Hare, 2010; Padoa-Schioppa, Jandolo, & Visalberghi, 2006). These discrete stage models can be contrasted with distributed models in which choice involves a gradual transformation from an input to an output domain (Hunt & Hayden, 2017; Pearson, Watson, & Platt, 2014; Cisek, 2012; Cisek & Kalaska, 2010). In such models, spatial information related to positions of options (input) and potential actions (output) will naturally be found ubiquitously throughout the system.

In past decades, ample evidence and counterevidence regarding the question of whether core reward regions carry information about space have appeared (e.g., McGinty, Rangel, & Newsome, 2016; Strait et al., 2016; Grattan & Glimcher, 2014; Luk & Wallis, 2013; Abe & Lee, 2011; Padoa-Schioppa & Assad, 2006; Mulder, Shibata, Trullier, & Wiener, 2005). One critical reason for the continued uncertainty is the potential confound between spatial encoding and object identity (Padoa-Schioppa & Cai, 2011). That is, reward-related neural activity may be object-specific and may depend on spatial position because the brain uses space as an index for the identity of the object (this is natural in highly constrained computerized tasks). A second potential confound is closely related. A given spatial position may, over time, be consistently associated with a particular reward, and learning processes then generate a strong association between that place and the reward. (This will be especially common in overtrained laboratory tasks.) Then, differential reward-related firing would lead to ersatz spatial selectivity. This confound would also be observed if the association produced consistent modulations in attention or motor planning for certain spatial positions.

To get around these problems, we used a Conceptual Set-shifting Task (CSST), a variant of the Wisconsin Card Sorting Test (WCST) optimized for primates (Sleezer & Hayden, 2016; Moore, Killiany, Herndon, Rosene, & Moss, 2005). In this difficult and engaging task, the target identity was defined solely by the rule, which changed frequently (every 15 correct trials). The rule was defined as a particular color (cyan, magenta, or yellow) or shape (circle, triangle, star) that was behaviorally relevant (i.e., rewarded) on each trial, and spatial positions of all stimuli were changed randomly on every trial. These features allowed us to fully dissociate spatial position from target identity and value, as well as attention and motor planning, across trials.

We found robust encoding of the positions of both offers and choices in two core reward regions, orbitofrontal Area 13 (OFC) and ventral striatum (VS); variations in spatial encoding predicted variations in choice. For comparison, we also analyzed responses in dorsal striatum (DS), a region whose spatial repertoire is better established (e.g., Stott & Redish, 2014; Lau & Glimcher, 2007). Responses in all three regions were qualitatively similar, indicating that the spatial selectivity in OFC and VS is not substantially weaker than that observed in DS. Encoding of the position of the chosen offer peaked during the feedback epoch in all areas and was limited to a brief postchoice epoch in OFC but had a longer duration in the two striatal regions. Spatial and reward information did not appear to be processed by distinct subsets of specialized neurons; neurons showed consistent multiplexed selectivity. Nonetheless, space and reward showed distinct time courses, with spatial information generally signaled more rapidly than reward information. Finally, the encoding schemes used for offers and choices were unrelated, suggesting that space is not an intrinsic organizing feature of these regions, as it is in the parietal or occipital cortex, but rather part of the broader set of task-relevant variables these neurons encode.

METHODS

Surgical Procedures

All animal procedures were approved by the university committee on animal resources at the University of Rochester and were designed and conducted in compliance with the Public Health Service's Guide for the Care and Use of Animals. Two male rhesus macaques (Macaca mulatta) served as subjects. We used standard electrophysiological techniques described previously (Strait, Blanchard, & Hayden, 2014). A small prosthesis for holding the head was used. Animals were habituated to laboratory conditions and then trained to perform oculomotor tasks for liquid reward. A Cilux recording chamber (Crist Instruments, Hagerstown, MD) was placed over the striatum and the OFC. The position was verified by magnetic resonance imaging with the aid of a Brainsight system (Rogue Research, Montreal, Quebec, Canada). Animals received appropriate analgesics and antibiotics after all procedures. Throughout both behavioral and physiological recording sessions, the chamber was kept sterile with regular antibiotic washes and sealed with sterile caps.

Recording Sites

All data presented in this report were collected and analyzed in previous studies; all results presented here are new (Sleezer, Loconte, Castagno, & Hayden, 2017; Sleezer, Castagno, & Hayden, 2016; Sleezer & Hayden, 2016). OFC, VS, and DS were approached through a standard recording grid (Crist Instruments) using a standard atlas for all area definitions (Paxinos, Huang, & Toga, 2000). OFC was defined by the coronal planes situated between 29 and 36 mm rostral to the interaural plane and the horizontal planes situated between 0 and 9 mm from the ventral surface and lateral to the medial orbital sulcus. Recordings were made from Area 13m (Ongür & Price, 2000) and from VS and DS according to that atlas. VS was defined as lying within the coronal planes situated between 28 and 21 mm rostral to the interaural plane, the horizontal planes situated between 0 and 8 mm from the ventral surface of striatum, and the sagittal planes between 0 and 9 mm from the medial wall. DS was defined as the regions of striatum dorsal to the VS within the same coronal planes (Figure 1). The majority of our VS recording sites were located in a region corresponding to the core of the nucleus accumbens. Recordings were made broadly throughout these regions.

Figure 1. 

Task and recording sites. (A) Task structure and timeline of WCST. (B) ROIs. Recordings were made in OFC (highlighted in blue), VS (highlighted in orange), and DS (highlighted in green).

Recording locations were confirmed before each recording session using our Brainsight system with structural magnetic resonance images taken before the experiment. Neuroimaging was performed at the Rochester Center for Brain Imaging on a Siemens 3T MAGNETOM Trio Tim (Berlin, Germany) using 0.5-mm voxels. Recording locations were confirmed by listening for characteristic sounds of white and gray matter during recording, which in all cases matched the loci indicated by the Brainsight system. The Brainsight system typically offers an error of <1 mm in the horizontal plane and <2 mm in the z dimension.

Electrophysiological Techniques and Eye Tracking

Single electrodes (Frederick Haer; impedance range = 0.8–4 MΩ) were lowered using a microdrive (NAN Instruments, Nazaret Illit, Israel) until waveforms of between one and three neurons were isolated. Individual action potentials were isolated on a Plexon system. Neurons were selected for study solely on the basis of the quality of isolation; they were never preselected based on task-related response properties. Eye position was sampled at 1000 Hz by an infrared eye-monitoring camera system (SR Research, Ottawa, Ontario, Canada). Stimuli were controlled by a computer running MATLAB (The MathWorks, Natick, MA) with Psychtoolbox and Eyelink Toolbox (Brainard, 1997). Visual stimuli were presented a feedback period, the presentation of the first choice option, and the feedback period on the current trial, which allowed us to look at the offer epoch and the choice epoch.

Experimental Design and Statistical Analysis

The CSST

Monkeys performed an analog of the WCST known as the CSST (Moore et al., 2005). The task involved two dimensions (color and shape) and six specific rules (three shapes: circle, star, and triangle; three colors: cyan, magenta, and yellow; Figure 1A). On each trial in the offer epoch, three stimuli were presented at different positions asynchronously (1-sec asynchrony). The offers were positioned at the same distance from the central fixation. The position of each offer was independent of its identity, which was defined by its color and/or shape. After the stimuli were presented separately in the offer epoch, all three stimuli appeared simultaneously in the choice epoch with a central fixation spot. The monkey was required to fixate on the central dot for 100 msec and then indicate its choice by shifting gaze to its preferred stimulus and maintaining fixation on it for 250 msec. After a successful 250-msec fixation, visual feedback was provided for 400 msec. Correct choices were followed by positive visual feedback (a green outline around the chosen stimulus), and incorrect choices were followed by negative feedback (a red outline around the chosen stimulus). After visual feedback, there was a 500-msec delay in which the screen was blank. After the delay, correct choices were followed by a liquid reward of the preferred flavor (determined by a preference test before recording; this happened to be water in all cases). Incorrect choices were followed by no reward. All trials were separated by an 800-msec intertrial interval (ITI). During this time, the screen was blank, and the monkeys' gaze was unconstrained. The key element of the task, then, is that spatial information is never relevant for choice. This is an important advance in our study relative to some previous ones. The animal never had an opportunity to learn that space might have a special value-related relationship. Indeed, the entire past training history of these animals was associated with rapidly changing tasks in which assigning values to positions would be actively punished and being willing to immediately adjust reward expectations for locations would be beneficial.

Prior Training

These subjects had never been trained on tasks in which spatial–reward associations were a critical element. All previous training involved dynamic tasks in which risky or certain options varied unpredictably across trials or across short blocks. Specifically, subjects were familiar with the diet selection task (Blanchard & Hayden, 2014), the hot-hand task (Blanchard, Wilke, & Hayden, 2014), the peak-end task (Blanchard, Wolfe, Vlaev, Winston, & Hayden, 2014), the patch-leaving task (Blanchard & Hayden, 2015), and the token gambling task (Azab & Hayden, 2017, 2018). In all of these tasks, there was no consistent relationship between spatial location and reward, and thus, subjects had no past opportunity to form a link between rewards and spatial positions. Although the subjects used in our 2016 study did not have spatial elements in their training history, the tasks we used in those studies (two-option gambling tasks) did potentially confound object identity with spatial position (Strait et al., 2016).

Spatial Selectivity

Peristimulus time histograms for spatial selectivity analysis were constructed by aligning spike rasters to the onset of the first offer and averaging firing rates across trials. Firing rates were calculated in 10-msec bins. For display, example cells' peristimulus time histograms were smoothed using a time-averaging kernel with a window size of 250 msec (identical to other selectivity analyses). To identify significantly modulated neurons during the offer epochs, a one-way ANOVA was performed on average firing rate with the position of the offer as the factor. For the ANOVA, a 500-msec sliding window was used and slid in 10-msec steps. We performed a binomial test against the 5% expected by chance (i.e., alpha = .05) to determine whether a significant proportion of single neurons reached significance on their own, thereby allowing conclusions about the neural population as a whole. To provide percentages in each offer epoch, the time range between 200 and 700 msec from the onset of an offer was selected.
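
The sliding-window ANOVA described above can be illustrated with a minimal sketch. It is not the analysis code used in the study; the function names (`sliding_windows`, `firing_rate`, `one_way_f`) and the toy firing rates are hypothetical, and the sketch stops at the F statistic (converting F to a p value would require an F-distribution routine from a statistics library).

```python
from statistics import mean

def sliding_windows(start_ms, stop_ms, width_ms=500, step_ms=10):
    """Return (t0, t1) edges of windows of width_ms stepped by step_ms."""
    return [(t0, t0 + width_ms)
            for t0 in range(start_ms, stop_ms - width_ms + 1, step_ms)]

def firing_rate(spike_times_ms, t0, t1):
    """Spikes per second within the half-open window [t0, t1)."""
    n_spikes = sum(1 for t in spike_times_ms if t0 <= t < t1)
    return n_spikes / ((t1 - t0) / 1000.0)

def one_way_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA.

    groups: one list of firing rates per offer position.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Toy (hypothetical) firing rates in spikes/sec, grouped by offer position.
rates_by_position = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df1, df2 = one_way_f(rates_by_position)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

In a full analysis, `one_way_f` would be applied once per window returned by `sliding_windows`, with the firing rate of each trial computed by `firing_rate` and grouped by the position of the offer.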

Choice Position Selectivity

For choice position selectivity analysis, we included additional epochs. Specifically, we included the “choice epoch” (750 msec beginning immediately after Offer 3 epoch offset until the end of feedback offset), the “delay” (500 msec beginning with the offset of the visual feedback), the ITI (500 msec beginning with the ITI onset), and the “next trial” (the Offer 1 epoch for the following trial). Differences in choice position coding between consecutive epochs were tested by applying the Theil–Sen line test to the proportion of significantly modulated neurons.

Choice Probability

To examine whether neural activity directly relates to behavioral choice, we employed choice probability (CP) correlation analysis. Our analysis is tailored to our task but is designed to be conceptually similar to those used in perceptual decision-making (Liu & Newsome, 2005; Britten, Newsome, Shadlen, Celebrini, & Movshon, 1996; Britten, Shadlen, Newsome, & Movshon, 1992). For each epoch, we selected only those trials in which the offer at the position presented in the given epoch was incorrect. We did this to eliminate changes due to correct versus incorrect, reward versus no reward, or match versus nonmatch. We then separated these trials into two bins, those in which “upcoming choice corresponds to that position” versus “upcoming choice does not correspond to that position.” Then we compared the neural activity of those two conditions in the given epoch using the area under the curve in receiver operating characteristic (ROC) analysis. The significance of CP for each neuron was defined by its 95% confidence interval.
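
The area-under-the-curve comparison can be sketched as follows. This is an illustration rather than the study's code: the AUC is computed through its equivalence with the probability that a firing rate drawn from one condition exceeds one drawn from the other (ties counted as one half), and the percentile bootstrap used for the 95% confidence interval is an assumption, because the text does not state how the interval was obtained. The names `roc_auc` and `bootstrap_ci` and the toy data are hypothetical.

```python
import random

def roc_auc(chosen, not_chosen):
    """Area under the ROC curve separating two sets of firing rates.

    Equals P(chosen > not_chosen) + 0.5 * P(chosen == not_chosen).
    """
    wins = 0.0
    for a in chosen:
        for b in not_chosen:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(chosen) * len(not_chosen))

def bootstrap_ci(chosen, not_chosen, n_boot=1000, seed=0):
    """Percentile 95% CI for the AUC (assumed method, resampling trials)."""
    rng = random.Random(seed)
    values = sorted(
        roc_auc(rng.choices(chosen, k=len(chosen)),
                rng.choices(not_chosen, k=len(not_chosen)))
        for _ in range(n_boot)
    )
    lower = values[n_boot * 25 // 1000]
    upper = values[n_boot * 975 // 1000 - 1]
    return lower, upper

# Toy (hypothetical) firing rates in spikes/sec for the two trial bins.
rates_chosen = [3.0, 4.0, 5.0]
rates_not_chosen = [1.0, 2.0, 3.0]
cp = roc_auc(rates_chosen, rates_not_chosen)
ci_low, ci_high = bootstrap_ci(rates_chosen, rates_not_chosen)
print(f"CP = {cp:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

Under this sketch, a neuron would be counted as having significant CP when its interval excludes 0.5.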

Latency Analysis

To explore differences in latency, we used a method introduced by Siegel, Buschman, and Miller (2015). We measured the latency of particular task variables as the time when that information reached half of its maximum. In contrast to measuring the time at which information reaches statistical significance (a method we have used in the past; Strait, Sleezer, & Hayden, 2015), this method is more robust to differences in the strength of information or the amount of data because the shape of the function is less likely to be affected and therefore will not change its half-maximum points. We constrained the time range of this analysis to 750 msec within every offer epoch. For statistical comparisons of temporal latencies between regions and types of information, we estimated the standard error of latencies by bootstrapping across neurons (1000 resamples).
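
A minimal sketch of the half-maximum latency measure is given below. It is an illustration under stated assumptions, not the published implementation: latency is taken as the first time bin at which the information time course reaches half of its maximum (no baseline subtraction or interpolation), and the population time course in each bootstrap resample is assumed to be the mean across resampled neurons. The names `half_max_latency` and `bootstrap_latency_se` and the toy values are hypothetical.

```python
import random
from statistics import mean, pstdev

def half_max_latency(times_ms, info):
    """First time at which info reaches at least half of its maximum."""
    half = max(info) / 2.0
    for t, value in zip(times_ms, info):
        if value >= half:
            return t
    return None  # only reached when the maximum is negative

def bootstrap_latency_se(times_ms, info_by_neuron, n_boot=1000, seed=0):
    """Standard error of the latency, resampling neurons with replacement."""
    rng = random.Random(seed)
    latencies = []
    for _ in range(n_boot):
        sample = rng.choices(info_by_neuron, k=len(info_by_neuron))
        population = [mean(column) for column in zip(*sample)]
        latencies.append(half_max_latency(times_ms, population))
    return pstdev(latencies)

# Toy (hypothetical) information time courses, one row per neuron.
times = [0, 10, 20, 30, 40]
info_by_neuron = [
    [0.0, 1.0, 3.0, 6.0, 4.0],
    [0.0, 0.5, 1.0, 5.0, 6.0],
    [0.0, 2.0, 4.0, 6.0, 5.0],
]
population_info = [mean(column) for column in zip(*info_by_neuron)]
latency = half_max_latency(times, population_info)
latency_se = bootstrap_latency_se(times, info_by_neuron)
print(latency, latency_se)
```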

Statistical Analysis for Population Analysis

Error bars for percentages of neurons were estimated by bootstrapping. We randomly resampled all the neurons from each area 1000 times (with replacement). This procedure yields a percentage value for each resample. We calculated the mean and the standard deviation of the bootstrapped percentages; the standard deviation was treated as the SEM.
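
The bootstrap procedure can be sketched in a few lines. This is a minimal illustration, not the study's code; the function name `bootstrap_percentage` and the fixed random seed are assumptions, and the example input simply encodes 14 significant neurons out of 115 as a list of flags.

```python
import random
from statistics import mean, stdev

def bootstrap_percentage(significant_flags, n_boot=1000, seed=0):
    """Mean and SD of the percentage of significant neurons across resamples.

    significant_flags: one 0/1 entry per neuron in the area.
    The SD of the bootstrapped percentages is treated as the SEM.
    """
    rng = random.Random(seed)
    n = len(significant_flags)
    percentages = []
    for _ in range(n_boot):
        resample = rng.choices(significant_flags, k=n)  # with replacement
        percentages.append(100.0 * sum(resample) / n)
    return mean(percentages), stdev(percentages)

# Example: 14 significant neurons out of 115.
flags = [1] * 14 + [0] * 101
boot_mean, boot_sem = bootstrap_percentage(flags)
print(f"{boot_mean:.2f}% +/- {boot_sem:.2f}%")
```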

Data Availability

The data sets generated during the current study are available on the Hayden Lab Web site www.haydenlab.com/ or from the authors on reasonable request. The code generated to do the analyses for the current study is available from the corresponding author on reasonable request.

RESULTS

All data reported here were analyzed for previous studies; however, all results presented here are new (Sleezer et al., 2016, 2017; Sleezer & Hayden, 2016).

Monkeys Show Weak Spatial Biases in Choice

Two macaques (Subjects B and C) performed 269 sessions (Subject B: 99 sessions; Subject C: 170 sessions) of an analogue of the WCST known as the CSST (Moore et al., 2005). On each trial, the subject chose one of three asynchronously presented (asynchrony of 1 sec) colored shapes based on one of six rules (Figure 1A). The rule was either one color (cyan, magenta, or yellow) or one shape (circle, star, or triangle). The spatial positions of the offers were randomized on every trial and had no predictive power over reward, either trial by trial or in the aggregate. The correct rule changed randomly every 15 correct trials (see Methods for details). As a consequence, the specific stimulus–reward linkage changed frequently, and the spatial position of the rewarded stimulus changed on every trial.

The key element of the task, then, is that spatial information is never relevant for choice, and the goal of the behavioral analysis is to control for the possibility that our observed effects could be explained by the confounding factors of behavioral biases in choice, saccade kinematics, or eye position. Spatial biases in behavior were quite weak. The average preferences for Positions 1, 2, and 3 were respectively 33.15 ± 0.19%, 33.53 ± 0.18%, and 33.31 ± 0.20% for Subject B and 33.15 ± 0.15%, 33.91 ± 0.14%, and 32.94 ± 0.16% for Subject C (± numbers refer to SEM). The proportion of individual sessions in which each of the three positions was chosen significantly more often than chance was 1.39%, 2.24%, and 2.56% for Positions 1, 2, and 3, respectively (ANOVA, p < .05). Given the large number of trials per session, these extremely weak biases suggest that subjects made their choices based almost entirely on the identity of the presented options and that our subjects did not consistently assign higher subjective values to particular positions. This is not surprising, given that spatial position was not systematically associated with reward amount. Across all sessions, average reward expectation associated with each position (the global reward rate at a given position, calculated relative to 100% for a standard reward) was 75.05 ± 0.5%, 75.10 ± 0.51%, and 75.21 ± 0.69% for Subject B and 84.22 ± 0.50%, 81.29 ± 0.60%, and 83.76 ± 0.60% for Subject C. (Subject C received slightly larger standard rewards to motivate performance.)

Subjects were free to fixate each of the offer targets during the probe period but were not required to do so. Nonetheless, they almost always did so (gaze frequency at Positions 1, 2, and 3 as a percentage of total trials, respectively: Subject B, 95.5%, 95.9%, and 96.1%; Subject C: 97.0%, 96.5%, and 97.1%). The likelihood of saccade did not, in general, depend on the position of the target (specifically, it depended significantly on the position of the target in 5.46% of sessions on average: 5.05% in Subject B and 5.88% in Subject C; chance levels are 5% for this measure).

We also found no evidence that spatial position affected the subjects' saccade latencies to inspect each of the asynchronously presented offers. Specifically, we considered the latency of the first saccade to the first target (on the trials the subject chose to fixate it). The average saccadic latency difference between the fastest and the other two positions was 4.3 msec for Subject B and 6.1 msec for Subject C and was <8.0 msec on 95% of sessions. We also found weak effects of latency to choose (latency difference for the fastest and the average of the other two positions): 12.3 msec for Subject B and 12.0 msec for Subject C; this effect was significant in only 4.7% of sessions (chance = 5.0%). Once a subject fixated an offer, they could look freely wherever they wanted. We measured dwell time for the first offer and for all three offers. We found no evidence that dwell time depended on position (ANOVA, p = .38 for Subject B and p = .86 for Subject C). Finally, we found no evidence that saccade latency depended on the saccade direction for offers (Subject B: ANOVA, p = .61, Subject C: p = .94) or for choices (Subject B, ANOVA: p = .19, Subject C: p = .83).

OFC and Striatum Encode Positions of Offers

We collected responses of 115 neurons in OFC (Subject B: 49, Subject C: 66), 103 neurons in VS (Subject B: 47, Subject C: 56), and 204 neurons in DS (Subject B: 77, Subject C: 127) during the CSST. We collected an average of 578, 631, and 572 trials per neuron, respectively, in these three areas.

Presentation of the three offers was staggered, and their spatial order was randomized. This feature of the task design allowed us to assess the response to offers at all three positions on each trial and to factor out possible order biases. A previous study using this data set shows that neuronal responses differentiated the correct offer from the other two (Sleezer et al., 2017). This pattern suggests that subjects were able to rapidly identify the correct option and begin the process of selection during the offer epoch, before the ostensible choice period. Because the neural response differs between correct and incorrect offers, including both types of offer in the analysis could introduce a confound. Thus, we used only incorrect offers for analyzing the offer epochs. Incorrect offers are shown at least twice in a single trial. Responses of an example neuron from each region to incorrect offers are shown in Figure 2. The example neuron from OFC has a greater response during the first offer epoch when the offer appears at Position 1 relative to when it appears at Position 2 or Position 3. Around the peak of the response, this neuron shows a significant dependence on spatial position (Offer 1: F(2, 236) = 7.118, p < .001). This neuron also shows significant modulation by position during the Offer 3 epoch (Offer 3: F(2, 339) = 5.426, p = .0048). Typical cells in both VS and DS show similar patterns (VS Offer 1, F(2, 320) = 6.466, p = .0017; DS Offer 1, F(2, 369) = 4.083, p = .0176; Offer 2: F(2, 378) = 6.632, p = .0015; Offer 3: F(2, 408) = 25.92, p < .001).

Figure 2. 

Example of neurons from each area and percentage of neurons significantly tuned to spatial position. Raster plot and average response of spatially tuned neurons from (A) OFC, (B) VS, and (C) DS (250 msec smoothed). Columns correspond to offer epochs. Shaded error bars indicate 95% confidence intervals (see Methods). Horizontal black boxes indicate offer onset duration (400 msec), and horizontal gray boxes indicate offer offset duration (600 msec). Vertical solid black lines indicate the time of offer offset (400 msec after offer onset). (D) Percentage of neurons whose responses are modulated significantly by the position of the offer. Shaded error bars indicate SEM. Horizontal thick gray solid lines indicate 5%, and horizontal thin gray solid lines indicate expected chance level obtained by two-sided binomial test against 5% (9.57%, the maximum value obtained from OFC).

Across the data set, the proportion of cells with spatially modulated responses was greater than chance in all four epochs in all three regions (Figure 2D). For example, we observed significant coding of the position of Offer 1 in 12% of neurons from OFC (n = 14/115, p = .002, binomial test), 18% of neurons from VS (n = 19/103, p < .001), and 13% of neurons from DS (n = 27/204, p < .001). Similar results were observed in the other two epochs (see Table 1).
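
As a worked check of the population test, the binomial p value for 14 significant neurons out of 115 against a 5% chance rate can be computed exactly. The sketch below is illustrative rather than the code used in the study; the function names are hypothetical, and the two-sided p value is defined here (as an assumption) as the summed probability of all outcomes no more likely than the observed one.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_test_two_sided(k, n, p=0.05):
    """Exact two-sided binomial p value.

    Sums the probabilities of every outcome that is no more likely
    than the observed count k (assumed definition of "two-sided").
    """
    observed = binom_pmf(k, n, p)
    tolerance = observed * (1 + 1e-7)  # guard against rounding in ties
    return sum(prob for i in range(n + 1)
               if (prob := binom_pmf(i, n, p)) <= tolerance)

# OFC, Offer 1: 14 of 115 neurons significant at alpha = .05.
p_ofc_offer1 = binom_test_two_sided(14, 115)
print(f"p = {p_ofc_offer1:.4f}")
```

With an expected count of 5.75 neurons under chance (115 × 0.05), observing 14 lies far in the upper tail, so the p value should come out close to the reported .002.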

Table 1. 

Statistics of Significantly Tuned Neurons for Position from Each Region in Each Epoch

Epoch / Measure          OFC (n = 115)   VS (n = 103)   DS (n = 204)

Offer 1
  Significant neurons    14              19             27
  Percent                12.17           18.45          13.24
  p                      .002            <.001          <.001

Offer 2
  Significant neurons    14              14             28
  Percent                12.17           13.59          13.73
  p                      .002            <.001          <.001

Offer 3
  Significant neurons    15              12             28
  Percent                13.04           11.65          13.73
  p                      <.001           .005           <.001

Choice
  Significant neurons    19              15             46
  Percent                16.52           14.56          22.55
  p                      <.001           <.001          <.001

p values were obtained using a two-sided binomial test against the 5% expected by chance. The time range selected for offers was 200–700 msec from each stimulus onset, and the time range for choice was 750 msec from choice onset.

We performed a further test in which we removed any neuron for which the subject's behavior showed a significant spatial bias that day. These neurons were uncommon in our sample (6.1% in OFC, 15.5% in VS, and 15.7% in DS), and effect sizes were small in all cases. For these follow-up analyses, our significant effects remained (OFC Offer 1: n = 15/108, Offer 2: n = 18/108, Offer 3: n = 15/108, all ps < .001; VS Offer 1: n = 16/87, Offer 2: n = 14/87, Offer 3: n = 13/87, all ps < .001; DS Offer 1: n = 25/172, Offer 2: n = 24/172, Offer 3: n = 22/172, all ps < .001, binomial test).

Because a large smoothing window can conceal or exaggerate effects, we next repeated this analysis with a reduced window of 100 msec. The reported effects remained or became more salient in all the epochs (OFC Offer 1: n = 16/115, Offer 2: n = 20/115, Offer 3: n = 18/115, all ps < .001; VS Offer 1: n = 19/103, Offer 2: n = 16/103, Offer 3: n = 15/103, all ps < .001; DS Offer 1: n = 32/204, Offer 2: n = 30/204, Offer 3: n = 30/204, all ps < .001, binomial test).

It is possible that a rule change can induce different behavioral states, depending on whether subjects understood the rule or not. Thus, we separately analyzed spatial tuning, limiting the analysis to the later period of the blocks, when subjects were unlikely to be actively searching for task rules. Specifically, we excluded the trial right after the first block trial (i.e., the inevitable error trials). The rationale behind the analysis is that the subjects performed above chance on the new rule by two trials after the inevitable errors (Sleezer & Hayden, 2016). We excluded only one trial after the inevitable error so that removing too many trials would not mask statistical results. The proportion of significantly modulated neurons remained above chance in every epoch in every area (OFC Offer 1: n = 14/115, p = .0154; Offer 2: n = 16/115, p < .001; Offer 3: n = 14/115, p = .0154; VS Offer 1: n = 19/103, Offer 2: n = 15/103, Offer 3: n = 16/103, all ps < .001; DS Offer 1: n = 30/204, Offer 2: n = 27/204, Offer 3: n = 28/204, all ps < .001, binomial test).

To examine whether the three regions differed, we compared firing rates across each epoch by the F statistics using a Bonferroni-corrected multiple-comparison method. In all cases, for all epochs, we found no significant differences between the areas (all multiple comparisons, p > .05; see Table 2).

Table 2. 

Multiple Comparisons of F Statistics for Firing Rate of Core Reward Regions

Multiple Comparisons of F Statistics (Bonferroni-Corrected p)

             OFC–VS    OFC–DS    VS–DS

Spatial
  Offer 1    .403      .594      .87
  Offer 2    .981      .961      .999
  Offer 3    .951      .375      .608
  Choice     .989      .062      .108

Rewards
  Offer 1    .073      .001*     .508
  Offer 2    .355      .005*     .298
  Offer 3    .018      <.001*    .652
  Choice     .064      .933      .366

* p < .05.

OFC and Striatum Neurons Encode Chosen Position

Encoding of offer position relates to the brain's inputs; encoding of choice relates to its outputs. We next investigated encoding of the chosen position during the choice epoch in all three areas. Note that we used correct trials only to control, to the extent we could, for attention, motor plans, and reward expectation in this analysis. We observed a significant effect of chosen position in three example neurons (Figure 3; OFC: F(2, 408) = 6.758, p = .0013; VS: F(2, 434) = 4.053, p = .0149; DS: F(2, 461) = 63.12, p < .001).

Figure 3. 

Evolution of choice position selectivity in the population. Population activity for encoding position of choice from (A) OFC (blue), (B) VS (orange), and (C) DS (green). Vertical solid lines indicate the end of offer epoch and the start of the next epoch. Vertical thick gray lines indicate choice offset (3350 msec), feedback offset (3750 msec), and delay offset (4250 msec), respectively. Horizontal black boxes indicate offer onset (400 msec from each offer start), and gray boxes (dark: 350 msec, light: 500 msec) indicate choice and feedback, during which all offers are shown simultaneously. Shaded error bar indicates SEM. Horizontal thick gray solid lines indicate 5%, and horizontal thin gray solid lines indicate expected chance level obtained by two-sided binomial test against 5% (9.57%, the maximum value obtained from OFC).

We used ANOVA to measure the proportion of neurons with significant encoding for the chosen position (Figure 3). In OFC, the proportion of neurons that encode the chosen position was below chance level for all offer epochs (p = .3860, p = .2872, and p = .2872, respectively, for each offer epoch, binomial test). (Note that these values are nonsignificant even without correction for multiple comparisons). Shortly after the choice epoch starts, the proportion starts to ramp and peaks following the choice epoch onset (n = 20/115, p < .001, after correcting for multiple comparisons, latency = 761.7 msec). In contrast, VS and DS show clear ramping up during the offer epochs before the choice. In both areas, the proportions of significantly modulated neurons pass chance in Offer 2 and Offer 3 epochs (p = .0192 and p = .0387 for each offer in VS, p < .001 for Offers 2 and 3 in DS, binomial test).
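
The binomial test against a 5% false-positive rate can be illustrated with a minimal sketch. It assumes that each neuron is an independent draw and computes only the upper tail, whereas the figure captions describe a two-sided test, so the numbers need not match the reported p values exactly. The function name `binomial_upper_tail` is hypothetical, and this is not the analysis code used in the study.

```python
from math import comb

def binomial_upper_tail(k, n, p0=0.05):
    """P(X >= k) for X ~ Binomial(n, p0); one-sided (assumption)."""
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

# OFC at the choice-epoch peak: 20 of 115 neurons were significant.
p_peak = binomial_upper_tail(20, 115)
print(f"P(X >= 20 | n = 115, p0 = 0.05) = {p_peak:.2e}")
```

With 115 neurons, the expected number of false positives at 5% is 5.75, so a count of 20 lies far in the upper tail.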

The differential ramping pattern between areas was quantified with the Theil–Sen test, a nonparametric method widely used in time series analysis for detecting a monotonic trend (either upward or downward) in a data set. We used the Theil–Sen test to examine whether adjacent epochs, such as Offers 1 and 2, show monotonic trends in the percentage of neurons significantly modulated by choice position. The results indicate that for every area and every pair of adjacent epochs there was a significant upward trend (p = .002 for the first pair of adjacent epochs in OFC, p < .001 in all other cases).
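
To make the trend estimate concrete, the following sketch computes the Theil–Sen slope as the median of all pairwise slopes. The data are hypothetical, the function name `theil_sen_slope` is ours, and the significance test for the trend is not shown; this is an illustration of the estimator, not the analysis code used in the study.

```python
from itertools import combinations
from statistics import median

def theil_sen_slope(x, y):
    """Median of the slopes over all pairs of points with distinct x values."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)
              if x[j] != x[i]]
    return median(slopes)

# Hypothetical percentages of significant neurons in successive time bins
time_bins = [0, 1, 2, 3, 4, 5]
percent_significant = [4.0, 5.5, 5.0, 8.0, 9.5, 12.0]
slope = theil_sen_slope(time_bins, percent_significant)
print(f"Theil-Sen slope: {slope:.2f} percentage points per bin")
```

Because the estimate is a median over pairs, a single outlying bin has little influence on the slope, which is the main reason to prefer it over least squares for short, noisy series.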

However, two aspects suggest that OFC differs from the other two areas. First, in OFC, the proportion of neurons significantly modulated by choice position remained below the expected chance level in all offer epochs. Second, the upward trend was much more abrupt at the time of choice than between offer epochs (estimated upward slope for Offer 3 to choice: 21.56; Offer 1 to Offer 2: 4.50; Offer 2 to Offer 3: 1.53). These quantities indicate that OFC rapidly tunes its activity relevant to choice. In contrast, in the other two areas, encoding of choice was above the expected chance level across all epochs. In addition, the upward trend continues until a decision is made (estimated slopes for VS: 9.50, 9.10, and 6.35; DS: 8.79, 5.72, and 11.58, for each pair of adjacent epochs, respectively). This difference is consistent with the hypothesis that OFC may encode choices only postdecisionally whereas striatal areas may encode them both before and after the choice (Tsujimoto, Genovesio, & Wise, 2009).

We next examined choice encoding during later epochs: the delay, ITI, and during the Offer 1 epoch for the following trial. In OFC, 22.61% of neurons were selective for choice position during the delay (p < .001, binomial test); during the latter two epochs, the number fell to chance levels (p = .082 and p = .2872, respectively, Figure 3). Both VS and DS showed significant proportions of modulated neurons during the delay and ITI (VS: delay: 14.56%, p < .001; ITI: 15.53%, p < .001; DS: delay: 19.12%, p < .001; ITI: 14.71%, p < .005), but modulation was at chance level during the Offer 1 epoch of the next trial. Altogether, it appears that OFC signals choice position information somewhat transiently, whereas striatal areas show ramping up and down over a longer timescale from start to end of the trial.

CP Correlations: Neural Variability of OFC and VS in Offer Epoch Predicts Choice

In the neuroscience of perceptual decision-making, the CP correlation is often used to demonstrate a close link between neural activity and choice (Liu & Newsome, 2005; Britten et al., 1992, 1996). We employ a conceptually similar analysis here. We examined whether variability in neural responses to offer stimuli at a particular position would predict the choice of that position during the subsequent choice epoch (Figure 4). In each epoch, we exclusively analyzed trials where the given position was incorrect so that we would not capture neural variability predicting the correctness of the offer or nonlinear interactions thereof. Then we compared neural activity in that epoch between the cases where position offered in that epoch was chosen or not by the area under the curve in ROC analysis, which is a commonly used method in measuring CP. The values of the mean CP in OFC were 0.540, 0.548, and 0.518, respectively, for each of the three offer epochs. The values of the mean CP in VS were 0.516, 0.519, and 0.501, respectively. The values of the mean CP in DS were 0.517, 0.507, and 0.497, respectively.
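
The area under the ROC curve used for CP equals the probability that a randomly drawn firing rate from "chosen" trials exceeds one from "not chosen" trials, with ties counted as one half. The sketch below computes it directly from that definition; the firing rates are hypothetical and the function name `choice_probability` is ours, not taken from the authors' code.

```python
def choice_probability(chosen_rates, unchosen_rates):
    """Area under the ROC curve: P(chosen > unchosen) + 0.5 * P(tie)."""
    wins = 0.0
    for a in chosen_rates:
        for b in unchosen_rates:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(chosen_rates) * len(unchosen_rates))

# Hypothetical firing rates (spikes/sec) of one neuron in one offer epoch,
# split by whether the offered position was later chosen.
chosen = [12.0, 15.0, 9.0, 14.0, 11.0]
unchosen = [10.0, 8.0, 13.0, 9.0, 7.0]
cp = choice_probability(chosen, unchosen)
print(f"CP = {cp:.2f}")
```

A value of 0.5 means the two distributions are indistinguishable; values above 0.5 mean the neuron fires more on trials in which the offered position is subsequently chosen.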

Figure 4. 

CP analysis. CP analysis based on the area under the curve in ROC analysis. The comparison was made between neural activity in every epoch for “choice made toward cued position” versus “choice made to position other than cued position” in trials where the given cues are incorrect. Colored bars in histograms are the number of neurons with significant CP defined by 95% confidence interval. Each row illustrates data from a different brain region (A, OFC; B, VS; C, DS), and each column indicates a different epoch.

One way to assess the significance of this finding is to perform a rank sum test comparison between the measured CP of the population and chance level (i.e., 0.5). The result of that analysis shows that the CP of the population significantly differed from 0.5 in all cases for OFC (p < .001). CP values of VS and DS at Epochs 1 and 2 significantly differed from chance (p < .001); however, neither area differed from the chance level in Epoch 3 (p = .109 and p = .16, each area, respectively).

As a complementary analysis, we performed a binomial test to decide whether the proportion of individual neurons showing significant CP is greater than expected by chance (all areas, all epochs, p < .001). The significant CP measured in these areas provides some evidence that the spatial information carried in them is not a functionally meaningless task correlation or a passive but unimportant encoding, but rather a sign that spatial information in these regions has a direct influence on choice.

Overlapping Subset of Neurons Encode Space and Reward

We wanted to know whether neurons encoding spatial information tend to be the same ones that are responsive to reward information. Across the population of cells, the proportion of reward-modulated cells was greater than chance in all four epochs in all three regions (Figure 5). For example, we observed significant coding of the reward of Offer 1 in 33% of neurons from OFC (n = 37/115, p < .001, binomial test), 25% of neurons from VS (n = 25/103, p < .001), and 19% of neurons from DS (n = 38/204, p < .001). Similar results were observed in all the other epochs, and the percentage of reward-modulated neurons was higher than that of spatially modulated neurons (see Table 3). To examine how the three regions differed in their degree of reward modulation, we compared firing rates in each epoch by their F statistics using a Bonferroni-corrected multiple-comparison method. OFC and DS differed significantly in every epoch except the choice epoch (p = .001, p = .005, and p < .001 for the three offer epochs, respectively), whereas the other comparisons (OFC–VS and VS–DS) did not reveal a significant difference between areas (Table 2).

Figure 5. 

Percentage significantly modulated by reward. Percentage of neurons whose responses are modulated significantly by whether the option was rewarded or not. Each column indicates a different epoch within the task. Shaded error bars for percentage indicate SEM obtained using a bootstrapping procedure (n = 1000 permutations). Horizontal thick gray solid line indicates 5%, and horizontal thin gray solid line indicates expected chance level obtained by two-sided binomial test against 5% (9.57%, the maximum value obtained from OFC). In offer epochs, the black rectangle indicates stimulus onset, and gray indicates stimulus offset. In choice epoch, the dark gray rectangle in the choice epoch indicates choice option onset, and light gray indicates feedback onset.

Table 3. 

Statistics of Significantly Tuned Neurons for Reward from Each Region in Each Epoch

Epoch     Measure               OFC (n = 115)   VS (n = 103)   DS (n = 204)
Offer 1   Significant neurons   37              25             38
          Percent               32.54           24.70          18.64
          p                     <.001           <.001          <.001
Offer 2   Significant neurons   40              26             30
          Percent               34.64           25.61          14.67
          p                     .002            <.001          <.001
Offer 3   Significant neurons   43              28             35
          Percent               37.32           26.89          17.17
          p                     <.001           <.001          <.001
Choice    Significant neurons   20              21             33
          Percent               17.53           20.82          16.18
          p                     <.001           <.001          <.001

p values were obtained using a two-sided binomial test against the 5% expected chance level. The time range selected for offers was 200–700 msec from each stimulus onset, and the time range for choice was 0–750 msec from choice onset.

Next, we computed unsigned F statistics for reward and for spatial position and correlated these (Figure 6). A negative correlation would indicate distinct populations, whereas a positive correlation would indicate overlapping (potentially even identical) populations for processing different variables (for an explanation of the logic of this approach, see Blanchard, Hayden, & Bromberg-Martin, 2015). To compare F statistics from neurons with different firing properties, we first normalized (Z-scored) neural firing rates. There was no significant correlation in OFC between space and reward variables at any epoch during the trial (all ps > .05). In VS, we found a positive correlation in the choice epoch, but not in others (choice epoch: n = 103, r = .230, p = .0012). In DS, we found a positive correlation for Offer 2, Offer 3, and choice epochs (n = 204, Offer 2 epoch: r = 1.042, p = .001; Offer 3 epoch: r = .490, p = .0357; choice epoch: r = .062, p = .0035; see Table 4 for other epoch details). (Note that the result did not differ from the identical analysis using raw firing rate only, except for Offer 3 from DS, r = .04, p = .068.)
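
The logic of this analysis can be sketched in two steps: compute a one-way ANOVA F statistic per neuron for each variable, then correlate the two sets of F statistics across neurons. The sketch below uses hypothetical values, omits the Z-scoring step, and uses a plain Pearson correlation as a stand-in; the function names `one_way_f` and `pearson_r` are ours, and nothing here is the authors' code.

```python
from statistics import mean

def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups of firing rates."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Toy check: three positions, three trials each (hypothetical firing rates)
f_example = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

# Hypothetical per-neuron F statistics for space and for reward
f_space = [0.4, 1.2, 3.5, 0.8, 6.1, 2.2]
f_reward = [0.9, 1.0, 4.2, 0.3, 5.0, 2.9]
r = pearson_r(f_space, f_reward)
print(f"F = {f_example:.2f}, r = {r:.3f}")
```

Because F statistics are unsigned, a positive correlation here says only that neurons strongly tuned to one variable tend to be strongly tuned to the other, not that the tuning directions agree.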

Figure 6. 

Linear regression between F statistics of space and reward. Correlation between F statistics for space and reward for each neuron. Offer epoch analysis used a time window between 200 and 700 msec from offer onset, and choice epoch analysis used a time window between 0 and 750 msec from choice target onset. Shaded area indicates 95% confidence interval. Red line and shaded areas indicate a least-squares regression line and 95% confidence intervals. Rows correspond to data from OFC (blue), VS (orange), and DS (green). Each column indicates a different epoch within the task.

Table 4. 

Linear Regression between F Statistics of Space and Reward from Each Area in Each Epoch

Area   Measure           Offer 1   Offer 2   Offer 3   Choice
OFC    Correlation (r)   .465      −.151     .587      .113
       p                 .484      .848      .729      .419
VS     Correlation (r)   .558      1.074     .668      .230*
       p                 .164      .265      .553      .001*
DS     Correlation (r)   .069      1.042*    .490*     .062*
       p                 .622      <.001*    .0357*    .004*

* p < .05.

The key finding of this analysis is that there is no negative correlation in any epoch or any area. In the epochs without a significant correlation (all OFC epochs and the remaining striatal epochs), the lack of correlation, either positive or negative, suggests that the populations are neither overlapping (a positive correlation would indicate this) nor segregated (a negative correlation would indicate this). Note, however, that it would be premature to draw any firmer conclusions from a failure to find a significant effect.

Spatial Information Precedes Reward Information Processing

We next compared the latency of spatial and reward information (Figure 7). Specifically, we estimated the time at which information reached half its maximum at each epoch (this method was introduced by Siegel et al., 2015). In OFC, the latency of spatial information in the offer epoch (119.9 msec following the appearance of the first offer) was significantly shorter than the latency of reward information (248.8 msec; latency difference: 128.9 msec, p < .001). We observed the same pattern in the choice epoch (340.181 msec vs. 761.7 msec following cue onset, p < .001, for pairwise comparison). In VS, the latency of spatial and reward information was the same in the Offer 1 epoch (201.10 msec and 203.86 msec, p = .99), although spatial information preceded reward coding in Offer 2, Offer 3, and the choice epoch. In DS, spatial information preceded reward information in all but Epoch 3. Considering that information could be transient, we also varied the time window of analysis, but we found the same tendency as in Figure 7. Overall, these results support the idea that both offer information and choice spatial information generally precede corresponding reward information within these structures. They also suggest that the two forms of information reach these areas at different times.
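
The half-maximum latency measure can be sketched as follows. The sketch assumes a baseline of zero, takes the first upward crossing of half the peak, and interpolates linearly between samples; these details, the toy time course, and the function name `half_max_latency` are our assumptions and may differ from the procedure of Siegel et al. (2015).

```python
def half_max_latency(times_ms, signal):
    """First time at which the signal reaches half of its maximum,
    linearly interpolated between neighboring samples."""
    half = max(signal) / 2.0
    for i, value in enumerate(signal):
        if value >= half:
            if i == 0:
                return times_ms[0]
            t0, t1 = times_ms[i - 1], times_ms[i]
            v0, v1 = signal[i - 1], value
            return t0 + (half - v0) * (t1 - t0) / (v1 - v0)
    return None  # only reached if the maximum is negative

# Hypothetical time course of the percentage of significant neurons
times_ms = [0, 100, 200, 300, 400]
percent = [0.0, 2.0, 6.0, 10.0, 8.0]
latency = half_max_latency(times_ms, percent)
print(f"Half-max latency: {latency:.1f} msec")
```

In this toy series the peak is 10, so the threshold of 5 is crossed between the samples at 100 and 200 msec.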

Figure 7. 

Normalized explained variance (EV) for space (cyan) and reward (magenta). Each row corresponds to data from a different brain region (OFC, VS, and DS). (A) The latency of each variable (magenta: reward; blue: space). (B) Normalized EV of reward and space variables from each epoch. Normalized EV is identical to the percentage of neurons significantly modulated but normalized to compare spatial and reward variable. Each column indicates a different epoch within the task. Shaded region indicates the SEM. In the bottom row, the black rectangle indicates the offer epoch starting at stimulus onset, and gray indicates stimulus offset. The dark gray rectangle in the choice epoch indicates choice option onset, and light gray indicates feedback onset. This time was identical across areas.

We then compared latency differences across areas. In the Offer 1 epoch, both spatial information and reward information appear earlier in OFC than in DS (latency difference: 9.24 msec and 88.9 msec, p < .001, for space and reward encoding, respectively). VS encodes reward information more rapidly than OFC and DS but encodes spatial information more slowly (latency for space = 203.86 msec, latency for reward = 201.1 msec, p < .001 for all, multiple comparisons). In the choice epoch, though not significantly different from VS, OFC is the slowest region to encode spatial information (349.78 msec, p > .05 and p < .001, respectively, with VS and DS, multiple comparisons) but is the fastest in reward encoding (787.10 msec, p < .001 for both areas, multiple comparisons). We next examined whether these effects could be observed with smaller smoothing windows (100 msec). In all three regions, the observed patterns were unchanged. Specifically, in OFC, spatial information was significantly faster than reward information in Epoch 1 and the choice epoch (Epoch 1 spatial latency: 143.8 msec; reward latency: 244.3 msec; choice epoch spatial latency: 371.88 msec; reward latency: 744.4 msec). All the latency differences between spatial and reward information are significant (p < .001, two-sided t test). In VS, all trends were maintained, except that reward information arises significantly faster than spatial information in Epoch 1 (p < .001, two-sided t test). In DS, as in OFC with the 100-msec smoothing window, all the relationships between spatial and reward information are maintained, with significant differences between them (p < .001, two-sided t test). Together, these results illustrate the heterogeneous relationship between encoding in these three regions and suggest that it may be impossible to describe their interactions in a strictly feed-forward model.

DISCUSSION

Here we report neuronal selectivity for spatial positions of offers and choices in two core reward regions, the OFC and VS, as well as DS. We avoided confounds between space and target identity, attention, and motor planning that have been part of previous studies (including our own) by using a dynamic task with rapidly changing rules that solely determined reward associations. The trial-by-trial correlation in noise in the encoding of position was associated with variation in choice, an effect known as CP correlation. This correlation suggests the spatial signals in these areas may play a direct role in influencing choices. The presence of spatial information in these core reward regions suggests that spatial information may be a ubiquitous feature of the brain's reward-based decision-making system. We believe our study provides the strongest evidence so far in favor of spatial selectivity in these regions.

One limitation of previous studies is that reward is associated with position; indeed, reward is so closely linked to position in many studies that animals may use position as a heuristic. That is, they may covertly adopt a strategy that associates reward with position despite the fact that doing so carries no benefit. Our task actively disincentivizes subjects from using a location-based heuristic strategy, and the animals' high performance in the task indicates that this works. That is, they pay attention to the rule and ignore space. Furthermore, the comparison with our previous study suggests that spatial tuning is robust in our current task design. The presence of robust spatial selectivity, despite this active punishment for associating it with reward, indicates that spatial information is strongly and natively encoded in these reward regions; it is “baked in” to their representational repertoires.

In studies like these, one potential attentional confound comes from variations in attention associated with changes in expectation. For example, if the first offer is correct, the subject may move his focus away from the task briefly, but if the first offer is incorrect, he may continue to focus until the task is resolved. To deal with this potential confound, we focused our analyses on the responses to the first offer only (and further restricted these trials to those in which the first offer was incorrect; see above). Even when the analysis is limited to the Offer 1 epoch to rule out this potential confound, significant selectivity is still observed in all three regions.

The percentage of spatially selective cells we found (OFC: n = 14/115, 12.17%; VS: n = 19/103, 18.45%; Epoch 1 for both areas) was larger than the percentage found in Strait et al. (2016) (OFC: n = 12/113, 10.62%; VS: n = 11/124, 8.87%; Epoch 1 for both areas). The consistency between the two studies in OFC indicates that spatial tuning in this reward area is robust regardless of whether there is a potential confound with object identity (OFC: p = .7102, chi-square contingency test). On the other hand, the significant increase in VS raises the possibility that our more difficult task may have enhanced selectivity in VS in a way that did not occur within OFC (VS: p = .0339, chi-square contingency test).
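
The chi-square contingency comparison can be worked through from the counts given above. The sketch below is a Pearson chi-square test on a 2 × 2 table with one degree of freedom and no continuity correction; the absence of a correction is our assumption, and the function name `chi_square_2x2` is hypothetical. Under that assumption the p values come out at roughly .71 for OFC and .034 for VS, close to the reported values.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].
    Returns (chi2, p) with 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p = erfc(sqrt(chi2 / 2.0))
    return chi2, p

# Rows: current study, Strait et al. (2016)
# Columns: spatially selective, not selective
chi2_ofc, p_ofc = chi_square_2x2(14, 115 - 14, 12, 113 - 12)
chi2_vs, p_vs = chi_square_2x2(19, 103 - 19, 11, 124 - 11)
print(f"OFC: chi2 = {chi2_ofc:.3f}, p = {p_ofc:.4f}")
print(f"VS:  chi2 = {chi2_vs:.3f}, p = {p_vs:.4f}")
```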

The existence of spatial information within putatively core reward regions suggests that the value signals in these regions are not independent of input- and output-related factors. It also raises the possibility that ostensible value representations are actually value-related modulations of underlying spatial signals. Equally, it raises the opposite possibility that spatial representations are spatially modified value signals. In either case, though, the intermixing of signals argues against the idea of a pure value domain, that is, a brain region in which values and only values are computed and compared. Instead, our findings suggest that spatial information is at the very least present at all stages of value comparison in the brain. Consistent with this idea, the CP correlations we observe suggest that the spatial signals in these regions play a role in influencing choices.

These ideas relate to, but do not resolve, ongoing debates about the nature of economic choice (Hayden & Moreno-Bote, 2017; Hunt & Hayden, 2017; Padoa-Schioppa & Conen, 2017; Rich, Stoll, & Rudebeck, 2017; Cisek & Pastor-Bernier, 2014; Cisek, 2012; Padoa-Schioppa & Cai, 2011; Rangel & Hare, 2010; Padoa-Schioppa & Assad, 2006). First, they indicate that selectivity in cognitive regions, including reward regions, is not fixed but is flexible, depending on the specific needs and demands of the animal at the time of the task. The functional repertoire of these areas, then, is “pluripotent.” This flexibility may be a consequence in part of their multiplexed selectivity, which allows them to adapt rapidly to task demands (Marblestone, Wayne, & Kording, 2016; Hunt et al., 2012). Second, they are consistent with the possibility that there may be no pure goods space, at least not one implemented in the form of pure reward neurons. Such a pure goods space is part of some neuroeconomic theories but does not necessarily imply a choice in an action space. It is also part of theories that make use of distributed consensus (Hunt & Hayden, 2017; Cisek, 2012; Cisek & Kalaska, 2010). One way of looking at this distinction is that some choice models are modular: There are separate evaluation, comparison, and action selection stages. Other choice models are graded: There is a gradual rotation, without discrete stages, from an input (offer) space to an orthogonal output (choice) space.

Recent findings provide some reasons why a system would represent multiple intertwined variables, rather than functioning as a single-variable feature detector. First, the high-dimensional representation offered by mixed selectivity facilitates more flexible, context-based decision-making compared with the lower-dimensional representation offered by highly specialized neurons (Blanchard, Piantadosi, & Hayden, 2018; Enel, Procyk, Quilodran, & Dominey, 2016; Fusi, Miller, & Rigotti, 2016; Raposo, Kaufman, & Churchland, 2014; Rigotti et al., 2013). Second, by maintaining a spatial “tag” throughout the decision process, actions can be selected while avoiding potential binding problems between choices and actions (Hayden & Moreno-Bote, 2017; Strait et al., 2016). Third, carrying information about spatial details of actions allows for the integration of action costs into economic choices (Kolling et al., 2016; Rudebeck, Walton, Smyth, Bannerman, & Rushworth, 2006).

A recent study by DiCarlo and colleagues provides another possible explanation for why space may be observed in the reward system (Hong, Yamins, Majaj, & DiCarlo, 2016). The authors of that study found that variables irrelevant to the task, which they called category-orthogonal, were observed in mid- and high-level form vision areas, and indeed, the strength of representation grew with level. Their network, wherein higher layers corresponded to high-level form vision areas, produced the same result as the physiological data. This result was surprising, as the information-discarding operation (max-pooling) that they used in each layer of the network did not diminish these variables. We speculate that a similar process may be occurring in our data: the preservation of choice-orthogonal variables in the hierarchy that produces actions. Consistent with our viewpoint about why multiple intertwined variables are present in the economic decision-making system, Hong and colleagues speculate that preservation of information may serve to avoid binding problems (see also Di Lollo, 2012).

Several studies report an absence of spatial information in OFC neurons (Grattan & Glimcher, 2014; Padoa-Schioppa & Cai, 2011; Kennerley & Wallis, 2009). Others show some evidence for it (McGinty et al., 2016; Strait et al., 2016; Bryden & Roesch, 2015; Luk & Wallis, 2013; Abe & Lee, 2011; Sul, Kim, Huh, Lee, & Jung, 2010; van Wingerden, Vinck, Lankelma, & Pennartz, 2010; Tsujimoto et al., 2009; Furuyashiki, Holland, & Gallagher, 2008; Feierstein, Quirk, Uchida, Sosulski, & Mainen, 2006; Roesch, Taylor, & Schoenbaum, 2006). Some studies report some form of spatial information in VS as well (Strait et al., 2016; Stott & Redish, 2014; Lansink et al., 2012; De Leonibus et al., 2009; Lansink, Goltstein, Lankelma, McNaughton, & Pennartz, 2009; Mulder et al., 2005), although some do not. Why the inconsistency across studies? One possibility is that spatial information is latent in these circuits but is only revealed when it is task relevant (Marblestone et al., 2016; Hunt et al., 2012). Consider by comparison the dorsal ACC, an area with clear premotor projections and undisputed spatial selectivity (Procyk et al., 2016; Strait et al., 2016; Luk & Wallis, 2013; Cai & Padoa-Schioppa, 2012; Hayden & Platt, 2010; Williams, Bush, Rauch, Cosgrove, & Eskandar, 2004; Matsumoto, Suzuki, & Tanaka, 2003; Shima & Tanji, 1998). Nonetheless, several studies have failed to show spatial selectivity in this region; we previously proposed that this contradiction could be explained by cross-study differences in the task relevance of space (Heilbronner & Hayden, 2016). A similar pattern likely also applies to posterior cingulate cortex (Hayden, Smith, & Platt, 2009; Hayden, Nair, McCoy, & Platt, 2008; Dean & Platt, 2006; Dean, Crowley, & Platt, 2004). Even in the DS, which is known to code spatial information robustly with dense connection to the hippocampus, its quality of spatial representation varies task dependently (Schmitzer-Torbert & Redish, 2008).

Our results suggest a subtle modification of this idea. In our task, space was specifically not relevant. It changed from trial to trial. Any learned association between spatial position and reward would decrease average reward rate, and monkeys' measured (extremely weak) spatial biases suggest that they managed to avoid such learning. We infer then that cross-trial spatial relevance is not crucial for spatial encodings. Instead, we propose that what matters is the task difficulty and the degree of attention required to perform the CSST, which had frequent rule switches, trial-by-trial changes in positions, within-target distractors, and so on. Enhanced task difficulty may enhance overall firing rates or, at least, tuning and thus be a factor that uncovers latent or weak neural signals.

Our results indicate that the spatial selectivity observed in OFC and VS may operate differently from that observed in early sensory and motor areas. Space does not appear to be a special variable that serves as scaffolding around which other aspects of neural responses are organized but rather appears to be simply one of many variables encoded in the region. This fact, in turn, suggests that spatial information is encoded like any other variable that is part of the cognitive map of task space (Wang & Hayden, 2017; Schuck, Cai, Wilson, & Niv, 2016; Wikenheiser & Schoenbaum, 2016; Wilson, Takahashi, Schoenbaum, & Niv, 2014). Such a map would contain the entire set of task-relevant information and would presumably include the spatial positions of offers and choices. Our findings support such an idea. Nonetheless, the observation of a qualitatively similar set of responses in VS and DS suggests that whatever function these signals serve may not be unique to OFC or even to the cortex. We speculate then that the cognitive map of task space may be observable through other regions in the prefrontal–striatal circuitry, including both VS and DS and, based on other studies, in the dorsal ACC as well (Ebitz & Hayden, 2016; Heilbronner & Hayden, 2016). Furthermore, our results showing subtle but measurable qualitative and quantitative differences between these regions suggest that the map is not identical but rather differs systematically by region. These differences may be driven by the anatomical connections of the region and may reflect a gradual change in responses, and thus in function, as information advances from the input to the output end (Hunt & Hayden, 2017; Cisek, 2012).

Acknowledgments

We thank Meghan Castagno, Giuliana LoConte, and Marc Mancarella for assistance in data collection and Rei Akaishi, Habiba Azab, and Maya Wang for helpful discussions. This research was supported by a grant from the National Institute on Drug Abuse (grant R01-DA-038106; to B. Y. Hayden). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Conceived and designed the experiments: B. J. S., B. Y. H. Performed the experiments: B. J. S. Analyzed the data: S. B. M. Y. Wrote the paper: S. B. M. Y., B. Y. H.

Reprint requests should be sent to Seng Bum Michael Yoo, Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY 14618, or via e-mail: sbyoo.ur.bcs@gmail.com.

REFERENCES

Abe, H., & Lee, D. (2011). Distributed coding of actual and hypothetical outcomes in the orbital and dorsolateral prefrontal cortex. Neuron, 70, 731–741.

Azab, H., & Hayden, B. Y. (2017). Correlates of decisional dynamics in the dorsal anterior cingulate cortex. PLoS Biology, 15, e2003091.

Azab, H., & Hayden, B. Y. (2018). Correlates of economic decisions in dorsal and subgenual cingulate cortices. European Journal of Neuroscience, 47, 979–993.

Blanchard, T. C., & Hayden, B. Y. (2014). Neurons in dorsal anterior cingulate cortex signal postdecisional variables in a foraging task. Journal of Neuroscience, 34, 646–655.

Blanchard, T. C., & Hayden, B. Y. (2015). Monkeys are more patient in a foraging task than in a standard intertemporal choice task. PLoS One, 10, 1–11.

Blanchard, T. C., Hayden, B. Y., & Bromberg-Martin, E. S. (2015). Orbitofrontal cortex uses distinct codes for different choice attributes in decisions motivated by curiosity. Neuron, 85, 602–614.

Blanchard, T. C., Piantadosi, S. T., & Hayden, B. Y. (2018). Robust mixture modeling reveals category-free selectivity in reward region neuronal ensembles. Journal of Neurophysiology, 119, 1305–1318.

Blanchard, T. C., Wilke, A., & Hayden, B. Y. (2014). Hot-hand bias in rhesus monkeys. Journal of Experimental Psychology: Animal Behavior Processes, 40, 280–286.

Blanchard, T. C., Wolfe, L. S., Vlaev, I., Winston, J. S., & Hayden, B. Y. (2014). Biases in preferences for sequences of outcomes in monkeys. Cognition, 130, 289–299.

Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436.

Britten, K. H., Newsome, W. T., Shadlen, M. N., Celebrini, S., & Movshon, J. A. (1996). A relationship between behavioral choice and the visual responses of neurons in macaque MT. Visual Neuroscience, 13, 87–100.

Britten, K. H., Shadlen, M. N., Newsome, W. T., & Movshon, J. A. (1992). The analysis of visual motion: A comparison of neuronal and psychophysical performance. Journal of Neuroscience, 12, 4745–4765.

Bryden, D. W., & Roesch, M. R. (2015). Executive control signals in orbitofrontal cortex during response inhibition. Journal of Neuroscience, 35, 3903–3914.

Cai, X., & Padoa-Schioppa, C. (2012). Neuronal encoding of subjective value in dorsal and ventral anterior cingulate cortex. Journal of Neuroscience, 32, 3791–3808.

Cisek
,
P.
(
2012
).
Making decisions through a distributed consensus
.
Current Opinion in Neurobiology
,
22
,
927
936
.
Cisek
,
P.
, &
Kalaska
,
J. F.
(
2010
).
Neural mechanisms for interacting with a world full of action choices
.
Annual Review of Neuroscience
,
33
,
269
298
.
Cisek
,
P.
, &
Pastor-Bernier
,
A.
(
2014
).
On the challenges and mechanisms of embodied decisions
.
Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences
,
369
,
1
14
.
De Leonibus
,
E.
,
Managò
,
F.
,
Giordani
,
F.
,
Petrosino
,
F.
,
Lopez
,
S.
,
Oliverio
,
A.
, et al
(
2009
).
Metabotropic glutamate receptors 5 blockade reverses spatial memory deficits in a mouse model of Parkinson's disease
.
Neuropsychopharmacology
,
34
,
729
738
.
Dean
,
H. L.
,
Crowley
,
J. C.
, &
Platt
,
M. L.
(
2004
).
Visual and saccade-related activity in macaque posterior cingulate cortex
.
Journal of Neurophysiology
,
92
,
3056
3068
.
Dean
,
H. L.
, &
Platt
,
M. L.
(
2006
).
Allocentric spatial referencing of neuronal activity in macaque posterior cingulate cortex
.
Journal of Neuroscience
,
26
,
1117
1127
.
Di Lollo
,
V.
(
2012
).
The feature-binding problem is an ill-posed problem
.
Trends in Cognitive Sciences
,
16
,
317
321
.
Ebitz
,
R. B.
, &
Hayden
,
B. Y.
(
2016
).
Dorsal anterior cingulate: A Rorschach test for cognitive neuroscience
.
Nature Neuroscience
,
19
,
1278
1279
.
Enel
,
P.
,
Procyk
,
E.
,
Quilodran
,
R.
, &
Dominey
,
P. F.
(
2016
).
Reservoir computing properties of neural dynamics in prefrontal cortex
.
PLoS Computational Biology
,
12
,
1
35
.
Feierstein
,
C. E.
,
Quirk
,
M. C.
,
Uchida
,
N.
,
Sosulski
,
D. L.
, &
Mainen
,
Z. F.
(
2006
).
Representation of spatial goals in rat orbitofrontal cortex
.
Neuron
,
51
,
495
507
.
Furuyashiki
,
T.
,
Holland
,
P. C.
, &
Gallagher
,
M.
(
2008
).
Rat orbitofrontal cortex separately encodes response and outcome information during performance of goal-directed behavior
.
Journal of Neuroscience
,
28
,
5127
5138
.
Fusi
,
S.
,
Miller
,
E. K.
, &
Rigotti
,
M.
(
2016
).
Why neurons mix: High dimensionality for higher cognition
.
Current Opinion in Neurobiology
,
37
,
66
74
.
Grattan
,
L. E.
, &
Glimcher
,
P. W.
(
2014
).
Absence of spatial tuning in the orbitofrontal cortex
.
PLoS One
,
9
,
e112750
.
Hayden
,
B. Y.
, &
Moreno-Bote
,
R.
(
2017
).
A neuronal theory of sequential economic choice
.
bioRxiv
,
221135
.
Hayden
,
B. Y.
,
Nair
,
A. C.
,
McCoy
,
A. N.
, &
Platt
,
M. L.
(
2008
).
Posterior cingulate cortex mediates outcome-contingent allocation of behavior
.
Neuron
,
60
,
19
25
.
Hayden
,
B. Y.
, &
Platt
,
M. L.
(
2010
).
Neurons in anterior cingulate cortex multiplex information about reward and action
.
Journal of Neuroscience
,
30
,
3339
3346
.
Hayden
,
B. Y.
,
Smith
,
D. V.
, &
Platt
,
M. L. L.
(
2009
).
Electrophysiological correlates of default-mode processing in macaque posterior cingulate cortex
.
Proceedings of the National Academy of Sciences, U.S.A.
,
106
,
5948
5953
.
Heilbronner
,
S. R.
, &
Hayden
,
B. Y.
(
2016
).
Dorsal anterior cingulate cortex: A bottom–up view
.
Annual Review of Neuroscience
,
39
,
149
170
.
Hong
,
H.
,
Yamins
,
D. L. K.
,
Majaj
,
N. J.
, &
DiCarlo
,
J. J.
(
2016
).
Explicit information for category-orthogonal object properties increases along the ventral stream
.
Nature Neuroscience
,
19
,
613
622
.
Hunt
,
L. T.
, &
Hayden
,
B. Y.
(
2017
).
A distributed, hierarchical, and recurrent framework for reward-based choice
.
Nature Review Neuroscience
,
18
,
172
182
.
Hunt
,
L. T.
,
Kolling
,
N.
,
Soltani
,
A.
,
Woolrich
,
M. W.
,
Rushworth
,
M. F. S.
, &
Behrens
,
T. E. J.
(
2012
).
Mechanisms underlying cortical activity during value-guided choice
.
Nature Neuroscience
,
15
,
470
476
.
Kennerley
,
S. W.
, &
Wallis
,
J. D.
(
2009
).
Encoding of reward and space during a working memory task in the orbitofrontal cortex and anterior cingulate sulcus
.
Journal of Neurophysiology
,
102
,
3352
3364
.
Kolling
,
N.
,
Wittmann
,
M. K.
,
Behrens
,
T. E. J.
,
Boorman
,
E. D.
,
Mars
,
R. B.
, &
Rushworth
,
M. F. S.
(
2016
).
Value, search, persistence and model updating in anterior cingulate cortex
.
Nature Neuroscience
,
19
,
1280
1285
.
Lansink
,
C. S.
,
Goltstein
,
P. M.
,
Lankelma
,
J. V.
,
McNaughton
,
B. L.
, &
Pennartz
,
C. M. A.
(
2009
).
Hippocampus leads ventral striatum in replay of place-reward information
.
PLoS Biology
,
7
,
1
11
.
Lansink
,
C. S.
,
Jackson
,
J. C.
,
Lankelma
,
J. V.
,
Ito
,
R.
,
Robbins
,
T. W.
,
Everitt
,
B. J.
, et al
(
2012
).
Reward cues in space: Commonalities and differences in neural coding by hippocampal and ventral striatal ensembles
.
Journal of Neuroscience
,
32
,
12444
12459
.
Lau
,
B.
, &
Glimcher
,
P. W.
(
2007
).
Action and outcome encoding in the primate caudate nucleus
.
Journal of Neuroscience
,
27
,
14502
14514
.
Levy
,
D. J.
, &
Glimcher
,
P. W.
(
2012
).
The root of all value: A neural common currency for choice
.
Current Opinion in Neurobiology
,
22
,
1027
1038
.
Liu
,
J.
, &
Newsome
,
W. T.
(
2005
).
Correlation between speed perception and neural activity in the middle temporal visual area
.
Journal of Neuroscience
,
25
,
711
722
.
Luk
,
C.-H.
, &
Wallis
,
J. D.
(
2013
).
Choice coding in frontal cortex during stimulus-guided or action-guided decision-making
.
Journal of Neuroscience
,
33
,
1864
1871
.
Marblestone
,
A. H.
,
Wayne
,
G.
, &
Kording
,
K. P.
(
2016
).
Towards an integration of deep learning and neuroscience
.
Frontiers in Computational Neuroscience
,
10
,
1
41
.
Matsumoto
,
K.
,
Suzuki
,
W.
, &
Tanaka
,
K.
(
2003
).
Neuronal correlates of goal-based motor selection in the prefrontal cortex
.
Science
,
301
,
229
232
.
McGinty
,
V. B.
,
Rangel
,
A.
, &
Newsome
,
W. T.
(
2016
).
Orbitofrontal cortex value signals depend on fixation location during free viewing
.
Neuron
,
90
,
1299
1311
.
Moore
,
T. L.
,
Killiany
,
R. J.
,
Herndon
,
J. G.
,
Rosene
,
D. L.
, &
Moss
,
M. B.
(
2005
).
A non-human primate test of abstraction and set shifting: An automated adaptation of the Wisconsin Card Sorting Test
.
Journal of Neuroscience Methods
,
146
,
165
173
.
Mulder
,
A. B.
,
Shibata
,
R.
,
Trullier
,
O.
, &
Wiener
,
S. I.
(
2005
).
Spatially selective reward site responses in tonically active neurons of the nucleus accumbens in behaving rats
.
Experimental Brain Research
,
163
,
32
43
.
Padoa-Schioppa
,
C.
(
2011
).
Neurobiology of economic choice: A good-based model
.
Annual Review of Neuroscience
,
34
,
333
359
.
Padoa-Schioppa
,
C.
, &
Assad
,
J. A.
(
2006
).
Neurons in the orbitofrontal cortex encode economic value
.
Nature
,
441
,
223
226
.
Padoa-Schioppa
,
C.
, &
Cai
,
X.
(
2011
).
The orbitofrontal cortex and the computation of subjective value: Consolidated concepts and new perspectives
.
Annals of the New York Academy of Sciences
,
1239
,
130
137
.
Padoa-Schioppa
,
C.
, &
Conen
,
K. E.
(
2017
).
Orbitofrontal cortex: A neural circuit for economic decisions
.
Neuron
,
96
,
736
754
.
Padoa-Schioppa
,
C.
,
Jandolo
,
L.
, &
Visalberghi
,
E.
(
2006
).
Multi-stage mental process for economic choice in capuchins
.
Cognition
,
99
,
1
13
.
Paxinos
,
G.
,
Huang
,
X. F.
, &
Toga
,
A. W.
(
2000
).
The rhesus monkey brain in stereotaxic coordinates
.
San Diego, CA
:
Academic Press
.
Pearson
,
J. M.
,
Watson
,
K. K.
, &
Platt
,
M. L.
(
2014
).
Decision making: The neuroethological turn
.
Neuron
,
82
,
950
965
.
Procyk
,
E.
,
Wilson
,
C. R. E.
,
Stoll
,
F. M.
,
Faraut
,
M. C. M.
,
Petrides
,
M.
, &
Amiez
,
C.
(
2016
).
Midcingulate motor map and feedback detection: Converging data from humans and monkeys
.
Cerebral Cortex
,
26
,
467
476
.
Rangel
,
A.
, &
Hare
,
T.
(
2010
).
Neural computations associated with goal-directed choice
.
Current Opinion in Neurobiology
,
20
,
262
270
.
Raposo
,
D.
,
Kaufman
,
M. T.
, &
Churchland
,
A. K.
(
2014
).
A category-free neural population supports evolving demands during decision-making
.
Nature Neuroscience
,
17
,
1784
1792
.
Rich
,
E. L.
,
Stoll
,
F. M.
, &
Rudebeck
,
P. H.
(
2017
).
Linking dynamic patterns of neural activity in orbitofrontal cortex with decision making
.
Current Opinion in Neurobiology
,
49
,
24
32
.
Rigotti
,
M.
,
Barak
,
O.
,
Warden
,
M. R.
,
Wang
,
X.-J.
,
Daw
,
N. D.
,
Miller
,
E. K.
, et al
(
2013
).
The importance of mixed selectivity in complex cognitive tasks
.
Nature
,
497
,
585
590
.
Roesch
,
M. R.
,
Taylor
,
A. R.
, &
Schoenbaum
,
G.
(
2006
).
Encoding of time-discounted rewards in orbitofrontal cortex is independent of value representation
.
Neuron
,
51
,
509
520
.
Rudebeck
,
P. H.
,
Walton
,
M. E.
,
Smyth
,
A. N.
,
Bannerman
,
D. M.
, &
Rushworth
,
M. F. S.
(
2006
).
Separate neural pathways process different decision costs
.
Nature Neuroscience
,
9
,
1161
1168
.
Schmitzer-Torbert
,
N. C.
, &
Redish
,
A. D.
(
2008
).
Task-dependent encoding of space and events by striatal neurons is dependent on neural subtype
.
Neuroscience
,
153
,
349
360
.
Schuck
,
N. W.
,
Cai
,
M. B.
,
Wilson
,
R. C.
, &
Niv
,
Y.
(
2016
).
Human orbitofrontal cortex represents a cognitive map of state space
.
Neuron
,
91
,
1402
1412
.
Shima
,
K.
, &
Tanji
,
J.
(
1998
).
Role for cingulate motor area cells in voluntary movement selection based on reward
.
Science
,
282
,
1335
1338
.
Siegel
,
M.
,
Buschman
,
T. J.
, &
Miller
,
E. K.
(
2015
).
Cortical information flow during flexible sensorimotor decisions
.
Science
,
348
,
1352
1355
.
Sleezer
,
B. J.
,
Castagno
,
M. D.
, &
Hayden
,
B. Y.
(
2016
).
Rule encoding in orbitofrontal cortex and striatum guides selection
.
Journal of Neuroscience
,
36
,
11223
11237
.
Sleezer
,
B. J.
, &
Hayden
,
B. Y.
(
2016
).
Differential contributions of ventral and dorsal striatum to early and late phases of cognitive set reconfiguration
.
Journal of Cognitive Neuroscience
,
26
,
1
16
.
Sleezer
,
B. J.
,
Loconte
,
G.
,
Castagno
,
M. D.
, &
Hayden
,
B. Y.
(
2017
).
Neuronal responses support a role for orbitofrontal cortex in cognitive set reconfiguration
.
European Journal of Neuroscience
,
38
,
42
49
.
Stott
,
J. J.
, &
Redish
,
A. D.
(
2014
).
A functional difference in information processing between orbitofrontal cortex and ventral striatum during decision-making behaviour
.
Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences
,
369
,
199
204
.
Strait
,
C. E.
,
Blanchard
,
T. C.
, &
Hayden
,
B. Y.
(
2014
).
Reward value comparison via mutual inhibition in ventromedial prefrontal cortex
.
Neuron
,
82
,
1357
1366
.
Strait
,
C. E.
,
Sleezer
,
B. J.
,
Blanchard
,
T. C.
,
Azab
,
H.
,
Castagno
,
M. D.
, &
Hayden
,
B. Y.
(
2016
).
Neuronal selectivity for spatial position of offers and choices in five reward regions
.
Journal of Neurophysiology
,
1
,
1098
1111
.
Strait
,
C. E.
,
Sleezer
,
B. J.
, &
Hayden
,
B. Y.
(
2015
).
Signatures of value comparison in ventral striatum neurons
.
PLoS Biology
,
13
,
1
22
.
Sul
,
J. H.
,
Kim
,
H.
,
Huh
,
N.
,
Lee
,
D.
, &
Jung
,
M. W.
(
2010
).
Distinct roles of rodent orbitofrontal and medial prefrontal cortex in decision making
.
Neuron
,
66
,
449
460
.
Tsujimoto
,
S.
,
Genovesio
,
A.
, &
Wise
,
S. P.
(
2009
).
Monkey orbitofrontal cortex encodes response choices near feedback time
.
Journal of Neuroscience
,
29
,
2569
2574
.
van Wingerden
,
M.
,
Vinck
,
M.
,
Lankelma
,
J. V.
, &
Pennartz
,
C. M. A.
(
2010
).
Learning-associated gamma-band phase-locking of action-outcome selective neurons in orbitofrontal cortex
.
Journal of Neuroscience
,
30
,
10025
10038
.
Wang
,
M. Z.
, &
Hayden
,
B. Y.
(
2017
).
Reactivation of associative structure specific outcome responses during prospective evaluation in reward-based choices
.
Nature Communications
,
8
,
15821
.
Wikenheiser
,
A. M.
, &
Schoenbaum
,
G.
(
2016
).
Over the river, through the woods: Cognitive maps in the hippocampus and orbitofrontal cortex
.
Nature Reviews Neuroscience
,
17
,
513
523
.
Williams
,
Z. M.
,
Bush
,
G.
,
Rauch
,
S. L.
,
Cosgrove
,
G. R.
, &
Eskandar
,
E. N.
(
2004
).
Human anterior neurons and the integration of monetary reward with motor responses
.
Nature Neuroscience
,
7
,
1370
1375
.
Wilson
,
R. C.
,
Takahashi
,
Y. K.
,
Schoenbaum
,
G.
, &
Niv
,
Y.
(
2014
).
Orbitofrontal cortex as a cognitive map of task space
.
Neuron
,
81
,
267
278
.