Objects that promise rewards are prioritized for visual selection. The way this prioritization shapes sensory processing in visual cortex, however, is debated. It has been suggested that rewards motivate stronger attentional focusing, resulting in a modulation of sensory selection in early visual cortex. An open question is whether those reward-driven modulations would be independent of similar modulations indexing the selection of attended features that are not associated with reward. Here, we use magnetoencephalography in human observers to investigate whether the modulations indexing global color-based selection in visual cortex are separable for target- and (monetary) reward-defining colors. To assess the underlying global color-based activity modulation, we compare the event-related magnetic field response elicited by a color probe in the unattended hemifield drawn either in the target color, the reward color, both colors, or a neutral task-irrelevant color. To test whether target and reward relevance trigger separable modulations, we manipulate attention demands on target selection while keeping reward-defining experimental parameters constant. Replicating previous observations, we find that reward and target relevance produce almost indistinguishable gain modulations in ventral extrastriate cortex contralateral to the unattended color probe. Importantly, increasing attention demands on target discrimination increases the response to the target-defining color, whereas the response to the rewarded color remains largely unchanged. These observations indicate that, although task relevance and reward influence the very same feature-selective area in extrastriate visual cortex, the associated modulations are largely independent.
Reward is a strong reinforcer and one of the most fundamental determinants of behavior that guides flexible adaptation to the environment and guarantees evolutionary fitness (Schultz, 2015). There is common agreement that the effect of reward on behavior is mediated by dopaminergic neuromodulation in the so-called reward pathway—a complex brain network including the OFC, the BG, and mesolimbic midbrain structures as core components (Schultz, 2000, 2007; McClure, York, & Montague, 2004). Rewards prioritize the processing of desired objects, which implies that rewards influence sensory selection, but how this is accomplished in the brain is not well understood. There is growing evidence showing that rewards associated with visual locations, features, or objects facilitate or impede discrimination performance depending on whether reward is, or was, linked with target or distractor properties, respectively (Chelazzi et al., 2014; Anderson, Laurent, & Yantis, 2011, 2013; Theeuwes & Belopolsky, 2012; Hickey, Chelazzi, & Theeuwes, 2010a, 2010b; Kristjansson, Sigurjonsdottir, & Driver, 2010; Navalpakkam, Koch, Rangel, & Perona, 2010; Rutherford, O'Brien, & Raymond, 2010; Della Libera & Chelazzi, 2006, 2009; Engelmann & Pessoa, 2007). Consistent with such influence on sensory selection, rewards were found to modulate even early stages of visual cortical processing (Baruni, Lau, & Salzman, 2015; Arsenault, Nelissen, Jarraya, & Vanduffel, 2013; Chubykin, Roach, Bear, & Shuler, 2013; Stanisor, van der Togt, Pennartz, & Roelfsema, 2013; Serences & Saproo, 2010; Franko, Seitz, & Vogels, 2009; Serences, 2008; Shuler & Bear, 2006), with recent research demonstrating that reward-dependent activity fluctuations in dopaminergic midbrain structures directly predict selectivity changes in object-selective visual cortex (Hickey & Peelen, 2015).
Reward-driven effects on sensory selection typically resemble those produced by visual attention in experimental settings that do not involve reward. For example, reward was shown to modulate neurophysiological indices (components of the ERP or event-related magnetic field [ERMF]) of attentional processing in humans in a wide range of tasks like cued spatial selection (Baines, Ruz, Rao, Denison, & Nobre, 2011), visual search (Donohue et al., 2016; Harris et al., 2016; Sawaki, Luck, & Raymond, 2015; Buschschulte et al., 2014; Qi, Zeng, Ding, & Li, 2013; Hickey et al., 2010a; Kiss, Driver, & Eimer, 2009), and feature-based selection (Hopf et al., 2015). Given the apparent overlap of neurophysiological signatures of sensory selection, it seems sensible to assume that reward and task relevance are mediated by the same top–down mechanism that modulates perceptual processing in visual sensory cortex.
On the other hand, there are experimental data suggesting that reward-driven biasing of visual sensory selection may operate independent of the biasing underlying the priority selection of attended items not associated with reward (see Anderson, 2013, for a recent review of relevant behavioral data). Evidence for this comes from observations showing that rewards can establish their effect in an incidental way (Seitz, Kim, & Watanabe, 2009; Pessiglione et al., 2008) by passively enhancing feature representations in visual cortex (MacLean, Diaz, & Giesbrecht, 2016; MacLean & Giesbrecht, 2015a; Chelazzi, Perlato, Santandrea, & Della Libera, 2013; Della Libera, Perlato, & Chelazzi, 2011; Kristjansson et al., 2010). Using fMRI, it was shown that fluctuating reward contingencies are coded in visual cortex independent of whether participants become aware of said contingencies (Serences, 2008). Furthermore, grating orientations associated with higher reward led to sharpened voxel-tuning functions, which could not be accounted for by global feature-based attention to the target orientation (Serences & Saproo, 2010).
Unambiguously verifying whether reward-driven modulations of sensory selection in visual cortex are independent of similar modulations seen for attention to (unrewarded) target-defining properties turns out to be more challenging than it may appear at first glance. The problem is that, in many experimental setups, the operational definitions of target and reward relevance overlap, directly or indirectly, in some form. The most typical “confound scenario” is that the attended target is simultaneously the item that gains reward upon successful selection. Overlapping effects of reward relevance and attention to the target may then be a trivial observation (Chelazzi et al., 2013; Maunsell, 2004). To avoid this form of confound, some ERP/ERMF studies explicitly changed reward contingencies independent of the experimental parameters controlling the allocation of attention to the target (MacLean & Giesbrecht, 2015b; Hickey et al., 2010a) or separated the top–down operational definition of reward and target definition (Donohue et al., 2016; Hopf et al., 2015; Buschschulte et al., 2014). But even then, very similar modulatory effects were observed in visual sensory cortex. For example, a reward-associated color presented outside the spatial focus of attention (FOA) caused a selection bias in ventral extrastriate cortex very similar (in terms of amplitude, time course, and source localization) to the global, that is, location-independent, modulation bias seen for a target-defining color (global color-based attention [GCBA]; Hopf et al., 2015). However, some observations in Hopf et al. (2015) seemed to be compatible with independent modulatory effects. Specifically, the response enhancement to the target and reward colors when presented together outside the FOA was almost exactly the sum of the response when the reward and target color were presented alone. 
This additivity of effect size was consistent with the possibility that reward modulates the feature-selective response independent of GCBA. However, this conclusion can only be tentative because neither reward size nor attention was independently varied in this experiment, and the additivity of activity modulations per se is insufficient to prove their independence.
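The additivity observation described above can be made concrete with a small sketch. It assumes that each probe condition is available as an averaged waveform sampled at the same time points; the function name and the list-based representation are illustrative choices and are not taken from the original analysis. As noted above, a residual near zero is compatible with independence but does not prove it.

```python
def additivity_residual(t, r, tr, c):
    """Per-sample residual (T&R - C) - [(T - C) + (R - C)].

    t, r, tr, c: averaged waveforms (equal-length sequences) for the
    T-, R-, T&R-, and C-probe conditions. A residual of zero means the
    combined enhancement equals the sum of the two single enhancements.
    """
    return [
        (tr_v - c_v) - ((t_v - c_v) + (r_v - c_v))
        for t_v, r_v, tr_v, c_v in zip(t, r, tr, c)
    ]


# Toy waveforms (arbitrary units) constructed to be perfectly additive.
control = [1, 1]
target = [3, 4]
reward = [2, 3]
both = [4, 6]
residual = additivity_residual(target, reward, both, control)
```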
Here, we investigate more directly whether global modulatory effects of reward and GCBA in visual cortex are dissociable. We do so by changing the attention demands on target discrimination while keeping the reward-defining factors constant. The experimental logic is as follows. If reward modulations are independent of GCBA, they should remain unaltered when manipulating attentional demands on target discrimination. Alternatively, if reward and GCBA modulate sensory processing via a common mechanism, changing demands on target discrimination should also modulate the effect of reward. To address the issue, we employ a modified version of the experimental design in Hopf et al. (2015). The stimulus setup and probe conditions are illustrated in Figure 1A. Participants are asked to fixate the center of the screen and covertly attend (dashed circle) and discriminate a color-defined target half-sphere (blue) that is combined with a distractor half-sphere drawn in a task-irrelevant color (gray) in the left visual field (VF). A task-irrelevant double-color stimulus (probe) is simultaneously presented in the right VF. The ERMF response to this unattended probe serves as a measure of GCBA to the target color by comparing the response when the probe contains the target color (target probe [T-probe]) with the response when the probe contains only neutral control colors (control probe [C-probe]). Analogously, the effect of global reward-based selection is assessed by comparing the response to probes containing the reward color (reward probe [R-probe]) with that of C-probes. Finally there are probes containing both the target- and reward-defining color (target + reward probe [T&R-probe]), which allows us to assess the combined effect of attention and reward. The operational definition of the reward color was separated from that of the target color in the following way. On 25% of the trials, the target sphere contained both the target and reward colors. 
Participants were given reward upon correct target discrimination on those trials. Importantly, no reward was delivered on all other trials, including those in which the reward color appeared in the probe. The rewarded trials were excluded from the analysis of the magnetoencephalography (MEG) responses; hence, the reward color never defined the target on the trials that were included in the data analysis. Participants were asked to report whether the border between the target and neutral/reward colors was convex or concave relative to the target color. To manipulate the amount of attention devoted to the target, we changed the difficulty of discriminating the curvature defining this border. The curvature for easy and hard discriminations was individually determined in a separate behavioral experiment immediately before the MEG experiment (see below), so that performance levels for the easy and hard conditions could be matched across participants.
Eighteen participants (mean age = 26.72 years, SD = 2.92 years, range = 22–31 years, eight women) took part in the experiment. All the participants were students of the Otto-von-Guericke University of Magdeburg, gave informed and written consent, and were paid for their participation (€6/hr). All participants were right-handed, had normal or corrected-to-normal vision, and reported normal color vision.
Stimuli and Trial Structure
Each stimulus frame contained two bicolored 3-D spheres (diameter of 1.58° of visual angle) on a gray background (luminance = 10.6 cd/m2), one placed in the left and one in the right visual hemifield 1.6° below and 2.8° lateral from fixation. The sphere in the left VF was the target sphere, of which one half-sphere was always drawn in the target color (blue in Figure 1A). The other side of this sphere (i.e., the distractor half-sphere) was drawn in a neutral color (e.g., gray) or in the reward color (rewarded trials). The border between the target and distractor half-sphere was curved, with the degree of curvature defining easy or hard discrimination trials, respectively (Figure 2A, details are provided below). The probe sphere in the right VF was task irrelevant and did not feature a curvature manipulation of the border between the two half-spheres. The colors assigned to the half-spheres were taken from a set of five colors: red, green, blue, yellow, and gray. The luminance of the colors was psychophysically matched based on heterochromatic flicker photometry with reference to red (Kaiser, 1988) in four participants. The average of the resulting luminance values for each color (red/green/blue/yellow/gray = 56.0/85.0/26.0/138.0/107.0 cd/m2) were then used for all participants.
On each block, one color was designated the target color, another was designated the reward color, and the remaining three served as control colors. Figure 1B illustrates the color assignment and block sequence of one example session. Every color served twice in a row as target or reward color on different trial blocks of the session. The level of difficulty was alternated from block to block, always starting with an easy block. There were four different orders of color-to-block assignment, which were counterbalanced across participants. On each trial, the target sphere always contained the target color, which randomly appeared in the left or right half-sphere. On 25% of the trials, the target color was combined with the reward color, and in this case, participants received a monetary reward upon correct target discrimination (rewarded trials). On the remaining 75% nonrewarded trials, the target color was randomly combined with one of the three control colors, which never appeared simultaneously in the probe. An example of all four possible color assignments to the probe (probe conditions) for no-reward trials is given in Figure 1A. The probe could contain the target and a control color (T-probe), the reward and a control color (R-probe), both the target and reward color (T&R-probe), or two control colors (C-probe). Within an experimental block, the four probe conditions were presented equally often (25%). On rewarded trials, the probe was equally likely to be a T-, R-, or C-probe. T&R-probes never appeared on rewarded trials.
Manipulation of Discrimination Difficulty
Participants had to report whether the curvature of the border between the target and the distractor half-sphere was convex or concave relative to the target side. To vary the attentional demand on target discrimination, we increased (easy) or decreased (hard) the degree of curvature separating the target color from the distractor color. As illustrated in Figure 2A, the curvature was changed by rotating the sphere in depth around the vertical axis (yellow arrow), thereby bringing more to the front the target side (convexity) or the distractor side (concavity). To match the level of difficulty among participants, each participant performed a behavioral experiment immediately before the MEG experiment, in which we determined the individual rotation angles to set the performance level (accuracy) for the easy and hard conditions. Respective rotation angles served as a starting point for the subsequent MEG experiment, during which the angles were adjusted (stepping up or down 0.5–1°) from trial block to trial block for keeping the performance levels overall constant. The average (over trial blocks) rotation angle of the hard and easy condition of each participant in the MEG experiment is plotted in Figure 2B.
The stimulus setup and trial timing of the behavioral experiment was similar to the subsequent MEG experiment, except that no reward color was designated and that only red was used as a target color. All other colors appeared equally often as distractor color. Participants pressed a button with the right index or middle finger to indicate whether the target was convex or concave, respectively. Each session contained a total of 320 trials, with different rotation angles being presented equally often and in random order. The levels of difficulty were defined in the following way: “hard”: between 65% and 85% correct responses; “easy”: between 85% and 95% correct responses.
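The block-to-block adjustment of the rotation angle can be illustrated with a minimal sketch. The exact adjustment rule is not specified in the text, so the rule below is an assumption: the angle is stepped up when accuracy falls below the target range and stepped down when it exceeds the range, on the further assumption that a larger rotation angle yields a more pronounced curvature and therefore an easier discrimination. The function name and the fixed step of 0.5° are hypothetical; only the accuracy ranges and the 0.5–1° step size come from the description above.

```python
# Target accuracy ranges as defined for the behavioral experiment.
TARGET_RANGES = {"hard": (0.65, 0.85), "easy": (0.85, 0.95)}


def adjust_rotation_angle(angle_deg, accuracy, difficulty, step_deg=0.5):
    """Return the rotation angle for the next block of the same difficulty.

    Assumption: a larger angle makes the curvature easier to discriminate.
    """
    low, high = TARGET_RANGES[difficulty]
    if accuracy < low:
        # Performance too low: make the discrimination easier.
        return angle_deg + step_deg
    if accuracy > high:
        # Performance too high: make the discrimination harder.
        return max(0.0, angle_deg - step_deg)
    # Accuracy within the target range: keep the angle unchanged.
    return angle_deg
```

For example, a hard block completed with 90% accuracy would lower the angle for the next hard block by one step, whereas the same accuracy in an easy block would leave the angle unchanged.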
Task and Procedure (MEG Experiment)
Participants were seated at a 1-m distance to a rear projection screen. While fixating the central fixation cross, participants were asked to covertly attend to the left VF stimulus and report the curvature of the color-defined target half-sphere with a two-alternative button press of the right hand (index finger = convex, middle finger = concave). Each trial block started with a 2-sec frame informing the participant about the target and reward colors, as well as about the difficulty level of the task (easy or hard). On each trial, the two spheres were displayed on a gray background (10.6 cd/m2) for 700 msec, followed by an ISI of 600–1100 msec (uniform distribution). On rewarded trials, additional feedback was given for 400 msec (a frame showing “5 cents” or “0 cents”) informing the participant whether or not a reward was gained. The next trial followed after an ISI of 600–900 msec (uniform distribution). Participants performed 10 blocks of 128 trials, each lasting approximately 6 min. Between blocks, there were pauses of 1–2 min. The participants were not informed about the block-to-block adjustments of the rotation angle to keep the performance level of the easy and hard conditions in the defined ranges. The stimulus material was generated in MATLAB (The MathWorks) using the Psychophysics Toolbox Version 3 (Kleiner, Brainard, & Pelli, 2007; Brainard, 1997; Pelli, 1997).
Monetary reward (€0.05) was delivered upon correct target discrimination on trials where the target and reward colors appeared together in the target sphere. The average total payoff that participants received as reward was approximately €13.50 (€12.0–14.45). The payoff was added to the standard payment for participation.
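As a plausibility check on the reported payoff, the following sketch derives the maximum possible reward from the design parameters stated above (10 blocks of 128 trials, 25% rewarded trials, €0.05 per correct rewarded trial). It assumes that exactly 25% of the trials in every block were rewarded and that all planned trials were completed; the variable and function names are illustrative. Amounts are kept in cents to avoid rounding issues.

```python
BLOCKS = 10
TRIALS_PER_BLOCK = 128
REWARDED_FRACTION = 0.25  # assumed to hold exactly in every block
REWARD_CENTS = 5          # 0.05 euro per correct rewarded trial

rewarded_trials = int(BLOCKS * TRIALS_PER_BLOCK * REWARDED_FRACTION)
max_payoff_cents = rewarded_trials * REWARD_CENTS


def implied_accuracy(payoff_cents):
    """Proportion of rewarded trials answered correctly for a given payoff."""
    return payoff_cents / max_payoff_cents


# Accuracy on rewarded trials implied by the reported mean payoff of 13.50 euro.
mean_accuracy = implied_accuracy(1350)
```

Under these assumptions, the maximum payoff is €16.00 (320 rewarded trials), and the mean payoff of approximately €13.50 corresponds to roughly 84% correct responses on rewarded trials.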
The magnetic and electric brain response was recorded using a 248-magnetometer whole-head BTi system (4D Neuroimaging, Magnes 3600 WH), together with a 32-channel EEG recording system (Neuroscan Compumedics). Details about the EEG recordings will not be provided, because respective data are not considered here. The MEG signal was continuously recorded and digitized at a sampling rate of 508.06 Hz. Before digitization, the signal was filtered online with a band pass of 0.01–100 Hz. The head position was monitored online with an infrared camera focused on the participant's head. When necessary, participants were instructed to readjust their position during breaks between trial blocks.
Coregistration of MEG and Anatomical Data
Coregistration was performed in each individual participant based on five marker coils placed at standardized positions in the EEG cap (Easycap Herrsching, Germany), as well as three anatomical landmarks (nasion, left and right preauricular points). Marker coils and landmarks were digitized using a Polhemus 3Space Fastrak system (Polhemus, Inc.). The position of the digitized marker coils was then brought into registration with the magnetically measured position of the marker coils. Before averaging the MEG signal across participants, the position of the sensor array of each participant relative to the landmarks was brought into reference with the sensor array of one selected participant showing the most canonical relationship between the sensor array and anatomical landmarks (reference participant). This was done by transforming the individual sensor data into source space (minimum norm least squares [MNLS] estimate) using the cortical surface of the Montreal Neurological Institute brain (ICBM-152 template) as source space compartment in each participant. The source space representation of the data was then back-projected to the sensor space representation of the reference participant via lead field inversion.
The MEG signal was epoched including a 200-msec baseline period before and a 700-msec window after stimulus onset. Epochs of the magnetic response containing recording or eye motion/blink artifacts were identified based on a peak-to-peak threshold criterion (M = 3.07 fT, range = 2.8–3.3 fT) and then rejected. The criterion resulted in an average rejection rate of 5.2% (range = 0.9–12.6%) of the trials. ERMF responses were derived from no-reward trials only by calculating averages of interest (correct responses only) according to the four probe conditions (C, T, R, T&R) and the two levels of discrimination difficulty (easy, hard). For illustration and statistical validation, average waveforms were analyzed at selected sensor sites best representing the overall probe- and contra-target response modulation (highlighted by red and blue ellipses in the field maps in Figure 3) relative to the control condition. To simplify data analysis, efflux and influx waveforms were collapsed by averaging them after reversing the polarity of the efflux response. ERMF waveforms were plotted using the ERPSS software (Event-Related Potential Software System, University of California, San Diego, La Jolla, CA). Source localization analysis was performed using the MNLS method (Fuchs, Wagner, Kohler, & Wischmann, 1999), as implemented in Curry 7 (Neuroimaging Suite, Compumedics Neuroscan USA Ltd.). Source density estimates were computed using the standard brain gray matter surface as source compartment provided with Curry 7.
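Two of the preprocessing steps described above, the peak-to-peak artifact criterion and the collapsing of efflux and influx waveforms, can be sketched as follows. This is a simplified illustration and not the ERPSS implementation: each epoch is represented as a single-sensor list of amplitudes, the threshold is passed in the same units as the data, and all function names are hypothetical.

```python
def peak_to_peak(epoch):
    """Difference between the largest and smallest amplitude in an epoch."""
    return max(epoch) - min(epoch)


def reject_artifacts(epochs, threshold):
    """Keep only epochs whose peak-to-peak amplitude does not exceed threshold."""
    return [epoch for epoch in epochs if peak_to_peak(epoch) <= threshold]


def collapse_efflux_influx(efflux, influx):
    """Average efflux and influx waveforms after reversing the efflux polarity."""
    return [(-e + i) / 2 for e, i in zip(efflux, influx)]


# Toy data in arbitrary units: the second epoch exceeds the threshold.
epochs = [[0.0, 1.0, -1.0], [0.0, 5.0, -5.0]]
clean = reject_artifacts(epochs, threshold=3.0)

# Efflux and influx responses of opposite polarity collapse into one waveform.
collapsed = collapse_efflux_influx([2.0, 4.0], [-2.0, -4.0])
```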
Statistical Data Validation
Performance accuracy and RTs (of correct responses) on no-reward trials were analyzed using 2 × 4 repeated-measures ANOVAs (rANOVAs), with factors of Difficulty (easy and hard) and Probe condition (C, T, R, T&R). RTs faster than 200 msec after stimulus onset were excluded from analysis. Whenever the overall rANOVA was significant, pairwise comparisons using paired sample t tests between probe conditions were performed separately for the easy and hard conditions. To control for the increasing Type I error under multiple comparisons, the alpha-level was adjusted using Bonferroni correction (n = 6, pcorr = .0083). To assess the effect of the reward color being present (rewarded trials) versus absent in the target (no-reward trials), separate rANOVAs with the factors Reward (reward color present, absent), Difficulty (easy and hard), and Probe condition (C, T, R) were computed. Note that, because T&R-probes never appeared on rewarded trials, the factor Probe condition had only three levels.
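The Bonferroni-corrected alpha level follows directly from the number of pairwise comparisons among the four probe conditions. The short sketch below reproduces this arithmetic; the variable names are illustrative, and an uncorrected alpha of .05 is assumed.

```python
from itertools import combinations

probe_conditions = ["C", "T", "R", "T&R"]

# All unordered pairs of probe conditions that enter the post hoc t tests.
pairs = list(combinations(probe_conditions, 2))

# Bonferroni correction: divide the nominal alpha by the number of comparisons.
alpha = 0.05
alpha_corrected = alpha / len(pairs)
```

With four probe conditions, there are six pairwise comparisons, and the corrected threshold rounds to .0083, matching the value given above.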
For statistical validation of the MEG data, rANOVAs as implemented in the ERPSS software (ranova by Luck/Henson) were computed. Waveform differences were tested against baseline using a time sample-by-sample sliding window approach. t tests were computed for mean amplitude measures in a time window of 29.5 msec (window size used in all analyses) centered on each consecutive time sample between 15.7 and 484.2 msec after stimulus onset. Computations were performed using a custom-made script that allowed for the iteration of the ranova program of ERPSS. To control for the Type I error due to multiple comparisons, only differences showing p < .05 on at least five consecutive time samples (9.8 msec) were considered statistically significant (see Guthrie & Buchwald, 1991, for a rationale of the approach). The analysis was hierarchically structured, with first computing an overall 4 × 2 rANOVA with the factors Probe condition (C, T, R, T&R) and Difficulty (easy and hard). Upon significant overall testing, post hoc pairwise comparisons between probe conditions were separately performed for the easy and hard conditions.
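The consecutive-sample criterion described above can be sketched in a few lines. The sketch assumes that a p value has already been obtained for each time sample (for example, from the iterated ranova program) and only illustrates the windowed averaging and the run-length filter; it is not the custom script used for the analysis, and the function names are hypothetical. At the sampling rate of 508.06 Hz, one sample lasts approximately 1.97 msec, so that a window of 29.5 msec would correspond to about 15 samples (half-width of 7) and the minimum run of five samples to the 9.8 msec stated above.

```python
def windowed_means(signal, half_width):
    """Mean amplitude in a window centered on each sample (edges are skipped)."""
    means = []
    for center in range(half_width, len(signal) - half_width):
        window = signal[center - half_width:center + half_width + 1]
        means.append(sum(window) / len(window))
    return means


def significant_runs(p_values, alpha=0.05, min_run=5):
    """Return (first, last) index pairs of runs with p < alpha.

    Only runs spanning at least min_run consecutive samples are kept.
    """
    runs = []
    start = None
    for index, p in enumerate(p_values):
        if p < alpha:
            if start is None:
                start = index
        else:
            if start is not None and index - start >= min_run:
                runs.append((start, index - 1))
            start = None
    # Close a run that extends to the last sample.
    if start is not None and len(p_values) - start >= min_run:
        runs.append((start, len(p_values) - 1))
    return runs


# Toy p values: a run of four samples (discarded) and a run of six (kept).
p_values = [0.2, 0.01, 0.01, 0.01, 0.01, 0.3,
            0.01, 0.01, 0.01, 0.01, 0.01, 0.01]
runs = significant_runs(p_values)
```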
Figure 2C, D shows mean accuracy (C) and RT (D) measures for the four probe and two difficulty conditions (no-reward trials only). Accuracy was approximately 80% for the hard and approximately 93% for the easy condition, which reflects the predefined target performance ranges. RTs were generally faster for the easy condition versus the hard condition. Overall, rANOVAs with the factors Difficulty (easy and hard) and Probe condition (C, T, R, T&R) yielded significant main effects of Difficulty (accuracy: F(1, 17) = 148.49, p < .001; RT: F(1, 17) = 47.90, p < .001), confirming that the experimental manipulation of discrimination difficulty was effective. There were also significant main effects of Probe condition (accuracy: F(3, 51) = 4.36, p < .01; RT: F(3, 51) = 11.04, p < .001) and a significant Probe Condition × Difficulty interaction for accuracy, F(3, 51) = 3.40, p < .05. For RT, the interaction was not significant, F(3, 51) = 0.40, p = .75. Post hoc pairwise comparisons between probe conditions revealed that the effect of difficulty is mainly due to accuracy being reduced for T&R-probes relative to R-probes (T&R vs. R: p < .001) under hard conditions, whereas there was no difference between probe conditions under easy conditions. Note that given that the accuracy level under easy conditions was fairly high, a performance decrement for T&R-probes like that seen under hard conditions may not be visible due to performance being at ceiling. Finally, analogous post hoc comparisons for RT revealed a response slowing for T&R relative to C under easy (p < .001) and hard conditions (p < .001).
The performance effect of presenting the reward color in the target sphere was assessed by a three-way rANOVA with the factors Reward (reward color present, absent), Difficulty (easy and hard), and Probe condition (C, T, R). Confirming previous observations (Hopf et al., 2015), the analysis yielded a significant main effect of Reward (accuracy: F(1, 17) = 8.44, p < .01; RT: F(1, 17) = 14.3, p < .001), reflecting the fact that participants performed with 2.7% higher accuracy and responded 34.2 msec faster on no-reward versus rewarded trials. There was also a significant main effect of Difficulty (accuracy: F(1, 17) = 138.7, p < .0001; RT: F(1, 17) = 39.6, p < .0001), reflecting the intended experimental manipulation. There was no significant two- or three-way interaction.
Figure 3 shows the magnetic response as a function of probe type (T red, R green, and T&R purple) contralateral to the probe (contra-probe response, left) and contralateral to the target (contra-target response, right) averaged over the levels of difficulty and overlaid on the control condition C (black). The location of the sensors representing the contra-probe response is highlighted by the red and blue ellipses in the left hemisphere of the topographical field map. The target-related response is measured at sensor sites highlighted by ellipses in the right hemisphere. The probe-related modulation is visible as efflux (red field lines) and influx (blue field lines) configuration over the left posterior–lateral hemisphere contralateral to the probe. To simplify data presentation and analysis, the waveforms of the efflux and influx sensors are averaged after reversing the polarity of the efflux response. In the middle of Figure 3 are 3-D maps displaying the current density distribution (MNLS estimates) on the standard brain for the response modulations shown in the field maps. All three conditions show a prominent left hemisphere modulation between approximately 180 and 330 msec, which is largest for T&R-probes and smallest for R-probes. The underlying field topography is comparable between conditions, and the underlying source density distributions reveal similar source maxima in the left ventral extrastriate cortex, confirming our previous observation that reward- and attention-related color biasing refers to the same underlying modulation in ventral extrastriate visual cortex (Hopf et al., 2015). A comparison of the contra-probe responses (left) with the contra-target responses (right) reveals a clear lateralization of the modulatory response, with response enhancements for target and reward probes appearing over the left hemisphere consistent with the probe being presented in the right VF. 
Only minimal modulations are seen contralateral to the target side, which appear as small response reductions relative to the control condition. Note that it is possible, but very unlikely, that activity modulations due to reward are lateralized to the left hemisphere independent of the VF of stimulation. This possibility cannot be ruled out with the present experimental design.
Figure 4 plots the magnetic waveforms of the probe-related (A) and contra-target response (B) as a function of probe condition separately for the easy (top row) and hard discrimination level (bottom row). For the contra-probe response, the response of T (red) and R (green) relative to C are overall comparable under easy conditions. The response increment of the T&R condition (purple) relative to C is substantially bigger and approximately twice the modulation of the T and R conditions, which is in line with our previous findings (Hopf et al., 2015). For the hard condition, in contrast, the response modulation of T-probes is larger than that of R-probes, with the former almost reaching the modulation size of T&R-probes, suggesting that increasing discrimination difficulty alters the response of T-probes more than that of R-probes. For statistical validation a 4 × 2 sliding window ANOVA (see Methods) with the factors Probe condition (T, R, T&R, C) and Difficulty (easy, hard) was computed, which revealed significant main effects of Probe condition and Difficulty (significant time ranges are indicated by black and yellow horizontal bars in Figure 4A, bottom row). Importantly, there was a significant Probe Condition × Difficulty interaction (orange horizontal bars), confirming that increasing discrimination difficulty had a differential effect on the contra-probe response elicited by the different probe conditions. In contrast to the contra-probe response, the contra-target response (B) shows only minimal variation as a function of probe condition under easy conditions, with T-, R-, and T&R-probes displaying a slightly smaller response between approximately 250 and 330 msec relative to C-probes. For the hard condition, the response pattern does not change much, except that for R-probes a more pronounced response attenuation appears between 200 and 250 msec relative to all other probe types. 
A sliding window ANOVA computed on the waveforms of the contra-target response yielded significant main effects of Probe condition (215–237 and 265–295 msec) and Difficulty (185–212 msec), but there was no interaction of those factors, indicating that overall task difficulty did not influence the contra-target response variation due to the different probe conditions.
To assess more directly whether the easy–hard manipulation affected the GCBA and reward-related modulation differently, the data shown in Figure 4A are replotted in Figure 5 by overlaying the waveforms of the easy and hard levels separately for each probe condition. As visible, T-probes display a significant response enhancement (sliding pairwise comparison, see Methods) for the hard (red solid) relative to the easy (red dashed) condition between approximately 190 and 330 msec, whereas there are only small and nonsignificant differences for the R-probes (green) and T&R-probes (purple). Importantly, there is also no difference whatsoever between the easy and hard conditions of C-probes (black), indicating that the probe response is not modulated by the overall task difficulty in an unspecific way. In summary, changing task difficulty led to a modulation of the GCBA response to the target color, although the response to the reward color remained largely unaltered.
Of note, it is theoretically possible that the unchanged response to R-probes is due to a global response attenuation when presenting the reward color in the probe under hard conditions, because here the color is irrelevant but more distracting than a control color. The fact that, under hard conditions, the contra-target response shows a smaller response for the reward relative to the target color in the probe would fit with such global attenuation (Figure 4B, bottom row). However, the respective response reduction appears between 200 and 250 msec. The contra-probe response enhancement of T-probes relative to R-probes, in contrast, is maximal beyond 250 msec (Figure 4A, bottom row), which is not compatible with an account in terms of a global attenuation of the reward color.
The source localization analysis reported above revealed that the T–C modulation (collapsed over discrimination difficulty) originates in ventral extrastriate cortex (Figure 3, top row). An important question then is whether the difficulty-dependent response enhancement seen for T-probes would be consistent with the current origin of the T–C effect. To clarify this question, we computed source density estimates of the hard-minus-easy difference of the T–C response. The results are summarized in Figure 6, which shows the T–C difference waveforms (top row) for the hard (red solid) and easy (red dashed) conditions, together with source density estimates of the double-difference T–C(hard) minus T–C(easy) (bottom row). The difficulty-related modulation, indeed, appears to arise in the left ventral extrastriate cortex between 200 and 300 msec, almost exactly in the region where the T–C effect shows its maximum. Hence, target discrimination difficulty influenced sensory selection in ventral extrastriate cortex exactly where attention to color biases activity in that region.
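The double difference entering the source analysis can be written out as a short Python sketch. The function names and the toy amplitudes are hypothetical; only the arithmetic, T–C(hard) minus T–C(easy), is taken from the text.

```python
def difference(a, b):
    """Sample-wise difference of two waveforms of equal length."""
    return [x - y for x, y in zip(a, b)]


def double_difference(t_hard, c_hard, t_easy, c_easy):
    """T-C(hard) minus T-C(easy), computed sample by sample."""
    tc_hard = difference(t_hard, c_hard)
    tc_easy = difference(t_easy, c_easy)
    return difference(tc_hard, tc_easy)


# Toy waveforms (arbitrary units, two samples each).
dd = double_difference(t_hard=[3, 5], c_hard=[1, 1],
                       t_easy=[2, 3], c_easy=[1, 1])
```

Subtracting the C-probe response within each difficulty level first removes any unspecific effect of difficulty, so that the remaining difference reflects only the difficulty-dependent change of the target-color modulation.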
In the present experiment, we replicate previous observations (Hopf et al., 2015) showing that a color associated with reward elicits a spatially global response enhancement (outside the spatial FOA) relative to a neutral irrelevant color in extrastriate visual cortex. This response enhancement is very similar to the global response enhancement elicited by an attended color that defines the target (GCBA; Bartsch et al., 2015). In Hopf et al. (2015) and in the present experiment, the definition of reward relevance was dissociated from the definition of the target by delivering reward only when the reward color appeared in the target, but not when it appeared in the task-irrelevant color probe presented in the unattended VF. That is, the reward color never defined the target on those experimental trials that served to examine the global modulatory effect of reward (R-probe trials). This was done to prevent a rewarded target on a given trial from motivating stronger attentional deployment to the reward-related feature (Maunsell, 2004). When dissociating the target definition and reward in this way, we find that the reward color causes a global response enhancement in extrastriate cortex. Importantly, we predicted that if the reward-related modulation arises independently of the GCBA modulation, increasing the discrimination difficulty of the color target should modulate the GCBA response but not the global response to the reward-associated color. This is exactly what we observe here. The global response enhancement for the reward color remains largely unaltered when increasing discrimination load, although the global reward effect and GCBA refer to a modulation of the very same color-selective region in ventral extrastriate cortex.
The observation of a simultaneous selection bias for the target- and reward-defining color aligns with recent psychophysical work suggesting that observers can instantiate multiple attentional control settings for different feature values within one feature dimension like color (Grubert, Carlisle, & Eimer, 2016; Grubert & Eimer, 2016; Irons, Folk, & Remington, 2012; Roper & Vecera, 2012; Moore & Weissman, 2010). The present data add that such multiple control settings can also reflect the operation of reward-driven and reward-independent target-defining factors.
An observation worth discussing is that, although T- and R-probes elicited prominent modulations in ventral extrastriate cortex, those probes were not associated with significant RT increments relative to C-probes in either the easy or the hard condition. RT increments to distractor items carrying a target- or reward-defining color would typically be taken to indicate that those items captured spatial attention (Anderson et al., 2011; Folk & Remington, 1998). The absence of such RT effects in this study suggests that R- and T-probes did not capture attention, which seems unexpected at first glance. However, given that the extrastriate modulations observed here reflect a global sensitivity bias operating in parallel throughout the VF, they do not prioritize color selection at a specific location. This suggests that GCBA modulations do not directly underlie attentional or value-based capture, which may rather depend on postperceptual stages of processing (Adamo, Pun, & Ferber, 2010; Parrott, Levinthal, & Franconeri, 2010).
In Hopf et al. (2015), we observed that response amplitudes of T- and R-probes were additive to match the response size of the combined presentation (T + R = T&R). This response pattern is seen again in the easy condition of the present experiment (task difficulty was even easier in Hopf et al., 2015, than in the easy condition here). Under hard conditions, however, the response to T&R-probes is smaller than the sum of the response to T- and R-probes, suggesting underadditivity. We presume that this is because the modulatory response in extrastriate cortex approaches saturation when presenting the target and reward colors in combination, which limits further enhancement due to increasing discrimination difficulty (ceiling effect). This would also explain why the response to T&R-probes was not further enhanced for the hard relative to the easy condition, although those probes contain the target color. To ultimately clarify the role of response saturation, further experiments using a parametric variation of discrimination difficulty would be necessary.
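The additivity relation (T + R = T&R) can be made explicit with a small Python sketch. It assumes, for illustration only, that each probe response is expressed as a modulation relative to the C-probe response; the function name `additivity_deficit` and the numerical values are hypothetical and are not taken from the data.

```python
def additivity_deficit(t, r, tr, c):
    """Return (T - C) + (R - C) - (T&R - C).

    Assumption: modulations are measured relative to the control (C) probe.
    A value near zero is consistent with additivity; a positive value means
    the combined T&R modulation is smaller than the sum of the single-color
    modulations (underadditivity).
    """
    return (t - c) + (r - c) - (tr - c)


# Hypothetical mean amplitudes (arbitrary units).
easy_deficit = additivity_deficit(t=3.0, r=3.0, tr=5.0, c=1.0)
hard_deficit = additivity_deficit(t=4.0, r=3.0, tr=5.0, c=1.0)
```

With these made-up numbers, the easy-condition pattern is additive and the hard-condition pattern is underadditive, mirroring the qualitative pattern described above; whether an observed deficit differs reliably from zero would require a statistical test across participants.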
The observed response pattern in the easy condition (and in Hopf et al., 2015) may not necessarily indicate additivity of the reward and GCBA modulation. It is possible that, because the target sphere always contained both the target and reward colors on reward trials, observers have built a reward template for the combined occurrence of both colors but not for the reward color alone. The response to R-probes could then reflect a partial match to the combined reward template, with the T&R-probes providing a full match, thereby accounting for the stronger response to the latter probes. With the present data, this possibility cannot be ruled out, but it would require more complex assumptions about the nature of top–down settings than assuming that participants built a separate feature template for the target and reward colors. In particular, it would imply that the same color is a defining part of one feature template when combined with a specific color (reward) while it simultaneously defines a different template on its own when combined with several other colors (target). To find out whether such arguably more complex settings are possible requires more experimental work. In any case, whether the reward response reflects a partial match with a combined color template or a full match with a single template, the observation that discrimination difficulty affects the response to the target- but not reward-probes indicates that the attentional bias for the reward-related color is specific and appears in parallel to the target-defining color.
Such a specific bias aligns with several previous observations in which reward modulations were observed in the absence of attention to reward: (1) Unrewarded distractor features previously associated with reward produce a lingering priority bias for that feature (MacLean et al., 2016; MacLean & Giesbrecht, 2015a; Della Libera et al., 2011; Hickey et al., 2010a; Kristjansson et al., 2010; Della Libera & Chelazzi, 2006, 2009). (2) Perceptual learning appears for reward-associated orientation features, which are rendered imperceptible via continuous flash suppression (Seitz et al., 2009). (3) Activity changes in visual cortex indexed by fluctuations of the BOLD response track reward history, which is not consciously accessible by the observer (Serences, 2008).
It should be noted that the present experimental design reveals brain activity modulations primarily reflecting short-latency biasing effects of reward (Hickey et al., 2010a; Kristjansson et al., 2010). The effects reported here do not represent long-lasting reward effects, which were shown to persist for days even when the reward association is no longer reinforced (MacLean & Giesbrecht, 2015a, 2015b; Della Libera & Chelazzi, 2009). This is because the color assignment to the target, reward, and control color changed every second block (to control for low-level sensory differences between colors), with the consequence that from the fifth block onward each control color had served either as a target or reward color on previous blocks.
The present observations also have a more general implication for reward coding in the brain. It has been emphasized that, unlike visual attention, which refers to top–down modulations of sensory processing in visual cortex, rewards are not linked to a dedicated sensory representation (Schultz, 2015). Rewarding properties of desired objects are assumed to be coded by phasic responses in midbrain dopaminergic regions (substantia nigra [SN], ventral tegmental area [VTA]) independently of otherwise relevant (attended) object properties (“the reward retina”; Schultz, 2009). However, for such a code to selectively highlight desired object features, it is necessary to link signals of the “reward retina” with feature representations in sensory cortex. The dopaminergic reinforcement signal is diffuse and unselective, which raises the question of how a feature-selective sensory bias is brought about. A possible solution to this credit assignment problem has been proposed by the AGREL (attention-gated reinforcement learning) model (Roelfsema, van Ooyen, & Watanabe, 2010; Roelfsema & van Ooyen, 2005), which posits that a diffuse modulatory drive of reward in visual cortex is gated by attention, with the reward-driven reinforcement signal taking effect only in attended and task-relevant sensory representations. Experimental data consistent with AGREL have been provided (Hickey & Peelen, 2017; Schiffer, Muller, Yeung, & Waszak, 2014). Activity fluctuations in dopaminergic midbrain structures were shown to predict the amount of reward-related sensory enhancement and suppression in object-selective visual cortex (Hickey & Peelen, 2015). The selective response enhancement for the reward color observed here would then reflect an independent but attention-gated sensory bias, consistent with the reward and GCBA modulations being almost indistinguishable in visual cortex (Stanisor et al., 2013).
It is also possible that top–down signals mediating reward and attention (GCBA) combine outside the visual cortex, from where selective modulatory projections target the visual cortex. Strong overlapping modulatory effects of attention and reward were documented for the caudate nucleus and the SN/VTA (Boehler et al., 2011; Engelmann, Damaraju, Padmala, & Pessoa, 2009). In the monkey, the caudate nucleus has been implicated as a locus where motivational value and visual information are linked (Kawagoe, Takikawa, & Hikosaka, 1998) and where subpopulations of neurons seem to code for color–reward associations (Lauwereyns et al., 2002). Those feature–reward associations may then be directly relayed to visual cortex areas to modulate feature-based selectivity (via the visual corticostriatal loop; Seeger, 2008). Alternatively, they could be relayed to cortical sites of attentional control in frontal and/or parietal cortex (Corbetta & Shulman, 2002), where reward and attention signals are primarily combined. Indeed, single neuron recordings in macaque pFC revealed clusters of neurons at the intersection of ventromedial, anterior cingulate, and lateral portions that seem to integrate valuation and attention signals (Kaping, Vinck, Hutchison, Everling, & Womelsdorf, 2011). The integrated signal may then directly modulate visual sensory cortex, analogous to the way the FEF gates covert spatial attention (Noudoost & Moore, 2011; Armstrong, Fitzgerald, & Moore, 2006; Moore & Armstrong, 2003) and feature-based attention (Burrows, Zirnsak, Akhlaghpour, Wang, & Moore, 2014) in visual cortex area V4.
Another candidate cortical structure where reward may integrate with attention is the parietal cortex—a key component of the frontoparietal attention network (Corbetta & Shulman, 2002; Kastner & Ungerleider, 2000). In the monkey, a core structure of this network is the lateral intraparietal cortex (LIP), which is assumed to correspond with the superior parietal cortex in humans. LIP is known to control the allocation of attention by prioritizing selection based on bottom–up cues and top–down cognitive cues like attention (Bisley & Goldberg, 2010). LIP also codes reward (Kubanek & Snyder, 2015; Louie, Grattan, & Glimcher, 2011; Sugrue, Corrado, & Newsome, 2004, 2005; Platt & Glimcher, 1999) independent of attention (Peck, Jangraw, Suzuki, Efem, & Gottlieb, 2009; Bendiksby & Platt, 2006). Importantly, LIP neurons were found to sum visual, saccade directing, and cognitive signals including reward and attention (Ipata, Gee, Bisley, & Goldberg, 2009), which may provide the basis for a common output that modulates feature-selective responses in visual cortex areas.
The present findings replicate previous observations that a reward-associated but task-irrelevant color causes a spatially global gain modulation in ventral extrastriate visual cortex. The modulation is almost indistinguishable, in terms of time course, polarity, and cortical current origin, from the modulation entailed by GCBA, suggesting that sensitivity changes due to GCBA and reward result from a response modulation in the same feature-selective visual cortex area. Despite the involvement of the same cortex area, a manipulation of attention demands on target discrimination modulates the global response to the attended color but leaves the response to the reward color unchanged, suggesting that reward can establish a feature-selective sensory bias that dissociates from task-defined feature-based attention.
This work was supported by Deutsche Forschungsgemeinschaft SFB779/TPA1.
Reprint requests should be sent to Jens-Max Hopf, Leibniz-Institute for Neurobiology, Brenneckestrasse 6, D-39118 Magdeburg, Germany, or via e-mail: firstname.lastname@example.org.