Long-term spatial contextual memories are a rich source of predictions about the likely locations of relevant objects in the environment and should enable tuning of neural processing of unfolding events to optimize perception and action. Of particular importance is whether and how the reward outcome of past events can impact perception. We combined behavioral measures with recordings of brain activity with high temporal resolution to test whether the previous reward outcome associated with a memory could modulate the impact of memory-based biases on perception, and if so, the level(s) at which visual neural processing is biased by reward-associated memory-guided attention. Data showed that past rewards potentiate the effects of spatial memories upon the discrimination of target objects embedded within complex scenes starting from early perceptual stages. We show that a single reward outcome of learning impacts on how we perceive events in our complex environments.
Recent empirical evidence has substantiated the long-held notion that past experiences, stored as long-term memories (LTMs), can be used proactively to optimize perception within familiar contexts. The ability of spatial contextual LTMs to drive attention and enhance detection of relevant objects has been demonstrated using both arbitrary spatial stimulus arrangements (Chun & Jiang, 1998) and naturalistic scenes (Patai, Doallo, & Nobre, 2012; Summerfield, Rao, Garside, & Nobre, 2011; Becker & Rasmussen, 2008). Orienting attention from LTM engages activity in the parietal-frontal network for visual-spatial orienting as well as brain regions implicated in retrieval of object locations within specific contexts (e.g., hippocampus; Stokes, Atherton, Patai, & Nobre, 2012; Summerfield, Lepsien, Gitelman, Mesulam, & Nobre, 2006).
In real-world situations, however, both memory and attention are strongly modulated by motivational factors. Remembering the rewarding outcomes of past experiences and generating future expectations accordingly is essential to guide adaptive behavior. Reward values have been proposed to influence future choices and actions (Shohamy & Adcock, 2010; Serences, 2008) and have increasingly been suggested to influence attentional and perceptual processes (Anderson, Laurent, & Yantis, 2011; Padmala & Pessoa, 2011; Sänger & Wascher, 2011; Hickey, Chelazzi, & Theeuwes, 2010; Kristjànsson, Sigurjónsdóttir, & Driver, 2010; Navalpakkam, Koch, Rangel, & Perona, 2010; Pessoa & Engelmann, 2010; Della Libera & Chelazzi, 2006, 2009; Kiss, Driver, & Eimer, 2009; Raymond & O'Brien, 2009; Krawczyk, Gazzaley, & D'Esposito, 2007). Convergent evidence from electrophysiological recordings in rodents and monkeys (Bethus, Tse, & Morris, 2010; Rossato, Bevilaqua, Izquierdo, Medina, & Cammarota, 2009; Singer & Frank, 2009; Wirth et al., 2009; O'Carroll, Martin, Sandin, Frenguelli, & Morris, 2006; Lisman & Grace, 2005; Rolls & Xiang, 2005; Jay, 2003; Tabuchi, Mulder, & Wiener, 2003) as well as from fMRI (Kuhl, Shah, DuBrow, & Wagner, 2010; Adcock, Thangavel, Whitfield-Gabrieli, Knutson, & Gabrieli, 2006; Wittmann et al., 2005) and intracranial recordings (Vanni-Mercier, Mauguière, Isnard, & Dreher, 2009) in humans has revealed the role of reward in modulating hippocampus-dependent memories. In addition, reward has also been shown to influence attention by enhancing neural processing within the spatial orienting network (Small et al., 2005). However, the crucial question left unanswered is whether reward values associated with past memories can modulate memory-based expectations to enhance attention and optimize how we perceive events in our complex environments.
Here, we developed a sensitive perceptual-judgment memory-cueing task to examine whether the past reward outcome of memories can modulate spatial expectations from LTM to enhance fine-grained perceptual discriminations of relevant objects embedded within natural scenes. We also capitalized on the high temporal resolution of ERPs to probe the level(s) at which LTMs and their reward associations can bias neural processing. We used a modified version of the experimental approach developed by Summerfield et al. (2006). Participants first performed a learning task during which they learned the spatial location of a predefined target (a small key) embedded within naturalistic visual scenes. Reward associations were manipulated by giving rewards during the last block of the learning task on a proportion of trials. Twenty-four hours later, they completed an LTM-cued covert orienting task. Participants discriminated the presence or absence of target key stimuli within the memorized scenes while ERPs were recorded. The initial presentation of the scene (without the target) served as the attentional cue in each trial. Scenes that contained a target during the learning task constituted valid cues that predicted where the upcoming target would appear within the scene. These scenes could either be associated with a specific target location that had been rewarded (rewarded-valid cues) or non-rewarded (non-rewarded-valid cues). Scenes without a target during learning constituted neutral cues that did not provide any predictive information about the target location (neutral cues, Experiment 1). We tested whether and how positive reward associations enhanced the behavioral benefits and neural effects of LTM-based attention to learned target locations.
Importantly, by providing reward only at the final episode of learning, after the target had been identified, it was possible to analyze the effects of a single exposure to a rewarding outcome without any change to the learning process itself. Because participants had no foreknowledge about which scenes would be rewarded, there was no possibility of preferentially memorizing target locations within those scenes. Because reward associations were delivered in a blocked fashion in Experiment 1, a follow-up experiment (Experiment 2) was conducted to ensure that the benefits conferred by one single reward association to the attentional orienting effects could also be obtained when cue associations were intermixed on a trial-by-trial basis at the end of learning.
We hypothesized that spatial predictions from LTM would bias visual search to improve accuracy and RTs to discriminate the presence versus absence of a target in the scene. Target-locked ERPs were recorded during the orienting task to investigate the stages of neural modulation influenced by reward-associated spatial contextual memories during memory-guided visual search. We focused our analysis on modulations of well-established ERP markers of early visual processing (P1 and N1 potentials) and target selection in visual search (the N2pc).
Experiment 1: Reward Potentiation of Memory-based Spatial Orienting
Eighteen healthy students from the University of Oxford participated in this study for monetary compensation. Data from four participants were discarded from analysis because of excessive oculomotor artifacts. The remaining 14 participants (eight women) had a mean age of 23.5 years (range, 19–32 years). All were right-handed and had normal or corrected-to-normal vision. The protocols were approved by the University of Oxford Central University Research Ethics Committee.
There were three phases to the experiment. Participants first performed a learning task, completed over two sessions on consecutive days, followed on the third day by a memory-cued orienting task and a spatial memory recall task (see Figure 1).
Two hundred twenty-eight digital images of scenes were obtained from lab members. A set of 12 scenes was used for familiarization and practice trials. An additional 216 scenes were used in the experimental trials. Matlab (Mathworks, Natick, MA) was used to prepare the stimuli. Each scene was prepared in two different formats, used for the learning and orienting tasks. For counterbalancing purposes, five learning task versions were prepared for each scene: one with the key (15 × 29 pixels, equivalent to 0.3° × 0.7°) placed in each of the four visual quadrants and one with the key absent. The assignment of scenes to different experimental conditions, key presence or absence, and key location were counterbalanced across participants. For the orienting task, keys were replaced by a larger and brighter version (25 × 49 pixels; 0.6° × 1.1°) to make the key visible within the briefly displayed target scene. Scene stimuli were presented using Presentation software (Neurobehavioral Systems, Albany, CA) and subtended 22° × 17° of visual angle at a viewing distance of 100 cm.
Participants viewed each of the 216 scenes, repeated in random order, over six blocks (Figures 1 and 2). A small gold key target was present in 144 scenes (36 per quadrant) and absent in 72 scenes. Participants explored the scenes overtly to search for the key. Once located, they activated the mouse cursor with a left-side mouse click and indicated the location of the key by positioning the cursor on the location of the key and making a second left-sided mouse click. If participants made no response, the computer automatically moved onto the next scene after a variable search time. Allowable search times decreased over blocks: 16–24 sec in Block 1, 12–20 sec in Blocks 2 and 3, 10–18 sec in Blocks 4 and 5, and 8–16 sec in Block 6. Exposure times for scenes with and without keys were equated through an automated algorithm, which randomly drew the maximum presentation time for scenes without keys from the last five exposure durations in scenes containing keys. Participants were asked to find as many keys as possible and to memorize their locations. Participants received visual written feedback when they correctly identified the location of the key.
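The exposure-equating rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the class and method names are hypothetical, and the fallback used before any key-present scene has been recorded is an assumption, because the text does not say how that case was handled.

```python
import random
from collections import deque

# Block 1 search-time window in seconds (from the text). It is used here only
# as an ASSUMED fallback before any key-present exposure has been recorded.
BLOCK1_WINDOW = (16.0, 24.0)


class ExposureEquator:
    """Tracks the last five key-present exposure durations (hypothetical helper)."""

    def __init__(self, history_size=5, rng=None):
        # A bounded deque drops the oldest duration automatically.
        self.recent = deque(maxlen=history_size)
        self.rng = rng if rng is not None else random.Random()

    def record_key_present(self, exposure_sec):
        """Store how long a key-present scene stayed on screen."""
        self.recent.append(exposure_sec)

    def max_time_key_absent(self):
        """Draw the maximum presentation time for a key-absent scene."""
        if not self.recent:
            # Assumption: no history yet, so draw from the block window.
            return self.rng.uniform(*BLOCK1_WINDOW)
        return self.rng.choice(list(self.recent))


equator = ExposureEquator(rng=random.Random(0))
for t in [6.2, 4.8, 9.1, 3.3, 5.0, 7.4]:
    equator.record_key_present(t)
drawn = equator.max_time_key_absent()  # one of the five most recent durations
```

Drawing from recent key-present durations, rather than from a fixed distribution, lets the exposure of key-absent scenes shrink as search becomes faster across blocks.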
After the six blocks, participants performed an additional reward manipulation, in which the same 216 scenes were divided equally between two further blocks (Figure 1). In one of these blocks (“reward block”), participants were rewarded £0.40 for each key they found but lost £0.20 for each key they were unable to find. No reward was given for scenes in which there was no key present. Visual feedback after each scene indicated monetary gains or losses. In the other block (“non-reward block”), they were asked to find as many keys as possible, but no monetary reward was given. Again, they received visual written feedback after each scene as a function of their performance. To ensure that only well-learned key locations were rewarded, the maximum search time was 5 sec in both the reward and the non-reward block. The order of reward and non-reward blocks was counterbalanced. Subjects started with £20 for their participation in the study and could increase that amount up to a maximum of £48, depending on their performance in the reward block.
Eye movements were recorded using an infrared eye-tracking system (ISCAN) and visualized using ILAB (Gitelman, 2002).
Memory-cued orienting task
Participants returned one day after completing the learning task to perform a memory-cued orienting task while the EEG was recorded (Figure 1). Participants viewed previously studied scenes for a brief exposure and made forced-choice responses, indicating whether a bright gold key was embedded within the scene.
Participants completed 216 trials. Each trial began with the brief presentation (100 msec) of a previously studied scene, which contained no key and which acted as an attentional cue (cue scene). After a randomized ISI of 750–1150 msec, the scene reappeared briefly (200 msec) as the target scene, and participants had to discriminate whether it contained an embedded target. On two thirds of the trials (144 trials), the location of the key in the learning task, which had been either rewarded (“Rewarded-valid” trials; 72 scenes: 48 “target-present,” 24 “target-absent”) or non-rewarded (“Non-rewarded-valid” trials; 72 scenes: 48 “target-present,” 24 “target-absent”), predicted with 100% validity the location where the target would appear. On the remaining one third, no key had been present in the learning task, and therefore, participants had no spatial predictive information about the location at which the target would be presented (“Neutral” trials; 72 scenes: 48 “target-present,” 24 “target-absent”). Subjects had a 1000-msec response window after the target scene disappeared. The intertrial interval varied randomly between 2000 and 3000 msec. Trials were randomly intermixed throughout the task. The task was performed covertly, and eye movements were monitored using an infrared eye-tracking system (ISCAN).
Participants performed a short practice session (12 trials) before the orienting task to ensure comprehension of and become familiar with the task.
Spatial memory recall task
Immediately following the orienting task, participants performed a task measuring explicit memory for the location of the key within each scene. They viewed the same scenes presented in the learning task without any key present. They used the mouse to click on the remembered location from the learning task. If they had no memory, they clicked on the center of the screen. Participants also rated their response confidence after each scene on a 3-point scale by clicking one of the three mouse buttons (1 = not at all confident; 2 = fairly confident; 3 = very confident).
Behavioral Statistical Analysis
Performance in the learning task was analyzed by calculating the mean percentage of keys found in each block and the mean search time taken to locate the keys for each block. To test for the progressive learning of the key locations, accuracy and search time measures were analyzed by linear contrasts over the six blocks using repeated-measures ANOVAs. The reward-manipulation blocks were introduced after learning had reached its asymptotic, optimal value. To test whether any further learning occurred in the reward blocks, performance measures in the reward blocks were compared with those in the immediately preceding learning block using an ANOVA. In addition, we also carried out a separate ANOVA comparing performance in the reward-manipulation blocks to rule out any global differences in learning between reward and non-reward blocks that could potentially confound the interpretation of subsequent performance and neural measures on the orienting task.
The benefits on performance conferred by the reward associations of memory cues were analyzed by submitting measures of RTs to targets and accuracy (i.e., percentage of correct “target-present/target-absent” discriminations) to ANOVAs testing for linear effects across Condition (rewarded-valid, non-rewarded-valid, neutral) and Response (present, absent).
The analysis of the orienting task used only scenes in which participants had successfully located the target key by the final block of the learning task (6.4% of the trials were excluded). For RT analysis, only correct trials were used (12% of the total trials were excluded). Trials were also excluded if RTs exceeded ±3 standard deviations (SD; 0.50% of the total trials were excluded).
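The trial-selection steps for the RT analysis can be illustrated with a short sketch. The dictionary keys and function name are hypothetical, and the code assumes that the ±3 SD criterion is applied around the mean of the correct trials passed in (for example, one participant and condition); the text does not specify the reference set.

```python
from statistics import mean, stdev


def select_rt_trials(trials, n_sd=3.0):
    """Return RTs from correct trials lying within n_sd SDs of their mean.

    `trials` is a list of dicts with hypothetical keys 'rt' (msec) and
    'correct' (bool). Assumption: mean and SD are computed over the
    correct trials supplied to this call.
    """
    rts = [t["rt"] for t in trials if t["correct"]]
    if len(rts) < 2:
        return rts  # SD is undefined, so nothing can be trimmed
    m, sd = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= n_sd * sd]


# Invented example data: 19 ordinary RTs, one extreme RT, one error trial.
trials = [{"rt": 600 + 5 * i, "correct": True} for i in range(19)]
trials.append({"rt": 3000, "correct": True})
trials.append({"rt": 650, "correct": False})
kept = select_rt_trials(trials)
```

In this invented example the error trial is dropped first, and the 3000-msec response is then removed because it lies more than 3 SD from the mean of the remaining trials.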
Spatial memory recall task
The distance between the correct coordinate of the key location and the recalled location was computed, using only scenes for which the participants had correctly located the key in the learning task. To minimize the influence of viewing the location of the keys during the orienting task on the explicit LTM recall, only scenes from “target-absent” trials were analyzed.
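A minimal sketch of the recall-error measure follows. It assumes a Euclidean distance in screen pixels, which the text implies but does not state, and it uses the 150-pixel radius reported in the Results as the criterion for a correct recall. The function names and the inclusive boundary are illustrative choices.

```python
import math


def recall_error(recalled_xy, key_xy):
    """Euclidean distance in pixels between recalled and true key location."""
    return math.dist(recalled_xy, key_xy)


def is_correct_recall(recalled_xy, key_xy, radius_px=150):
    """True if the click falls within the criterion radius (assumed inclusive)."""
    return recall_error(recalled_xy, key_xy) <= radius_px


# Invented coordinates: a click 30 px right of and 40 px below the key.
error_px = recall_error((430, 340), (400, 300))
within_criterion = is_correct_recall((430, 340), (400, 300))
```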
ERP Recording and Data Processing
The EEG was recorded continuously from 40 Ag/AgCl electrodes mounted on an elastic cap, positioned according to the 10–20 international system (AEEGS, 1991). Recording was referenced to the right mastoid and re-referenced off-line to averaged mastoids. The horizontal EOG was recorded bipolarly with electrodes around right eye (outer canthus and inner bridge of the nose). The vertical EOG was recorded bipolarly using FP2 and an electrode placed below the right eye. The signal was digitized at 1000 Hz and low-pass filtered at 200 Hz. Data were further low-pass filtered off-line at 40 Hz.
The continuous EEG was segmented into epochs starting 1050 msec before and ending 600 msec after the target scene presentation. The prestimulus interval spanned the maximum ISI to enable removal of trials with anticipatory saccades. Epochs were normalized using a baseline of 50 msec before and after stimulus presentation. Epochs containing blinks or large saccades (horizontal EOG and vertical EOG exceeding ±50 μV), excessive noise or drift (a voltage exceeding ±100 μV at any electrode) were automatically excluded. Epochs were subsequently visually inspected for smaller saccades, blinks, and drifts and discarded if necessary. Finally, trials with incorrect responses or corresponding to scenes where participants failed to locate the key by the final block of the learning task were also excluded from all the further analysis. The minimum number of artifact-free trials per subject per condition was set at 20.
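The automatic part of this procedure can be sketched as below. The thresholds and the baseline window come from the text; the channel labels 'HEOG' and 'VEOG', the function names, and the inclusive window boundaries are assumptions made for illustration. Visual inspection and the behavioral exclusions are not modeled.

```python
EOG_LIMIT_UV = 50.0    # blink / large-saccade threshold from the text
EEG_LIMIT_UV = 100.0   # noise / drift threshold from the text


def baseline_correct(samples, times_ms, window=(-50, 50)):
    """Subtract the mean voltage within the baseline window from every sample."""
    base = [v for v, t in zip(samples, times_ms) if window[0] <= t <= window[1]]
    offset = sum(base) / len(base)
    return [v - offset for v in samples]


def reject_epoch(epoch):
    """Apply the automatic amplitude criteria to one epoch.

    `epoch` maps channel names to lists of voltages in microvolts. The keys
    'HEOG' and 'VEOG' are hypothetical labels for the bipolar EOG channels.
    """
    for name, samples in epoch.items():
        limit = EOG_LIMIT_UV if name in ("HEOG", "VEOG") else EEG_LIMIT_UV
        if max(abs(v) for v in samples) > limit:
            return True
    return False


# Invented two-sample epochs: the second contains a blink-sized VEOG deflection.
clean = {"HEOG": [3.0, -12.0], "VEOG": [8.0, 20.0], "PO7": [40.0, -75.0]}
blink = {"HEOG": [3.0, -12.0], "VEOG": [8.0, 120.0], "PO7": [40.0, -75.0]}
```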
Epochs in “target-present” trials were averaged separately according to the main conditions of interest and target side. ERPs from targets located on the right and on the left side of scenes were combined by a procedure preserving the relationship between the side of electrode location and the side of target (contralateral and ipsilateral).
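The recombination by hemisphere can be illustrated as follows. This sketch uses the three electrode pairs analyzed later in the text; the function name and data layout are hypothetical.

```python
# Left/right electrode pairs analyzed in the text.
PAIRS = [("O1", "O2"), ("PO3", "PO4"), ("PO7", "PO8")]


def to_contra_ipsi(amplitudes, target_side):
    """Relabel electrode values by hemisphere relative to the target side.

    `amplitudes` maps electrode names to a value (or waveform) and
    `target_side` is 'left' or 'right'. Odd-numbered sites lie over the
    left hemisphere, so they are contralateral to right-side targets.
    """
    if target_side not in ("left", "right"):
        raise ValueError("target_side must be 'left' or 'right'")
    out = {}
    for left_site, right_site in PAIRS:
        pair = f"{left_site}/{right_site[-1]}"  # e.g. 'O1/2'
        contra = right_site if target_side == "left" else left_site
        ipsi = left_site if target_side == "left" else right_site
        out[pair] = {"contra": amplitudes[contra], "ipsi": amplitudes[ipsi]}
    return out


# Invented amplitudes (microvolts) for one left-target average.
trial = {"O1": 1.0, "O2": 2.0, "PO3": 0.5, "PO4": 1.5, "PO7": -1.0, "PO8": -3.0}
left_target = to_contra_ipsi(trial, "left")
```

After this relabeling, left-target and right-target averages share the same contralateral and ipsilateral keys and can be averaged together.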
ERP Statistical Analysis
Spatio-temporal windows for ERP analyses were set on the basis of (i) the peak latency and distribution of potentials of interest in the grand-averaged waveforms and (ii) the results from a spatio-temporal pattern analysis carried out using CARTOOL software (developed by D. Brunet, brainmapping.unige.ch/Cartool.htm). The segmentation procedure compares the topographical distribution of grand-averaged ERPs over time across the experimental conditions and identifies periods of stability in the topographical maps of the ERPs (Pascual-Marqui, Michel, & Lehmann, 1995). We compared topographies in valid and neutral trials, between 0 and 600 msec, with the constraints that topographies should last at least 50 msec and be less than 90% correlated. A spatio-temporal clustering algorithm was used, by which the number of clusters initially set progressively diminishes by iteratively removing the clusters with the lowest global explained variance and assigning their maps to the surviving clusters with which they have the highest spatial correlation (Atomize & Agglomerate Hierarchical Clustering; Murray, Brunet, & Michel, 2008). The optimal number of maps for explaining the entire data set was defined by a cross-validation criterion (Pascual-Marqui et al., 1995), derived by dividing the global explained variance by the degrees of freedom determined by the number of electrodes (Brunet, Murray, & Michel, 2011). Its absolute minimum gives the optimal number of segments.
Mean amplitudes of visual potentials P1 and N1 were measured at lateral posterior electrodes (O1/2, PO3/4, PO7/8) contralateral and ipsilateral to the target location during the time windows of 100–140 and 150–180 msec, respectively. Peak latency analyses for P1 and N1 were also conducted at these electrode sites in the ranges of 80–150 and 100–200 msec. The mean amplitude of the N2pc component was measured at PO7/8, PO3/4, and O1/2 electrodes contralateral and ipsilateral to the side of the target between 240 and 280 msec.
Linear differences in mean amplitudes and/or peak latencies of potentials were analyzed by repeated-measures ANOVAs with the within-subject factors: Condition (rewarded-valid, non-rewarded-valid, neutral), Hemisphere (contralateral, ipsilateral), and Electrode Location (O1/2, PO3/4, PO7/8). The Greenhouse–Geisser correction for nonsphericity was applied when necessary.
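For readers unfamiliar with linear contrasts in a repeated-measures design, the sketch below shows one standard way to obtain an F(1, n − 1) statistic for a linear trend across the three ordered conditions. It is not necessarily the procedure or software the authors used; the weights (−1, 0, 1), the function name, and the example values are illustrative assumptions, and the data are invented.

```python
from statistics import mean, stdev

# Contrast weights for the ordered levels:
# rewarded-valid, non-rewarded-valid, neutral.
WEIGHTS = (-1, 0, 1)


def linear_contrast_F(scores):
    """Return (F, (df1, df2)) for a linear trend across three conditions.

    `scores` holds one (rewarded_valid, non_rewarded_valid, neutral) tuple
    per participant. Each participant's contrast score is tested against
    zero; the squared one-sample t statistic equals the contrast F.
    """
    contrast = [sum(w * x for w, x in zip(WEIGHTS, row)) for row in scores]
    n = len(contrast)
    t = mean(contrast) / (stdev(contrast) / n ** 0.5)
    return t ** 2, (1, n - 1)


# Invented RTs (msec) for four hypothetical participants.
example = [(600, 610, 630), (550, 548, 570), (700, 705, 712), (640, 650, 655)]
F, df = linear_contrast_F(example)
```

With 14 participants, as in Experiment 1, this approach yields the F(1, 13) degrees of freedom reported for the linear contrasts.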
Experiment 2: Randomized Presentation of Reward Association
In Experiment 1, reward associations related to identification of the target location were delivered in a blocked fashion. To rule out the possibility that the reward potentiation of memory-based spatial orienting depended on any variable linked to the blocking of reward associations, a follow-up experiment was completed. The methods were equivalent to those in Experiment 1, except that the reward associations were delivered in an intermixed, trial-by-trial fashion in the final block of the learning phase.
Eight right-handed healthy students were recruited from the University of Oxford. They were four women and four men, had a mean age of 24.5 years (range = 21–31 years), and had normal or corrected-to-normal vision.
A total of 96 scenes were used in this experiment, and an additional 12 scenes were used for practice trials.
Participants viewed the 96 scenes repeated in random order over five blocks. Because the main objective of this experiment was to replicate, using a nonblocked design, the effect of one reward association on memory-guided attention, all the scenes contained a key (24 in each quadrant) in the learning task (i.e., a neutral condition was not included in this experiment). In a sixth, final learning block, one half of the scenes was followed by a monetary reward; the other half was followed by no reward. Rewarded and nonrewarded trials were intermixed in a fully randomized and unpredictable fashion. On rewarded scenes, participants gained £0.50 for each key they found but lost £0.50 for each key they were unable to find. Task requirements and response procedures were equivalent to those described in the previous experiment, except that the scene remained onscreen during the presentation of feedback.
Twenty-four hours later, participants completed the orienting task, which was the same as the first experiment, except that it contained only two trial types (“Rewarded-valid” trials; 48 scenes: 24 “target-present,” 24 “target-absent”; “Non-rewarded-valid” trials; 48 scenes: 24 “target-present,” 24 “target-absent”).
Behavioral Statistical Analysis
As for Experiment 1, only trials in which participants had correctly learned the key locations were subsequently analyzed. RTs and accuracy measures were analyzed by a 2 (Condition: rewarded-valid, non-rewarded-valid) × 2 (Response: target-present, target-absent) ANOVA. The equal probabilities of target-present relative to target-absent trials in this experiment enabled us to measure d′, an index of perceptual sensitivity that relates the rate of hits to the rate of false alarms within each condition [d′ = z(hit) − z(f.a.)]. d′ was compared between rewarded and non-rewarded conditions using a paired t test.
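The d′ formula can be computed directly with the inverse of the standard normal cumulative distribution. The sketch below is a minimal illustration: the function name and the example counts are invented, and no correction is applied for hit or false-alarm rates of exactly 0 or 1 (for which z is undefined), since the text does not describe one.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """d' = z(hit) - z(f.a.), with z the inverse standard-normal CDF.

    Rates of exactly 0 or 1 raise statistics.StatisticsError; any
    correction for such cases would be an additional assumption.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Invented counts: 22 hits on 24 target-present trials and
# 4 false alarms on 24 target-absent trials.
sensitivity = d_prime(22 / 24, 4 / 24)
```

Equal hit and false-alarm rates give d′ = 0, that is, no sensitivity to target presence.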
Formation of Robust LTMs for Target Locations within Natural Scenes
Over the course of the learning blocks, participants located an increasing number of targets, with increasing speed (Block 1: mean accuracy ± SEM = 84 ± 2.2%, mean search times ± SEM = 4.9 ± 0.19 sec; Block 6: 93 ± 1.2%, 1.1 ± 0.12 sec; Figure 2). Repeated-measures ANOVAs testing for linear decreases in search times over the learning blocks revealed a significant linear contrast, F(1, 13) = 289.33, p < .001. Similarly, a significant linear increase in accuracy over the learning blocks was revealed by a significant linear contrast, F(1, 13) = 30.87, p < .001. Comparison of performance during the final blocks with the reward manipulation and the immediately preceding learning block confirmed that no significant effects of learning were observed during the reward blocks (RT: F(2, 26) = 0.132, p = .877; accuracy: F(1.332, 17.320) = 0.358, p = .618, ɛ = 0.666). As planned, the reward manipulation occurred after an asymptotic learning level had been achieved. Importantly, there was also no difference in performance between the reward block and the non-reward block. Search times, F(1, 13) = 0.12, p = .737, and accuracy, F(1, 13) = 0.26, p = .621, were well equated, thus ruling out any general effects of reward availability on learning performance.
A recall test performed immediately after the orienting task confirmed that subjects retained strong memories of the key locations on the day after the learning task. We used a stringent criterion to test for successful recollection of the key locations—positioning a mouse cursor within a radius of 150 pixels from the target location, equivalent to 3.4°. Participants correctly identified the learned locations of targets on 82% of scenes (±5% SEM). In addition, subjects' response confidence ratings covaried with their accuracy. A linear effect in an ANOVA comparing the mean distance between the placed cursor and the original key in pixels across the confidence ratings showed that the distance decreased systematically as confidence ratings increased (from 1 = not confident to 3 = very confident). In addition, recall was stronger for rewarded-key locations (overall distance in pixels: 45 ± 2) relative to non-rewarded key locations (56 ± 4; see Figure 2). An ANOVA testing the effect of Reward (2 levels) and Rating (3 levels) revealed a linear effect of Rating, F(1, 13) = 30.33, p < .001, and a main effect of Reward, F(1, 13) = 5.74, p = .032, with no interaction between the factors.
Reward Potentiates the Behavioral Benefits of Memory-guided Visual Search
The results confirmed the participants' ability to identify targets embedded within natural scenes and to benefit from LTM-based spatial cues (see Table 1). Importantly, one single reward association potentiated the behavioral benefits of memory-based orienting on target identification within a scene. There was a significant linear effect of condition on RT (mean msec ± SEM; rewarded-valid: 619.6 ± 34.4, non-rewarded-valid: 624.2 ± 34.6, neutral: 643.3 ± 34.9; F(1, 13) = 8.033, p = .014). Target discrimination was faster in target-present trials (present: 573.4 ± 34.3, absent: 684.6 ± 35.7; F(1, 13) = 66.77, p < .001). Moreover, the effects of orienting based on rewarded memories on target discrimination were only observed for target-present trials (Linear Interaction Condition × Response: F(1, 13) = 10.64, p = .006; target-present: F(1, 13) = 29.46, p < .001; target-absent: F(1, 13) = 1.79, p = .203).
|Condition|Measure|Target-present|Target-absent|
|---|---|---|---|
|Rewarded-valid|RT (msec)|547 ± 35.9|692.1 ± 36.5|
|Rewarded-valid|Accuracy (%)|91.1 ± 1.5|87.1 ± 3.6|
|Non-rewarded-valid|RT (msec)|550 ± 35.3|698.4 ± 36.9|
|Non-rewarded-valid|Accuracy (%)|90.6 ± 1.5|86.4 ± 5.6|
|Neutral|RT (msec)|623.3 ± 33.7|663.4 ± 37.7|
|Neutral|Accuracy (%)|82.4 ± 3.0|93.1 ± 1.5|
Significant differences in accuracy accompanied the RT results (Linear Interaction Condition × Response: F(1, 13) = 8.62, p = .012), illustrating a significant linear effect of condition in target-present, F(1, 13) = 7.99, p = .014, but not in target-absent trials, F(1, 13) = 2.73, p = .122. Taken together, these effects suggest that reward-associated long-term spatial memories allow attention to reach the target location more rapidly and to identify the target more accurately compared with when unrewarded memories guide spatial orienting.
To test that these benefits to the attentional orienting effects were restricted to scenes containing targets within the reward phase of the learning task, on which individuals had actually received reward, we further analyzed differences in performance on the neutral trials in the orienting task according to whether the scenes had been presented in the reward versus non-reward final learning blocks. There were no differences either in RT, F(1, 13) = 1.89, p = .192, or accuracy, F(1, 13) = 0.16, p = .696. The fact that no reward-related differences in performance occurred in target-absent trials argues against explanations based on a generic improvement in the amount of learning in the reward block relative to the non-reward block during the learning task.
Rewarded Memories Modulate Target-related Neural Activity
Early visual processing (P1 and N1 potentials)
Target-present scenes elicited the expected visual potentials P1 and N1 over parieto-occipital scalp regions in all conditions (Figure 3).
ANOVAs testing for effects of reward association through linear contrasts of condition revealed a significant influence of reward on memory-based attentional bias. Specifically, P1 amplitude (100–140 msec poststimulus onset) at contralateral electrodes (relative to ipsilateral) was enhanced by memory-guided orienting only when spatial memories were associated with reward (Condition × Hemisphere: F(1, 13) = 5.05, p = .043). This was confirmed by subsidiary ANOVAs showing significant effects of Hemisphere in rewarded-valid trials, F(1, 13) = 6.43, p = .025, but not in non-rewarded-valid trials, F(1, 13) = 1.48, p = .246, or neutral trials, F(1, 13) = 1.63, p = .224.
The amplitude of the N1 component (150–180 msec) was unaffected by cue type. The analysis of N1 latency, however, revealed a significant Linear Effect of Condition, F(1, 13) = 4.65, p = .050, which showed the latencies to be earliest for targets preceded by rewarded-valid cues (163 ± 3), intermediate for targets preceded by non-rewarded-valid cues (166 ± 3), and latest for those preceded by neutral cues (168 ± 3). The effect of cue type on N1 latency did not significantly differ between hemispheres (Condition × Hemisphere: F(1, 13) = 0.085, p = .776).
Target selection (N2pc)
The N2pc potential was elicited by target-present scenes at parieto-occipital electrodes contralateral to the side of the target (Figure 4). The presence and time window of the N2pc was indicated by a spatio-temporal segmentation analysis carried out on the ERP difference waveforms (see Methods). The reliability of the N2pc was confirmed by a main effect of Hemisphere, F(1, 13) = 42.93, p < .001, on mean amplitudes 240–280 msec after target onset.
Of direct interest to the experimental question, N2pc amplitudes were modulated by the preceding cue (interaction Hemisphere × Linear Contrast of Condition: F(1, 13) = 4.94, p = .045; see Figure 4). The amplitude of the N2pc became attenuated by memory cues, and the effect was accentuated by rewarded memory cues. Subsidiary ANOVAs testing for the N2pc in each condition separately confirmed that a significant N2pc was elicited in neutral trials, F(1, 13) = 15.93, p = .002. Although smaller, it was also present in non-rewarded-valid trials, F(1, 13) = 7.6, p = .016. However, in rewarded-valid trials the N2pc became unreliable, F(1, 13) = 4.17, p = .062. Paired t tests between the different conditions carried out on the difference waveforms created by subtracting the ipsilateral from the contralateral target-related ERP waveforms showed that the N2pc amplitude was significantly smaller in the rewarded-valid versus in the neutral condition, t(13) = 2.22, p = .02, but no significant differences were found between non-rewarded-valid and neutral conditions, t(13) = 0.435, p = .33. There was a trend toward a smaller N2pc in the rewarded-valid relative to non-rewarded-valid conditions at occipital electrodes (O1/2), t(13) = 1.525, p = .07.
Visual inspection of the grand-averaged waveforms, as well as the topographical analysis delineating successive periods of stable ERP topography (see Methods), also revealed an unexpected later lateralized effect, following the N2pc with opposite polarity (see Figure 4). This effect appeared as an enhanced positivity over posterior scalp locations contralateral (relative to ipsilateral) to the target side in the latency window between 320 and 380 msec poststimulus (labeled posterior contralateral positivity, PCP, for its scalp location and polarity). To assess whether this target-related effect was differentially modulated by the type of cue, we analyzed ERP mean amplitudes over this latency window. Results showed a significant effect of Hemisphere, F(1, 13) = 6.14, p = .028, confirming the presence of this lateralized effect over parieto-occipital electrodes, but no significant modulation by Cue Type, F(1, 13) = 0.07, p = .79.
One-reward Association Is Sufficient to Enhance Memory-driven Benefits on Behavior
The pattern of results during the learning task was replicated: over the course of the learning blocks, participants located an increasing number of targets, F(1, 7) = 45.19, p < .001, with increasing speed, F(1, 7) = 129.42, p < .001 (Block 1: 95.2 ± 1.0%, 3.995 ± 0.244 sec; Block 5: 99.8 ± 0.3%, 0.969 ± 0.074 sec; final (reward) Block 6: 99.8 ± 0.3%, 0.899 ± 0.057 sec). Again, comparison of performance during the final block with reward manipulation and the immediately preceding learning block confirmed that reward associations were delivered after participants had reached a plateau level of learning (RT: t(7) = 1.424, p = .197; accuracy was the same in both blocks for all the participants).
The results in the orienting task replicated the benefit conferred by one single reward association to subsequent performance on a visual discrimination task. A significant main effect of Condition was obtained on RT, F(1, 7) = 5.998, p = .044, confirming that target detection was faster in rewarded (685.3 ± 54.7 msec) versus non-rewarded trials (703.3 ± 53.6 msec). The single reward association was sufficient to enhance the speed of responses on the perceptual discrimination task. Target discrimination was faster in target-present trials (present: 600.2 ± 57.8 msec, absent: 788.4 ± 52.7 msec; F(1, 7) = 62.08, p < .001), but no significant interaction was found between Condition and Response, F(1, 7) = 1.33, p = .287. There were no significant main effects of Condition, F(1, 7) = 0.61, p = .460, or Response, F(1, 7) = 3.11, p = .121, on accuracy. Again, a significant interaction of Condition × Response was found on accuracy, F(1, 7) = 12.67, p = .009, driven by the higher accuracy in target-present versus target-absent trials being significant in the rewarded condition only (“rewarded-valid”: 91.4 ± 2.1% vs. 82.9 ± 4.2%, p = .042; “non-rewarded-valid”: 89.8 ± 1.8% vs. 88.1 ± 2.6%, p = .549). The d′ measure was also equivalent between the two conditions, t(7) = 0.571, p = .586.
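For readers unfamiliar with the sensitivity index, the standard signal-detection definition is d′ = z(hit rate) − z(false-alarm rate). The sketch below applies it to the group-mean accuracies reported above, taking target-present accuracy as the hit rate and one minus target-absent accuracy as the false-alarm rate. These are assumptions: the text does not spell out how d′ was computed, and values derived from group means will not match the per-participant d′ values that entered the t test. The function name `d_prime` is hypothetical.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Rates of exactly 0 or 1 are undefined for the inverse normal CDF and
    would need a correction before being passed in.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Group-mean accuracies from the text, used purely for illustration.
rewarded = d_prime(hit_rate=0.914, false_alarm_rate=1 - 0.829)
non_rewarded = d_prime(hit_rate=0.898, false_alarm_rate=1 - 0.881)
print(f"rewarded-valid d' = {rewarded:.2f}")
print(f"non-rewarded-valid d' = {non_rewarded:.2f}")
```

Because d′ combines target-present and target-absent performance, the two conditions can have similar sensitivity even when accuracy is distributed differently across the two response types.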
Finally, the recall test again showed that participants retained strong memories of key locations 24 hr after learning. The key location was correctly identified in 85% of scenes (±9.5 SD), and higher confidence ratings were associated with more accurate memories, F(2, 14) = 23.04, p < .001. Recall was not found to be better for rewarded versus non-rewarded key locations (accuracy: 85.8 ± 3.1% vs. 83.6 ± 4%, t(7) = 0.929, p = .384; overall distance in pixels: 47.2 ± 4.8 vs. 50.5 ± 3.8, t(7) = −0.637, p = .544; confidence ratings: 2.37 ± 0.09 vs. 2.36 ± 0.11, t(7) = 0.297, p = .775).
We demonstrated a role of reward in regulating how efficiently spatial memories drive visual search in natural scenes and modulate the visual processing of incoming information. The single delivery of one small reward (£0.40 in Experiment 1 or £0.50 in Experiment 2) after learning was able to potentiate the proactive effect of spatial memories to facilitate identification of targets on a subsequent visual search task. The enhancement of the memory-based attention effect by the reward association was completely incidental, occurring even though the previous reward association had no bearing on the visual search task.
Behavioral results showed that past rewards enhance the ability of spatial expectations from LTM to facilitate the perception of target events within a complex scene. Responses to identify targets were faster and more accurate for targets placed in previously rewarded locations, indicating that reward-associated LTM allows attention to reach the target location more rapidly and to identify the target more accurately than when non-rewarded memories guide spatial orienting. These results replicate and extend previous findings by Summerfield et al. (2006, 2011) by showing the important role of reward in enhancing experience-based biases upon perceptual decisions on relevant objects in cluttered scenes.
Interestingly, our results provide evidence that reward can bias attention through associations in memory in the absence of direct rewarding outcomes (i.e., no reward was given during the attentional orienting task). Most previous studies have examined the effects of reward on human brain function by linking the magnitude and valence of monetary incentives to task performance (e.g., Sänger & Wascher, 2011; Hickey et al., 2010; Pessoa & Engelmann, 2010; Engelmann, Damaraju, Padmala, & Pessoa, 2009; Small et al., 2005) or under conditions in which stimuli were deemed more likely to lead to monetary reward on the basis of previous experience (Serences & Saproo, 2010; Serences, 2008). Here, we provide new data demonstrating the critical role of past rewarded experiences in determining how efficiently spatial memories drive attention and visual search processes in real-world environments, even in the absence of immediate rewarding (monetary) consequences associated with the ongoing goal-driven behavior.
We took care to ensure that our reward-association procedures did not affect acquisition of the spatial memories themselves. Our analyses confirmed that reward associations had been delivered after learning had reached its maximal level; no further significant effects of learning occurred during the reward blocks themselves. In the final reward blocks, reward associations were provided after identification of targets, so no further learning could be influenced by these single associations. Accordingly, comparisons of performance in identifying rewarded and non-rewarded target locations in the final blocks revealed no performance differences. Importantly, learning performance in the final rewarded and non-rewarded blocks was closely matched, also arguing against the contribution of any generic block-related variable to the learning experience that could influence subsequent performance. Furthermore, benefits to the attentional orienting effects were restricted to scenes containing targets within the reward phase of the learning task, on which individuals had actually received reward. There were no effects related to target-absent trials within the rewarded and non-rewarded final learning blocks.
A follow-up behavioral experiment replicated the benefits of one single reward association for subsequent target identification. To rule out any possibility that any latent state-related variables might have contributed to the subsequent effects during the orienting task, reward associations were delivered in a completely randomized and unpredictable fashion within one common final block during the learning phase. Subsequent improvements in perceptual identification in the visual search task were therefore a direct consequence of the single reward enhancing the consolidation of those particular target–context associations and potentiating the top–down effects of these memories on subsequent perception.
The present data add to growing behavioral evidence showing that stimuli associated with reward through learning can bias visual attention (Anderson et al., 2011; Della Libera & Chelazzi, 2009) and receive facilitated processing, making them less sensitive to the attentional blink (Raymond & O'Brien, 2009). Furthermore, we demonstrate for the first time that just one reward–outcome association, without any change to the learning process itself, enhances the ability of LTM to drive attention within naturalistic scenes and influence perception of stimuli at these locations in subsequent encounters.
The moment-by-moment record of neurophysiological activity elicited by target stimuli, by means of ERPs in Experiment 1, enabled us to observe that the behavioral improvement by reward was accompanied by modulations of ongoing neural activity at different levels of visual cortical processing. The earliest effect was observed on the P1 potential, which showed larger amplitudes at contralateral parieto-occipital sites (relative to ipsilateral sites) only for targets appearing at reward-associated remembered locations. Assuming that the contralateral enhancement of the P1 potential is a marker of effective top–down spatial modulation of visual processing in extrastriate areas (Martínez et al., 1999; Heinze et al., 1994), our finding can be taken as evidence that reward-related signals stored in LTM bias perceptual analysis of relevant objects embedded in real-world scenes as early as 100–140 msec after scene onset. The absence of a significant modulation of the P1 latency provides further support that reward-associated memory-based orienting results in an amplification of sensory-evoked activity in extrastriate areas, closely resembling the sensory gain-control mechanism reported from visual-spatial attention studies (Hillyard, Vogel, & Luck, 1998).
In contrast to these P1 effects, expectancy generated from reward-associated LTM resulted in modulations of the latency of the N1 component, characterized by a linear latency reduction in rewarded-valid trials relative to non-rewarded-valid and neutral trials, but without any effects on its amplitude. Accumulating evidence indicates that the N1 attention effect indexes a higher-level discriminative process in areas of the ventral visual stream (Hopf, Vogel, Woodman, Heinze, & Luck, 2002; Vogel & Luck, 2000; Luck, 1995). It is thus plausible to propose that the present N1 latency effects reveal a speeding up of target discrimination by reward-related enhancement of top–down biasing of visual processing based on memory cues.
The present pattern of reward-related modulations is in line with that reported by Summerfield et al. (2011), who observed larger P1 amplitudes and earlier N1 latencies to target stimuli appearing shortly after valid memory cues relative to neutral cues. However, the effects on P1 and N1 were not significant in our recent study (Patai et al., 2012) using a similar perceptual judgment memory-cueing task. Patai et al. suggest that this could be related to the particularly challenging conditions that this experimental paradigm presents for measuring these potentials. It should be noted that, in the Summerfield et al. (2011) study, the target appeared as a transient on the scene background and therefore was much more salient than in the present task, where the target was embedded in the cluttered scene. On the basis of this hypothesis, the present pattern of ERP modulations, obtained despite highly unfavorable conditions for measuring the effects of spatial biases on sensory-evoked responses, reinforces our proposal that reward potentiates the influence of the attentional bias set up by memory cues toward relevant locations in crowded scenes. Stimuli presented at spatial locations associated with rewarded outcomes in past encounters thus win representation at the expense of other stimuli, much as attention biases the competition in favor of attended targets.
These reward-induced modulations of P1 and N1 components are also consistent with neuroimaging findings from visuospatial cueing tasks showing that the attentional benefits in performance associated with monetary incentives lead to enhanced activation in visual cortical areas (Pessoa & Engelmann, 2010; Engelmann et al., 2009; Small et al., 2005). Moreover, they agree with recent studies reporting that levels of activation within spatially selective visual regions can be biased in favor of more valuable stimuli, as determined by their prior reward histories (Serences, 2008), and result in an increase of stimulus discriminability (Serences & Saproo, 2010). Because of the high temporal resolution of ERPs, our findings go further in demonstrating that the modulations in visual areas occur early, during the perceptual stages of information processing (see also Hickey et al., 2010), and are not merely the consequence of late, re-entrant feedback from higher-order stages of processing after perceptual analysis is complete.
The ERP marker of target selection, the N2pc, was strongly modulated by spatial memory for a target location and by the reward association of such memory. The N2pc is an enhanced negativity at posterior electrodes contralateral relative to ipsilateral to the side of the target embedded in a distractor array (Hickey, Di Lollo, & McDonald, 2009; Eimer, 1996; Luck & Hillyard, 1994). It is thought to originate primarily from posterior parietal and occipito-temporal areas (Hopf et al., 2000). The N2pc was significantly attenuated by spatial memory, and again this effect was enhanced for rewarded memories. Critically, this lateralized effect suggests that the N2pc can also signal the identification of targets embedded within complex and natural visual scenes.
The N2pc attenuation by LTM-driven visual search replicates our recent findings (Patai et al., 2012) that point to possible differences in how memory cues and perceptual cues come to influence target selection processes. In contrast to the attenuation of the N2pc by memory-based orienting, the N2pc has been shown to be unaffected by visual spatial cues (Schankin & Schübo, 2010; Brignani, Lepsien, Rushworth, & Nobre, 2009; Kiss, Van Velzen, & Eimer, 2008). Patai et al. suggest that the cueing of attention to the location of a target by its previously learned context could have preactivated specific memory traces for target–context configurations, facilitating the target discrimination and strongly reducing the amount of required suppression of distracting stimuli (as reflected by a reduced N2pc in valid trials). The finding that the N2pc reduction increased when reward associations of spatial memories were manipulated extends these previous results and supports our hypothesis that learned reward values potentiate the ability of spatial predictions from LTM to bias visual processing, thereby priming selection of target stimuli among competing distractor items in familiar scene contexts.
We also identified a later, spatially specific effect characterized by a lateralized posterior positivity contralateral to the target location (labeled here as PCP), which was not significantly modulated by spatial LTM or reward. This effect is also in agreement with work by Patai et al. (2012) and with previous studies revealing similar lateralized ERP activity in this latency range during visual search (Hilimire, Mounts, Parks, & Corballis, 2010). Hilimire and colleagues have proposed that this potential may reflect additional processing necessary to individuate the target after it is identified under conditions of high competition between stimuli in an array (Hilimire et al., 2010).
Overall, the present results show that LTMs of spatial locations within complex scenes can incorporate past reward outcomes and be used to optimize top–down memory-driven attentional biases on perceptual discrimination of stimuli at these locations in subsequent encounters. Importantly, they reveal that reward associations influence perception by enhancing the effects of memory-based attention on different stages of stimulus processing in visual cortical areas, rather than introducing independent, different effects. The fact that such a minor reward manipulation during learning boosts memories for spatial locations within cluttered scenes and dynamically impacts ongoing processing in visual brain regions has important implications for understanding how attention, memory, and reward interact to adapt our behavior flexibly within a real-life environment.
In summary, our results have significant implications for advancing the understanding of how the rewarding outcomes of learning impact perceptual functions in the human brain. They provide evidence in humans that a single reward outcome of a previous learning experience enhances attentional mechanisms in subsequent encounters with the learned information. Even when unrelated to the task at hand, reward associations impact multiple stages of neural modulation and enhance behavioral measures of target identification.
This research was supported by a project grant to A. C. N. from the Wellcome Trust. S. D. was supported by a postdoctoral grant from Spain's Ministry of Education and Science/FECYT and by a current postdoctoral contract from the Isidro Parga Pondal program (Xunta de Galicia, Spain). In addition, the research was supported by the National Institute for Health Research Oxford Biomedical Research Centre based at Oxford University Hospitals Trust, Oxford University, as part of the Cognitive Health Programme. The views expressed are those of the author(s) and not necessarily those of the National Health Service, the National Institute for Health Research, or the Department of Health.
Reprint requests should be sent to Anna Christina Nobre, Department of Experimental Psychology, University of Oxford, South Parks Road, Oxford OX1 3UD, UK, or via e-mail: firstname.lastname@example.org.