Anticipating rewards has been shown to enhance memory formation. Although substantial evidence implicates dopamine in this behavioral effect, the precise mechanisms remain ambiguous. Because dopamine nuclei have been associated with two distinct physiological signatures of reward prediction, we hypothesized two dissociable effects on memory formation. These two signatures are a phasic dopamine response immediately following a reward cue that encodes its expected value and a sustained, ramping response that has been demonstrated during high reward uncertainty [Fiorillo, C. D., Tobler, P. N., & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science, 299, 1898–1902, 2003]. Here, we show in humans that the impact of reward anticipation on memory for an event depends on its timing relative to these physiological signatures. By manipulating reward probability (100%, 50%, or 0%) and the timing of the event to be encoded (just after the reward cue versus just before expected reward outcome), we demonstrated the predicted double dissociation: Early during reward anticipation, memory formation was improved by increased expected reward value, whereas late during reward anticipation, memory formation was enhanced by reward uncertainty. Notably, although the memory benefits of high expected reward in the early interval were consolidation dependent, the memory benefits of high uncertainty in the later interval were not. These findings support the view that expected reward benefits memory consolidation via phasic dopamine release. The novel finding of a distinct memory enhancement, temporally consistent with sustained anticipatory dopamine release, points toward new mechanisms of memory modulation by reward now ripe for further investigation.
Episodic memory formation, an important component of learning, is enhanced during reward anticipation: Just as the desire to get an “A” grade, or to understand the world, can motivate individuals to remember information, the promise of money can motivate people to form new memories (Gruber & Otten, 2010; Adcock, Thangavel, Whitfield-Gabrieli, Knutson, & Gabrieli, 2006; Wittmann et al., 2005) and even enhance memory for incidental events (Murty & Adcock, 2014; Mather & Schoeke, 2011). However, the mechanisms of memory enhancement during reward anticipation remain incompletely understood (for reviews, see Miendlarzewska, Bavelier, & Schwartz, 2016; Shohamy & Adcock, 2010).
One proposed mechanism involves the neuromodulator dopamine, released during reward anticipation, which directly stabilizes long-term potentiation to support memory formation. In the dopaminergic midbrain, the ventral tegmental area (VTA) sends afferent projections to the hippocampus (Gasbarri, Sulli, & Packard, 1997; Gasbarri, Verney, Innocenzi, Campana, & Pacitti, 1994), which is populated with dopamine receptors (Jiao, Paré, & Tejani-Butt, 2003; Lewis et al., 2001; Ciliax et al., 2000; Khan et al., 2000; Bergson et al., 1995; Little, Carroll, & Cassin, 1995; Camps, Cortés, Gueye, Probst, & Palacios, 1989; Dawson, Gehlert, McCabe, Barnett, & Wamsley, 1986). Indeed, applying dopamine receptor antagonists in the hippocampus blocks memory formation for new, rewarding events (Bethus, Tse, & Morris, 2010). Prior work has also shown that, during reward anticipation, activation of the dopaminergic midbrain (Adcock et al., 2006; Wittmann et al., 2005) and increased midbrain connectivity with the hippocampus (Adcock et al., 2006) predict successful memory formation. However, this mechanism of memory enhancement is only one among many known cellular and network actions of dopamine. Even within the hippocampus, these models must be elaborated to incorporate knowledge about dopamine receptor distributions (see Shohamy & Adcock, 2010, for a review) and multiple temporal profiles of dopamine neuronal responses.
More specifically, rapid phasic burst responses scale with the expected reward value of a reward or a cue predicting reward (Tobler, Fiorillo, & Schultz, 2005; Fiorillo, Tobler, & Schultz, 2003), whereas a slower, anticipatory sustained response has been reported to be associated with reward uncertainty (Fiorillo et al., 2003). In the hippocampus, particularly, dopamine receptors do not closely appose dopamine terminals (for a review, see Shohamy & Adcock, 2010), and thus, phasic responses versus sustained responses are likely to differentially influence hippocampal dopamine receptors; this distinction is likely to exist in other cortical regions and mechanisms as well. Thus, in this study, we proposed that, over several seconds of reward anticipation, phasic and sustained dopamine neuronal excitation should differentially modulate memory formation and furthermore that we could characterize these distinct dopamine profiles using a behavioral paradigm in humans. Specifically, we hypothesized that, for events immediately following a reward-predicting cue, phasic dopamine release would drive memory enhancements when expected reward value is high. On the other hand, for events closer to a potentially rewarding outcome, sustained dopamine release should drive memory enhancements when reward uncertainty is high.
Many of the effects of dopamine on long-term memory occur during consolidation; thus, to ensure that memory performance would reflect the consolidation-dependent mechanisms, our initial tests of these hypotheses examined retrieval after a 24-hr delay. However, dopamine has been implicated not only in enhancing both early- and late-phase long-term potentiation (Lemon & Manahan-Vaughan, 2006; Otmakhova & Lisman, 1996) but also in increasing neuronal replay (McNamara, Tejero-Cantero, Trouche, Campo-Urriza, & Dupret, 2014) and changing dynamic hippocampal physiology (Martig & Mizumori, 2011; Otmakhova & Lisman, 1999). Whereas some of these mechanisms should be apparent only after a delay (i.e., 24 hr), other mechanisms could be apparent immediately. Thus, secondarily, we also sought to establish whether putatively phasic versus sustained dopaminergic influences on memory would be present only after a period that allowed for consolidation or would also be evident immediately after encoding.
We set out to dissociate the putative influence of two distinct dopaminergic responses on memory formation during reward anticipation. To parse these effects, we designed a study in which we used overlearned abstract cues to indicate reward probability, establishing expected reward value independently from uncertainty. We further manipulated the epoch of encoding during reward anticipation: We presented trial-unique, incidental items either early (400 msec after cue presentation), to capture a rapid dopamine response anticipated to scale with expected reward value, or late (3–3.6 sec after cue presentation), to capture a sustained dopamine response anticipated to scale with high reward uncertainty. Finally, in addition to testing memory following a 24-hr consolidation interval, we tested a second group 15 min postencoding to examine whether the effects on memory performance were all dependent on consolidation. We used an incidental memory task, as opposed to an intentional memory task, because we aimed to develop an experimental context that created distinct dopaminergic profiles: phasic and ramping dopamine responses. Had we told participants there would be a memory test for the items shown, motivational salience would then be attached to the items themselves, disrupting the distinct dopamine profiles we aimed to elicit. With this paradigm, we examined whether expected reward value and reward uncertainty yielded temporally and mechanistically distinct influences on memory formation, as would be predicted for these distinct triggers for dopamine release.
Forty healthy young adult volunteers participated in the study. All participants provided informed consent, as approved by the Duke University institutional review board. Data from additional participants were excluded due to failure to follow the instructions (n = 1), poor cue outcome learning (n = 2), or computer error (n = 3). Individuals participated in one of two experiments: Experiment 1 (n = 20, 12 women, mean age = 27.45 years, SEM = 3.82 years) or Experiment 2 (n = 20, 12 women, mean age = 21.90 years, SEM = 3.23 years).
Design and Procedure
The first phase of the experiment involved reward learning. During reward learning, participants were presented with abstract cues, all Tibetan characters, which predicted 100%, 50%, or 0% probability of subsequent monetary reward. Participants were instructed to try to learn the relationship between the cues and reward. They were presented with the cue (1 sec), a unique image of an everyday object (2 sec), then an image of either a dollar bill or a scrambled dollar bill (400 msec), indicating a reward or no reward, respectively. A jittered fixation cross separated trials (1–8 sec). No motor contingency was required to earn the reward. Independent of performance, participants were paid a monetary bonus equal to the amount accumulated over the outcomes in one block of the task. Participants saw 40 trials per condition, distributed evenly over five blocks. Before the first block and following every block, participants were asked to rate their certainty of receiving reward following each cue along a sliding scale from “Certain: No Reward” to “Certain: Reward.” To be included in the analysis, during learning participants had to meet a minimum criterion of identifying the 100% reward cues as more associated with reward than the 0% cues, as assessed by average certainty score across all five blocks.
In the second phase of the experiment, the abstract cues used in the reward learning phase were used to modulate incidental encoding; these cues predicted 100%, 50%, and 0% reward probability. Because the associations were deterministic, reward probabilities established expected reward value, with the 100% cue carrying the highest expected value, the 50% cue an intermediate value, and the 0% cue none. In contrast, 50% predictive cues established higher uncertainty relative to the 100% and 0% predictive cues.
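To make the separation of expected value from uncertainty concrete, the following minimal sketch assumes that each outcome is a Bernoulli draw with a unit reward and indexes uncertainty by outcome variance, p(1 − p). The function names, the unit reward, and the choice of variance as the uncertainty measure are illustrative assumptions, not quantities defined in the study.

```python
# Illustrative only: Bernoulli reward model with an assumed unit reward.
REWARD_PROBABILITIES = (1.0, 0.5, 0.0)


def expected_value(p, reward=1.0):
    """Expected reward for a cue that pays `reward` with probability p."""
    return p * reward


def outcome_variance(p, reward=1.0):
    """Outcome variance, used here as an assumed index of uncertainty."""
    return p * (1.0 - p) * reward ** 2


for p in REWARD_PROBABILITIES:
    print(f"p={p:.1f}  expected value={expected_value(p):.2f}  "
          f"variance={outcome_variance(p):.2f}")
```

Under these assumptions, expected value increases monotonically with probability, whereas variance peaks at 50% and is zero at both 0% and 100%, which is why three cues suffice to separate the two quantities.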
During the incidental encoding task, participants saw a cue (400 msec), followed by a unique novel object (1 sec) either immediately after the cue (400 msec post-cue onset, objects remained on the screen for 1 sec; object offset was 3.2 sec pre-outcome) or just before outcome (3–3.6 sec post-cue onset, objects remained on the screen for 1 sec; object offset was 0–0.6 sec pre-outcome). Note that object presentation and reward outcome did not overlap. These encoding epochs were chosen based on the timing of the phasic dopamine response (<500 msec; Schultz, Dayan, & Montague, 1997) and the sustained ramping response (2 sec: Fiorillo et al., 2003; also 4–6 sec: Howe, Tierney, Sandberg, Phillips, & Graybiel, 2013; Totah, Kim, & Moghaddam, 2013). After the image offset and during the delay, a fixation cross was shown. A dollar bill or scrambled dollar bill, indicating a reward or no reward, respectively (400 msec), appeared 4.6 sec after cue onset for all trials. After reward feedback, participants were presented with the probe question, “Did you receive a reward?” (1 sec). Participants were instructed to quickly and accurately make a “yes” or “no” button press. The exact motor component could not be anticipated since the yes/no, right/left location was random from trial to trial. A jittered fixation cross separated trials (1–7 sec). In summary, there were six conditions in the design: Three probabilities of reward (100%: high expected reward value/certain, 50%: medium expected reward value/uncertain, and 0%: no expected reward value/certain) crossed with the early or late encoding epochs (Figure 1). There were 20 trials per condition, evenly dispersed among five blocks.
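The trial timing and the six-cell design described above can be checked with a short sketch. The durations are taken from the text; the variable and function names are hypothetical, and times are expressed in integer milliseconds only to avoid floating-point rounding.

```python
from itertools import product

OBJECT_DURATION_MS = 1000  # objects stayed on screen for 1 sec
OUTCOME_ONSET_MS = 4600    # outcome appeared 4.6 sec after cue onset


def gap_before_outcome(object_onset_ms):
    """Time from object offset to outcome onset, in msec."""
    object_offset_ms = object_onset_ms + OBJECT_DURATION_MS
    return OUTCOME_ONSET_MS - object_offset_ms


# Early epoch: object onset 400 msec after cue onset.
early_gap = gap_before_outcome(400)
# Late epoch: object onset jittered between 3000 and 3600 msec.
late_gaps = (gap_before_outcome(3000), gap_before_outcome(3600))

# Six cells: three reward probabilities crossed with two encoding epochs.
conditions = list(product((100, 50, 0), ("early", "late")))
TRIALS_PER_CONDITION = 20
total_trials = len(conditions) * TRIALS_PER_CONDITION

print(early_gap, late_gaps, len(conditions), total_trials)
```

The computed gaps match the stated offsets of 3.2 sec pre-outcome for early items and 0–0.6 sec pre-outcome for late items.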
Recognition Memory Test
Participants performed an old/new recognition memory test where they viewed 280 “new” objects and 280 “old” objects. Although “old” objects were from both the reward learning and incidental encoding phases, only the old objects from the incidental encoding phase were included in analyses to calculate memory performance. Participants rated their confidence by saying “definitely sure,” “pretty sure,” or “just guessing” for each memory judgment. When we examined memory accuracy for trials that participants labeled as “guesses” (one-sample t tests within guesses: [hits − false alarms]/all responses), memory was significantly greater than chance. Because of this, we included all trials (those labeled as guesses, medium confidence, and high confidence) in the analysis.
Experiment 1—24-hr Retrieval
In Experiment 1, participants returned at the same time the next day to complete the recognition memory test, approximately 24 hr after encoding.
Experiment 2—Immediate Retrieval
In Experiment 2, participants completed the recognition memory test 15 min after completing the encoding task.
Memory performance for all analyses was calculated as a corrected hit rate ([hits − false alarms]/all responses). This study was a between-subject design with specific hypotheses about how dopamine signaling would impact memory. To address our primary research question, we first examined the effects of reward probability (100%, 50%, 0%) and encoding epoch (early, late) in the 24-hr group. Then, to examine if the observed effects were consolidation dependent, we examined memory in the immediate group, comparing memory performance across groups using retrieval time (immediate, 24-hr) as a between-subject factor and reward probability (100%, 50%, 0%) as a within-subject factor. All analyses were corrected for multiple comparisons, either by ANOVA or with sequential Bonferroni correction.
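As a minimal sketch, the corrected hit rate can be computed directly from response counts. The formula follows the text literally; how "all responses" is counted (for example, per condition or across the whole test) is not specified here, so the denominator is left as an explicit argument. The function name and the example counts are hypothetical and are not study data.

```python
def corrected_hit_rate(hits, false_alarms, n_responses):
    """Return (hits - false alarms) / all responses.

    `n_responses` is whatever count the analyst treats as "all responses";
    that choice is an assumption left to the caller.
    """
    if n_responses <= 0:
        raise ValueError("n_responses must be positive")
    return (hits - false_alarms) / n_responses


# Hypothetical counts for one participant in one condition (not study data).
example = corrected_hit_rate(hits=14, false_alarms=4, n_responses=20)
print(example)
```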
Repeated-measures ANOVA (3 × 2) was used to examine the effects of Reward probability (100%, 50%, 0%) and Encoding epoch (early, late) on subsequent memory performance. Any significant interaction between Reward probability and Encoding epoch was investigated further using post hoc analyses. One-way repeated-measures ANOVAs with Reward probability as a within-subject factor, conducted separately at early encoding (400 msec post-cue onset) and late encoding (3–3.6 sec post-cue onset) epochs, were used to examine how reward probability related to memory formation at each encoding epoch during anticipation. Significant one-way ANOVAs prompted additional follow-up analyses: Specifically, a test for a linear trend increasing with probability was used to examine how expected reward value related to memory in the early encoding epoch, and post hoc t tests were used to compare memory for certain (100% or 0%) versus uncertain (50%) trials in the late encoding epoch.
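One common way to test a linear trend across three equally spaced within-subject levels is to compute a per-participant contrast score with weights (−1, 0, +1) and test its mean against zero. The sketch below illustrates that approach; it is not necessarily the procedure used in the study, the names are hypothetical, and the data are invented for illustration. It returns only the t statistic and degrees of freedom, because the standard library has no t distribution for p values.

```python
from math import sqrt
from statistics import mean, stdev

# Contrast weights for 0%, 50%, and 100% reward probability (assumed coding).
WEIGHTS = {0: -1.0, 50: 0.0, 100: 1.0}


def linear_trend_t(scores_by_participant):
    """One-sample t test on per-participant linear contrast scores.

    Each element maps reward probability (0, 50, 100) to a memory score.
    Returns (t statistic, degrees of freedom).
    """
    contrasts = [
        sum(WEIGHTS[level] * scores[level] for level in WEIGHTS)
        for scores in scores_by_participant
    ]
    n = len(contrasts)
    standard_error = stdev(contrasts) / sqrt(n)
    return mean(contrasts) / standard_error, n - 1


# Invented scores for three participants (not study data).
hypothetical_scores = [
    {0: 0.30, 50: 0.35, 100: 0.40},
    {0: 0.20, 50: 0.30, 100: 0.40},
    {0: 0.10, 50: 0.20, 100: 0.40},
]
t_stat, df = linear_trend_t(hypothetical_scores)
print(round(t_stat, 3), df)
```

A positive t statistic indicates that scores rise with reward probability; the middle level receives zero weight, so this contrast is insensitive to an uncertainty-driven peak at 50%.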
To examine the alternate explanation that variability in attention and task engagement at encoding could account for subsequent memory performance across conditions, we examined performance on the reward probe. Specifically, we conducted one-way ANOVAs and follow-up pairwise Student's t tests as well as tests for a linear trend to determine whether RT or accuracy for the reward probe varied as a function of reward probability or encoding epoch.
Finally, to test the alternate explanation that memory for items presented following the 50% probability cue may be influenced by reward outcome (rewarded, unrewarded), we completed two-tailed paired Student's t tests to examine whether there were differences in memory for rewarded versus unrewarded trials within that probability condition. These t tests were conducted separately at the early and late encoding epochs.
Because both Experiment 1 (Group 1, 24-hr delay) and Experiment 2 (Group 2, 15-min delay) revealed differences in memory performance for early versus late encoding epochs, we next tested whether the patterns at each encoding epoch significantly differed across groups (according to retrieval time). We thus first performed 3 × 2 ANOVAs with Reward probability (100%, 50%, 0%) as a within-subject factor and Retrieval time (immediate, 24-hr) as a between-subject factor. We conducted this analysis at both the early and late encoding epochs (early epoch: Reward Probability [100%, 50%, 0%] × Retrieval Time [24-hr vs. immediate]; late epoch: Reward Probability [100%, 50%, 0%] × Retrieval Time [24-hr vs. immediate]). A significant interaction between Reward probability and Retrieval time prompted post hoc pairwise ANOVAs to examine whether the deltas between immediate and 24-hr retrieval were significantly different across reward probability conditions.
Lastly, we also performed a 2 (Encoding epoch: early, late) × 3 (Reward probability: 100%, 50%, 0%) × 2 (Group: immediate, 24-hr) repeated-measures ANOVA using Encoding epoch and Reward probability as within-subject factors and Group as a between-subject factor to test for a three-way interaction.
Participants in both groups successfully learned the meaning of the cues during the reward learning phase. In the 24-hr memory group, participants in the final block reported the 100% probable cue as 99.46% (SEM = 0.20) likely to predict reward, the 50% cue as 52.65% (SEM = 3.39) likely to predict reward, and the 0% cue as 2.29% (SEM = 1.79) likely to predict reward. In the immediate memory group, participants in the final block reported the 100% probable cue as 99.37% (SEM = 0.43) likely to predict reward, the 50% cue as 55.26% (SEM = 2.68) likely to predict reward, and the 0% cue as 1.44% (SEM = 1.07) likely to predict reward.
Experiment 1—24-hr Retrieval Group
Because the aim of the study was to examine the memory effects of distinct temporal components of reward anticipation and relate those components to determinants of dopamine physiology, we completed a 3 × 2 repeated-measures, within-subject ANOVA examining memory performance as a function of Reward probability (100%, 50%, 0%) and Encoding epoch (early, late). We found a main effect of Reward probability, F(2, 18) = 5.56, p = .01; no main effect of Encoding epoch, F(1, 19) = 2.57, p = .13; and a strong interaction between Encoding epoch and Reward probability, F(2, 18) = 7.50, p = .004. Follow-up one-way ANOVAs examining the effect of reward probability on memory within early and late encoding epochs revealed a significant effect in the late epoch, F(2, 19) = 13.25, p < .0001, and a trend-level effect in the early epoch, F(2, 19) = 2.41, p = .10. Post hoc tests to examine memory performance during the early encoding epoch revealed a significant linear trend such that memory scaled with increasing reward probability (linear trend: R² = .03, p = .04). Thus, early during anticipation, memory performance linearly tracked expected reward value (Figure 2A). Post hoc t tests to examine memory performance during the late encoding epoch revealed greater memory following 50% cues compared with the 100% and 0% cues, with no difference in performance between 100% and 0% cues (100% vs. 50%: t(19) = 4.34, p = .0004; 50% vs. 0%: t(19) = 4.20, p = .0005; 100% vs. 0%: t(19) = 0.31, p = .76; corrected for multiple comparisons using the sequential Bonferroni technique; Holm, 1979). Thus, late in reward anticipation, greater reward uncertainty benefitted memory (Figure 2A).
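The sequential Bonferroni (Holm, 1979) procedure can be sketched as follows. This is a generic implementation of the step-down adjustment, not the authors' analysis code; the function name is hypothetical, and the three late-epoch p values are reused purely as illustrative inputs, since it is not stated whether the reported values are raw or already adjusted.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Illustrative inputs: 100% vs. 50%, 50% vs. 0%, 100% vs. 0% (late epoch).
late_epoch_p = [0.0004, 0.0005, 0.76]
print(holm_adjust(late_epoch_p))
```

With three comparisons, the smallest p value is multiplied by 3, the next by 2, and the largest by 1, so both uncertainty contrasts would remain well below .05 under this adjustment if the inputs were unadjusted.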
It was also possible that the memory benefit we attributed to the uncertain anticipatory context could instead be explained by associations with reward outcomes. To investigate this alternative explanation, we performed t tests between the rewarded and unrewarded uncertain trials, during both early and late epochs (corrected for multiple comparisons). We found no differences in memory as a function of reward outcome in either epoch (early, rewarded vs. unrewarded: t(19) = 0.10, p = .92; late, rewarded vs. unrewarded: t(19) = 1.09, p = .29).
Experiment 2—Immediate Retrieval Group
Examining memory after a 24-hr retrieval delay ensured that performance reflected all the known consolidation-dependent mechanisms of dopamine on memory but did not allow us to distinguish between effects acting at encoding versus consolidation. Thus, in Experiment 2, we recruited a new group of participants to complete an immediate retrieval test 15 min after encoding. All analyses for Experiment 1 were repeated for Experiment 2. Analyses of immediate retrieval performance replicated effects of reward uncertainty on items presented late in the anticipation epoch; however, they did not show effects of reward probability on items presented early in the epoch, as follows:
A 3 × 2 repeated-measures, within-subject ANOVA revealed a trend for a main effect of Reward probability, F(2, 18) = 3.33, p = .06, and a main effect of Encoding epoch, F(1, 19) = 8.008, p = .01, with memory greater at late than early encoding epochs, t(19) = 2.83, p = .01. Importantly, there was again a significant interaction between Reward probability and Encoding epoch, F(2, 18) = 3.711, p = .04. Post hoc one-way ANOVAs within early and late encoding epochs revealed a significant difference in memory for the late epoch, F(2, 19) = 4.95, p = .01, but no difference for the early epoch, F(2, 19) = 1.31, p = .28. Post hoc t tests to examine memory performance during the late encoding epoch revealed significantly greater memory following 50% cues relative to both 100% and 0% cues (100% vs. 50%: t(19) = 2.97, p = .008; 50% vs. 0%: t(19) = 2.57, p = .02), with no difference between the latter (100% vs. 0%: t(19) = 0.66, p = .52; corrected for multiple comparisons). The presence of the uncertainty effect at both immediate and 24-hr retrieval indicates that this effect was not dependent on consolidation (Figure 2B).
By contrast, the effect of expected reward value for items in the early encoding epoch was not present at immediate retrieval. Although the ANOVA demonstrated no significant difference in memory performance by reward probability, we still ran the test for a linear trend because it was an a priori analysis; it revealed no significant linear trend (R² = .01, p = .14). Thus, the influence of reward probability on memory for items presented early during reward anticipation was not present during immediate retrieval and only appeared after 24 hr (Figure 2B).
As was the case in the 24-hr retrieval group, analyses for the immediate retrieval group revealed no effects of reward outcome on memory during uncertain trials (early, rewarded vs. unrewarded, t(19) < 0.0001, p = 1.00; late, rewarded vs. unrewarded, t(19) = 1.33, p = .20; corrected for multiple comparisons).
Contrasting 24-hr and Immediate Memory Groups from Experiments 1 and 2
To quantify whether memory patterns within early and late encoding epochs changed over a 24-hr period of consolidation, we ran two 3 × 2 repeated-measures ANOVAs, one per encoding epoch (early, late), with the within-subject factor Reward probability (100%, 50%, 0%) and the between-subject factor Retrieval time (24-hr group, immediate group) and looked for an interaction between the two. We found a significant interaction at the early epoch, F(2, 37) = 4.281, p = .021, but not at the late epoch, F(2, 37) = 1.826, p = .175. Follow-up pairwise ANOVAs revealed that the decrement in memory performance as a function of retrieval period (24-hr vs. immediate) was significantly greater for 0% than 100% reward, F(1, 38) = 8.76, p = .005, with no other significant differences (all other ANOVAs: F(1, 38) < 1.75, p > .19). After consolidation, memory for items encoded during the early epoch decreased more following 0% cues than following 100% cues. Thus, the relationship between memory and reward anticipation remained consistent from immediate to 24-hr retrieval for items at the late encoding epoch but changed significantly across retrieval periods in the early encoding epoch (Figure 3).
To test whether our distinct patterns of results observed within the 24-hr retrieval group and the immediate retrieval group would withstand a three-way interaction, we performed a 2 (Encoding epoch: early, late) × 3 (Reward probability: 100%, 50%, 0%) × 2 (Group: immediate, 24-hr) repeated-measures ANOVA using Encoding epoch and Reward probability as within-subject factors and Group as a between-subject factor.
As expected, we observed a significant main effect of Reward probability, F(2, 76) = 6.40, p = .003, no significant main effect of Encoding epoch, F(1, 38) = 1.50, p = .228, and a significant main effect of Group, F(1, 38) = 7.45, p = .01. In addition, we observed the following significant interactions: Reward Probability × Encoding Epoch, F(2, 76) = 11.04, p < .001; Reward Probability × Group, F(2, 76) = 3.54, p = .034; and Encoding Epoch × Group, F(1, 38) = 10.33, p = .003. However, we did not observe a significant three-way interaction, F(2, 76) = 1.77, p = .178.
Post hoc analyses following up on the main effect of reward probability revealed greater memory for the 50% reward condition compared with both the 100% and 0% conditions (50% vs. 0%: t(39) = 2.70, p = .01; 100% vs. 50%: t(39) = 3.11, p = .003, 100% vs. 0%: t(39) = 0.15, p = .88; corrected for multiple comparisons). This result is qualified by the interaction between reward probability and encoding epoch as well as reward probability and group (see below).
As expected, post hoc analyses following up on the main effect of group revealed better memory for the immediate than 24-hr retrieval group, t(33.51) = 2.70, p = .011.
The significant interaction of Reward Probability × Encoding Epoch was driven by no differences in memory performance in the early epoch (all ps > .45), but greater memory performance in the uncertain condition compared with both the 100% and 0% condition during the late epoch (50% vs. 100%: t(39) = 5.18, p < .001; 50% vs. 0%: t(39) = 4.75, p < .001; corrected for multiple comparisons).
The significant interaction of Reward Probability × Group showed that, within the 24-hr group, there were differences in memory performance across all three reward probability levels with uncertainty having the strongest effect on memory performance (50% > 0%: t(19) = 3.34, p < .005; 50% > 100%: t(19) = 1.97, p = .06; 100% > 0%: t(19) = 2.15, p = .04; note that 50% vs. 100% and 100% vs. 0% do not survive correction for multiple comparisons). In the immediate retrieval group, there was significantly better memory for the uncertain cues compared with the 100% predictive cues, t(19) = 2.46, p = .02.
The significant interaction of Encoding Epoch × Group was driven by significantly greater memory in the late epoch compared with the early epoch in the immediate group, t(19) = 2.83, p = .01. There were no differences in memory that varied by encoding epoch in the 24-hr group, t(19) = 1.60, p = .13.
These results largely confirm findings from our individual group ANOVAs, with the exception that we expected to observe a significant three-way interaction (Reward Probability × Encoding Epoch × Group). The differences we observed across groups (linear trend in memory by reward probability in the early epoch only for the 24-hr group and greater memory for the uncertain than certain reward probabilities in both groups) should be replicated in a larger sample.
Experiments 1 and 2: Confidence During Recognition Memory Test
To examine the degree to which participants' self-reported confidence ratings for old images varied according to reward probability, encoding epoch, or group, we ran a repeated-measures ANOVA with Reward probability (0%, 50%, 100%) and Encoding epoch (early, late) as within-subject factors and Retrieval time (immediate, 24-hr) as a between-subject factor. As anticipated, average confidence levels for hits were higher in the immediate versus 24-hr retrieval group, F(1, 38) = 13.796, p = .001. However, average confidence levels did not significantly differ as a function of Reward probability (F(1.63, 61.85) = 0.125, p = .841), Encoding epoch (F(1, 38) = 0.373, p = .545), or any interactions between the three factors (all ps > .147). Note that variation in degrees of freedom reflects Greenhouse–Geisser correction for a sphericity violation.
Experiments 1 and 2: Accuracy and Reaction Time During Encoding
To test whether task engagement during encoding could account for the observed relationships between reward anticipation and memory, we examined whether the patterns of accuracy or RT for reward probes resembled subsequent memory performance across conditions. In both groups, one-way repeated-measures ANOVAs showed no accuracy differences in probe response by reward probability for trials with items presented in the early epoch (24-hr: F(2, 19) = 2.37, p = .11; immediate: F(2, 19) = 1.42, p = .25; Figure 4A and C). However, probe accuracy on trials with items presented in the late epoch in both groups significantly differed with reward probability (24-hr: F(2, 19) = 9.83, p = .0004; immediate: F(2, 19) = 9.58, p < .0001) and revealed significant linear trends, such that people performed more accurately as expected reward value increased (24-hr: R² = .06, p < .0001; immediate: R² = .04, p = .004; Figure 4A and C).
One-way repeated-measures ANOVAs of RT on trials with items presented in the late epoch revealed a significant difference at 24-hr retrieval but not immediate retrieval (24-hr: F(2, 19) = 4.25, p = .02; immediate: F(2, 19) = 1.97, p = .15; Figure 4B and D). Follow-up pairwise t tests showed that the 24-hr effects were driven by slower RTs for 50% rewarded trials relative to 0% rewarded trials (50% vs. 0%: t(19) = 3.84, p = .001; 100% vs. 50%: t(19) = 1.44, p = .17, 100% vs. 0%: t(19) = 1.18, p = .25; Figure 4B). There were no significant RT differences on trials with items presented in the early epoch (24-hr: F(2, 19) = 1.12, p = .24; immediate: F(2, 19) = 0.15, p = .86; Figure 4B and D). Thus, RT and accuracy on the reward probe were modulated by reward probability, but not in a manner that explained the observed memory effects.
Our findings demonstrate that the influences of reward anticipation on memory formation are temporally specific: In the early anticipation encoding epoch, 400 msec after the presentation of the reward cue and temporally coincident with phasic dopamine responses, item memory scaled with expected reward value. During a late anticipation encoding epoch, just before a predicted outcome, memory was instead greatest for items presented during high uncertainty. The memory benefit for items presented just after cues for greater rewards was evident only after 24 hr, implying a mechanism requiring consolidation to modulate memory formation. The memory benefit for items presented just before uncertain outcomes, however, was evident both immediately and 24 hr after encoding, implying a distinct underlying mechanism that modulates memory formation at encoding.
Although this is, to our knowledge, the first behavioral demonstration of dissociable contexts for encoding within reward anticipation, the results build on expectations generated from prior neuroimaging and physiological studies. Previous work using fMRI has demonstrated dissociable neural responses within the dopaminergic system for expected reward value and uncertainty (Tobler, O'Doherty, Dolan, & Schultz, 2007; Preuschoff, Bossaerts, & Quartz, 2006), with one study demonstrating dissociable temporal patterns in the striatum: activation in the first second after a reward cue scaling with expected reward value and in the following seconds leading up to reward outcome scaling with uncertainty (Preuschoff et al., 2006). Physiologically, cues associated with greater expected reward value elicit greater phasic dopamine firing in the midbrain at latencies less than 400 msec (Tobler et al., 2005; Fiorillo et al., 2003), whereas a sustained dopaminergic ramp has been shown to increase with greater reward uncertainty over a 2-sec period of reward anticipation (Fiorillo et al., 2003). Because phasic dopamine would be predicted to benefit memory early in the reward anticipation period and sustained anticipatory dopamine to have an effect close to the reward outcome, this study implies previously undescribed, functionally specific relationships between memory and phasic versus sustained dopamine.
Dissociable effects on memory are grounded in observations of other differential effects of dopaminergic firing modes. Phasic burst firing preferentially influences downstream targets via synaptic release, whereas sustained low-frequency activity results in extrasynaptic release (Floresco, West, Ash, Moore, & Grace, 2003). Extracellular dopamine levels have been demonstrated not only to increase for increased tonic dopamine activity (Floresco et al., 2003) but also to exhibit sustained, ramping dopamine levels lasting on the order of seconds (Howe et al., 2013; Stuber, Roitman, Phillips, Carelli, & Wightman, 2005; Roitman, Stuber, Phillips, Wightman, & Carelli, 2004). The mismatch of distribution of dopamine receptors in the hippocampus relative to dopamine terminals indicates that phasic dopamine firing in the midbrain cannot be communicated to hippocampal synapses as a temporally precise signal (see Shohamy & Adcock, 2010, for a review). Optogenetic findings have also revealed that higher (simulating phasic) versus lower (simulating tonic or possibly sustained) levels of dopamine release have differential influences on dopamine receptors in the hippocampus (Rosen, Cheung, & Siegelbaum, 2015). Relationships between hippocampal dopamine release and hippocampal memory formation have yet to be demonstrated. Whereas further work thus remains to elucidate phasic versus sustained (or tonic) dopamine effects on memory, the extant literature supports multiple dissociable mechanisms of dopaminergic influence at distinct timescales (Düzel, Bunzeck, Guitart-Masip, & Düzel, 2010; Shohamy & Adcock, 2010).
The present observation of a 24-hr memory benefit following higher expected reward value early during reward anticipation is consistent with a previously described relationship in the literature between dopamine and consolidation-dependent memory effects. Expected reward value predicts phasic dopamine activity in the VTA (Cohen, Haesler, Vong, Lowell, & Uchida, 2012; Pan, Schmidt, Wickens, & Hyland, 2005; Tobler et al., 2005; Fiorillo et al., 2003; Schultz, 1998; Schultz et al., 1997). Dopamine has been associated with enhancement of late-phase long-term potentiation (Lemon & Manahan-Vaughan, 2006) and is a critical element in the synaptic tagging and capture theory of memory consolidation (Lisman, Grace, & Duzel, 2011; Redondo & Morris, 2011; Sajikumar & Frey, 2004). Additionally, optogenetically induced burst firing of dopaminergic fibers results in increased hippocampal replay during post-learning sleep and increased memory (McNamara et al., 2014). Thus, previous work supports a relationship between dopamine and enhanced hippocampal memory consolidation. Our novel demonstration of a temporally specific effect of expected reward value on memory during the early epoch of reward anticipation is consistent with phasic dopamine driving consolidation-dependent memory processes.
On the other hand, our observation of a memory benefit during high uncertainty just before reward outcome, evident immediately and persisting after consolidation, suggests a mechanism of memory enhancement that occurs at encoding. High reward uncertainty has been associated with sustained, ramping dopamine firing in the VTA (Fiorillo et al., 2003). As noted above, relationships between dopamine release and hippocampal memory formation have yet to be demonstrated. What has been shown, however, is that tonic dopamine has immediately observable effects on hippocampal physiology (Rosen et al., 2015; Martig & Mizumori, 2011; Otmakhova & Lisman, 1999) and the threshold for early long-term potentiation (Li, Cullen, Anwyl, & Rowan, 2003), providing candidate mechanisms whereby sustained dopamine may contribute to memory at immediate retrieval, in the hippocampus and elsewhere. This hypothesis opens new avenues for future investigation.
By demonstrating and dissociating both immediate and consolidation-dependent memory benefits related to reward anticipation, our study takes an important first step toward reconciling conflicting patterns of findings in the memory literature. Prior rodent work has demonstrated the importance of consolidation for dopamine-dependent memory formation (McNamara et al., 2014; Bethus et al., 2010) and theoretical mechanisms of dopamine synaptic activity have emphasized consolidation (Lisman et al., 2011; Redondo & Morris, 2011). However, some reward anticipation effects on memory are not consolidation dependent (Gruber, Gelman, & Ranganath, 2014; Murty & Adcock, 2014). Our data integrate across previous studies, suggesting that whereas phasic, synaptic dopamine effects may indeed be dependent on consolidation, effects of sustained and extrasynaptic dopamine may occur at encoding.
Alternative Accounts and Limitations
In this study, we did not manipulate dopamine directly. It is thus possible that our effects and, in particular, the uncertainty benefit modulating encoding were not dopaminergic in nature. Although prior studies using pharmacological manipulations have already contributed direct evidence that dopamine affects memory formation in humans (Chowdhury, Guitart-Masip, Bunzeck, Dolan, & Düzel, 2012; Knecht et al., 2004), these manipulations have not been shown to selectively affect specific modes of dopamine firing, and indeed, L-DOPA should enhance both. Our hypotheses were based on work showing sustained neuronal firing in the VTA scaling with greater uncertainty. It has been debated whether this signal represents sustained dopaminergic firing or represents an accumulation of phasic responses (Niv, Duff, & Dayan, 2005). There is evidence that ramping activity in the VTA may be GABAergic in nature (Cohen et al., 2012). Other work, however, is consistent with a sustained signal that is actively maintained (Murty, Ballard, & Adcock, 2017; MacInnes, Dickerson, Chen, & Adcock, 2016; Lloyd & Dayan, 2015; Howe et al., 2013; Totah et al., 2013). In addition, sustained dopamine release in efferent regions has been demonstrated to scale with reward proximity and reward magnitude (Howe et al., 2013); a response to uncertainty has yet to be experimentally examined in efferent regions. Other neurotransmitters, such as acetylcholine, offer additional potential mechanisms for enhanced memory formation; these are not mutually exclusive. The hippocampus in particular is densely populated with cholinergic receptors (Alkondon & Albuquerque, 1993), and acetylcholine has been discussed as important for expected uncertainty (Sarter, Lustig, Howe, Gritton, & Berry, 2014; Yu & Dayan, 2005), which may be similar to the cued uncertainty in this study.
Finally, under some behavioral contexts, hippocampal dopamine release appears to require neuronal activity within the locus coeruleus, implicating noradrenergic neurons (Kempadoo, Mosharov, Choi, Sulzer, & Kandel, 2016; Takeuchi et al., 2016; Smith & Greene, 2012). The current work introduces possibilities for future experiments that disentangle the roles of specific neuromodulators in encoding during high reward uncertainty.
This behavioral project also lacks neural evidence of engagement of specific regions within the medial-temporal lobe and the rest of the brain. Our discussion and interpretation focus predominantly on the hippocampus given extensive evidence supporting its role in reward-related memory formation and the evidence for modulation of its physiology. However, there is also strong support for engagement of the perirhinal cortex during familiarity memory: Indeed, research suggests the perirhinal cortex is critical for familiarity memory whereas the hippocampus is important for recollection (Suzuki & Naya, 2014; Diana, Yonelinas, & Ranganath, 2007; Eichenbaum, Yonelinas, & Ranganath, 2007). We included all trial types in our analyses here: guesses, medium, and high confidence trials. Prior work relating these confidence ratings to memory processes suggests that our overall memory measure is likely to reflect the function of not only the hippocampus proper but also the broader medial-temporal lobe, including perirhinal cortex. In addition, when analyzing self-reported confidence data for old images (specifically hits), we observed significant group differences (greater confidence in the immediate group than the 24-hr group), but no other significant effects. From prior literature (Lisman et al., 2011; Shohamy & Adcock, 2010), it might be predicted that participants would report highest confidence ratings for trials associated with highest reward (or uncertainty), particularly given that if individuals recollect an object, it should be associated with high confidence. Notably in this paradigm, we did not use a remember/know assessment or otherwise specifically assess recollection. Furthermore, both recollection and familiarity can support high confidence, so that confidence alone should not be used as a measure of recollection (though typically highest confidence ratings are associated with recollection; Yonelinas, Aly, Wang, & Koen, 2010). 
Future iterations using a remember/know structure to assess recollection, fMRI, and more trials may permit analyzing guesses and low-confidence trials separately from high-confidence trials. We would predict that guesses and low-confidence trials would recruit the perirhinal cortex whereas high-confidence trials would recruit the hippocampus, allowing the test of additional, anatomical hypotheses about distinct mechanisms for the reward anticipation effects shown here.
Multiple alternative accounts were also considered as potential explanations of our memory findings. One intuitive possibility is that enhanced memory formation was a result of greater task engagement, specifically increased attention. Pearce and Hall introduced the notion of attention influencing learning of uncertain stimuli (Pearce & Hall, 1980), and this theory has been corroborated by others showing increased attention to uncertain cues (Koenig, Kadel, Uengoer, Schubö, & Lachnit, 2017; Hogarth, Dickinson, Austin, Brown, & Duka, 2008). As such, attention may have a particularly strong influence on learning in the 50% reward condition (i.e., attention may increase toward the end of the trial). The one measure of attention available in the current paradigm is RT for responding to the reward probe; this may reflect encoding of the reward event but can only indirectly reflect attention before reward delivery. This limits the conclusions we can draw regarding the influence of attention during the cue and anticipation periods, which occur before reward delivery. At the time of the reward probe, in the immediate retrieval group, there were no significant differences in RT when objects were shown early or late in the epoch. In the 24-hr retrieval group, although there were no differences in RT when objects occurred in the early epoch, RT for 50% reward cues was significantly slower than for 0% reward cues, but importantly no different from that for 100% reward cues. The best manner of assessing the influence of attention during encoding may be to include eye-tracking and/or neuroimaging measures in future iterations of the task. Given the present data, it is difficult to quantify how much of our observed effects is due to attention, though we speculate that it plays an important role.
Another alternative account was that the memory benefit for uncertainty in the late encoding epoch was due to a phasic dopaminergic response to reward delivery. However, there was no evidence for this relationship, as there was no memory difference for items presented before rewarded versus unrewarded outcomes on the uncertain trials.
Lastly, the effect of reward probability on information presented during the early epoch was not particularly strong, and the follow-up linear trend showing increasing memory with increasing reward probability explained a low percentage of the variance in the data. In addition, the differences we observed across groups were not supported by a significant three-way interaction. As such, future experiments, including more trials and a larger sample, should replicate these results.
This study builds on prior findings that reward anticipation modulates memory formation. Here, we show that, within reward anticipation, there are distinct temporal contexts for encoding, with mechanistically distinct impacts on memory outcomes. By mapping these distinct encoding contexts onto the putative physiological profiles for expected reward value and uncertainty, this work suggests a novel working model of dopaminergic influence on memory formation for future investigation: Whereas phasic dopamine release acts to facilitate memory consolidation, sustained dopamine release acts to benefit memory encoding. Integrating disparate findings, our proposed model paves the way for future research examining contextually regulated mechanisms of reward-enhanced memory formation.
The authors would like to thank Vishnu Murty for helpful comments on this article. This work was supported by an NIMH BRAINS award (R01MH094743) to R. A. A. and a KL2 award (TR002554) to K. C. D.
Reprint requests should be sent to R. Alison Adcock, B253 Levine Science Research Center, Duke University, Box 90999, Durham, NC 27708, or via e-mail: email@example.com.
This paper is part of a Special Focus deriving from a symposium at the 2017 Annual Meeting of the Cognitive Neuroscience Society entitled “Memory Neuromodulation: Influences of Learning States on Episodic Memory.”