By recording the feedback-related negativity (FRN) in response to gains and losses, we investigated the contribution of outcome monitoring mechanisms to age-associated differences in probabilistic reinforcement learning. Specifically, we assessed the difference of the monitoring reactions to gains and losses to investigate the monitoring of outcomes according to task-specific goals across the life span. The FRN and the behavioral indicators of learning were measured in a sample of 44 children, 45 adolescents, 46 younger adults, and 44 older adults. The amplitude of the FRN after gains and losses was found to decrease monotonically from childhood to old age. Furthermore, relative to adolescents and younger adults, both children and older adults (a) showed smaller differences between the FRN after losses and the FRN after gains, indicating a less differentiated classification of outcomes on the basis of task-specific goals; (b) needed more trials to learn from choice outcomes, particularly when differences in reward likelihood between the choices were small; and (c) learned less from gains than from losses. We suggest that the relatively greater loss sensitivity among children and older adults may reflect ontogenetic changes in dopaminergic neuromodulation.
Reward expectancy is recognized as one of the major mechanisms that drive goal-directed behavior and establish the saliency of events (e.g., Rushworth & Behrens, 2008; Schultz, 2007). Identifying what is relevant and what is not when trying to achieve a particular goal is vital for guiding behavior in unfamiliar contexts and for the efficient deployment of attentional resources. These observations carry particular weight during life periods when attentional resources are either still increasing or start declining. Nevertheless, surprisingly little is known about differences in the way in which outcomes are monitored to guide behavior during different periods of ontogeny. Thus far, only a handful of studies have investigated feedback effects from a child developmental or aging perspective. These studies showed that children and older adults experience more difficulties in learning from probabilistic performance feedback than younger adults and adolescents (cf. Eppinger, Mock, & Kray, 2009; Eppinger, Kray, Mock, & Mecklinger, 2008; Marschner et al., 2005; Mell et al., 2005; Crone, Jennings, & van der Molen, 2004). However, compared with other cognitive mechanisms and functions such as executive control, working memory, or episodic memory, the monitoring of outcomes for the shaping of behavior has rarely been investigated systematically across different age groups that cover development from childhood to old age.
Available evidence suggests that the functional brain circuitries and neuromodulatory mechanisms involved in reward processing undergo pronounced changes across the life span (e.g., Li, Lindenberger, & Bäckman, 2010; Raz et al., 2005; Resnick, Pham, Kraut, Zonderman, & Davatzikos, 2003; Sowell et al., 2003; Volkow et al., 2000). As a first step toward linking ontogenetic changes in physiology to age-related behavioral differences in reward processing, this study investigates the cortical mechanisms contributing to age-related differences in outcome monitoring across the life span. Specifically, we assess electrophysiological correlates of monitoring gains and losses in the context of a probabilistic reinforcement learning task in children, adolescents, younger adults, and older adults.
Neural Substrates of Reward Processing
Over the past decade, a great deal of progress has been made toward understanding the neural substrates of reward detection and reward expectation and the adjustment of future decisions on the basis of reward history (for reviews, see Cohen, 2008; Schultz, 2007). Relevant findings have been reported at neuroanatomical, neurochemical, and electrophysiological levels. The functional contributions of the basal ganglia (BG), amygdala, orbitofrontal cortex (OFC), and anterior cingulate cortex (ACC) to decision-making, reward processing, and reinforcement learning are being increasingly understood (Cohen, 2008; Redgrave, Gurney, & Reynolds, 2008; Kennerley, Walton, Behrens, Buckley, & Rushworth, 2006; Ridderinkhof, Ullsperger, Crone, & Nieuwenhuis, 2004; Walton, Devlin, & Rushworth, 2004; Walton, Bannerman, & Rushworth, 2002; Schultz, 2000; Shima & Tanji, 1998). Specifically, ACC is conceived of as a control structure or as a structure that integrates positive and negative aspects of an action to orient future behavior (cf. Rushworth & Behrens, 2008; Holroyd & Coles, 2002). At the neurochemical level, this network is targeted by the dopaminergic neurons of the midbrain. Dopamine (DA) neurons in the ventral tegmental area and substantia nigra have been shown to fire upon reward expectation as well as upon violation of a reward expectation (prediction error; for reviews, see Schultz, 2000, 2002, 2007; Holroyd & Coles, 2002). Dopaminergic signals from the midbrain are thought to be conveyed to ACC to inform prefrontal control structures of action outcomes that are worse or better than expected (Frank, Woroch, & Curran, 2005; Holroyd & Coles, 2002; but see also Walton, Croxson, Rushworth, & Bannerman, 2005). In line with this view, applying a DA D2 receptor agonist (pramipexole) that reduces phasic DA bursts via autoreceptors impaired reinforcement learning from positive outcomes at the behavioral level (Pizzagalli et al., 2008) and also affected outcome-related activations in dorsal and medial ACC (Santesso et al., 2009).
Recently, a negative ERP component has been observed after feedback about positive or negative outcomes in gambling or reinforcement learning tasks (e.g., Miltner, Braun, & Coles, 1997). The so-called feedback-related negativity (FRN) peaks about 250 msec after feedback and is more pronounced after negative feedbacks (e.g., losses) than after positive ones (e.g., gains; Gehring & Willoughby, 2002; Miltner et al., 1997). The generator of the FRN has been localized to the ACC (Luu, Tucker, Derryberry, Reed, & Poulsen, 2003; Gehring & Willoughby, 2002). Studies on the processes reflected by the FRN have revealed that the stronger reaction to losses is not driven by the valence specificity of the FRN but rather by the fact that it reflects “relative valence” and reacts most strongly to feedbacks from outcomes that are worse relative to other possible outcomes (e.g., receiving a smaller gain as opposed to receiving a larger gain; see Holroyd, Hajcak, & Larsen, 2006; Holroyd, Larsen, & Cohen, 2004; Nieuwenhuis, Holroyd, Mol, & Coles, 2004; Nieuwenhuis, Yeung, Holroyd, Schurger, & Cohen, 2004). In the case of more than two possible outcomes, another interesting feature of the FRN is that the evaluation reflected by the FRN appears to sort possible outcomes in a dichotomous fashion, with the FRN being larger for the worse and intermediate outcomes and smaller for the better outcome (Hajcak, Moser, Holroyd, & Simons, 2006; Holroyd et al., 2004, 2006). As Holroyd et al. (2006) noted, such a simplified outcome representation suggests that the monitoring signal reflected by the FRN is not a precise evaluation of the magnitude (or amount) of feedback received but rather an initial classification of the information provided by the feedback as either positive or negative in relation to achieving task-specific goals (cf. Hajcak et al., 2006). 
In line with this interpretation that the FRN reflects a relative evaluation of possible outcomes along the dimensions of valence or likelihood of occurrence, other studies have also linked the FRN to outcome expectancies (Hajcak, Moser, Holroyd, & Simons, 2007; Holroyd, Nieuwenhuis, Yeung, & Cohen, 2003). Taken together, these findings suggest that to establish and to maintain task-specific goals, top–down monitoring, presumably originating from the pFC, is needed. The monitoring system, however, need not represent the specifics of the feedback information itself but rather its valence in relation to a task-specific goal (Holroyd et al., 2006; Holroyd, Yeung, Coles, & Cohen, 2005).
In addition to the FRN, a P300 component has been observed after gain and loss feedback. This component is assumed to be more specifically related to the expectedness of the feedback (Hajcak et al., 2007; Hajcak, Holroyd, Moser, & Simons, 2005) and was shown to decrease when the feedback was less expected (Eppinger et al., 2008; Campbell, Courchesne, Picton, & Squires, 1979). The developmental evidence on this component is scarce. A few studies report a smaller decrease of this component with more expected outcomes in children and older adults compared with younger adults (Eppinger et al., 2008, 2009; Mathewson, Dywan, Snyder, Tays, & Segalowitz, 2008).
Life Span Differences in Neural Substrates of Reward Processing
In healthy adults, DA's role in affecting the FRN during reinforcement learning has only recently been investigated (Santesso et al., 2009; Pizzagalli et al., 2008). Studies involving children or older adults on this question are still lacking. However, some evidence suggests that dopaminergic modulation of the prefrontal cortices matures relatively late during childhood and adolescence (Andersen, Dumont, & Teicher, 1997; Rosenberg & Lewis, 1994; for a review, see also Benes, 2001) and that its dysfunction affects attention and other frontal executive functions (e.g., Liotti, Pliszka, Perez, Kothmann, & Woldorff, 2005; Diamond, Briand, Fossella, & Gehlbach, 2004; Diamond, 1996, 2002). At the same time, dopaminergic modulation declines markedly during adulthood and old age, and this decline has been linked to senescent declines in processing speed, processing robustness, episodic memory, working memory, and fluid intelligence (Nagel et al., 2008; Bäckman, Nyberg, Lindenberger, Li, & Farde, 2006; Bäckman et al., 2000; Volkow et al., 2000). The importance of DA for prefrontally supported cognitive functions is further emphasized by findings showing that DA release in the frontal cortex can be dynamically up-regulated when working memory or executive demands are high (Aalto, Bruck, Laine, Nagren, & Rinne, 2005). Importantly, the phasic decrease of D1 binding potential in subcortical structures during an interference task has been found to be present in younger but absent in older adults, suggesting a severe age-related deficit in the ability to up-regulate DA release in response to an executively challenging cognitive task (Karlsson et al., 2009).
Furthermore, the frontal brain regions implicated in outcome monitoring, goal selection, goal maintenance, and initiation of adaptive actions during reinforcement learning show a comparatively protracted development (Resnick et al., 2003; Sowell et al., 2003) and early decline (Raz et al., 2005; Volkow et al., 2000) across the life span. Given life span age differences in the prefrontal circuitry at anatomical and neurochemical levels, we hypothesized that prefrontally based executive processes, such as the monitoring of action outcomes as reflected in the FRN, would be inferior in both children and older adults to those of younger adults, or would at least differ from them. Evidence from prior studies reviewed next supports this expectation.
Age Differences in the FRN
Previous findings indicate that older adults' FRN amplitudes are reduced relative to those of younger adults (Eppinger et al., 2008; Mathewson et al., 2008; Pietschmann, Simon, Endrass, & Kathmann, 2008; Nieuwenhuis et al., 2002), whereas children have higher FRN amplitudes than younger adults (Eppinger et al., 2009). The greater FRN amplitude in children has been attributed to their stronger reliance on external feedback (Eppinger et al., 2009), whereas the smaller amplitude in older adults has been related to less efficient phasic dopaminergic signaling of prediction errors (Nieuwenhuis et al., 2002).
Current evidence, however, presents a more complex developmental picture, in particular regarding the difference between FRNs after negative and positive feedbacks. Some studies reported that differences between the FRN after losses and gains are smaller among older adults than among younger adults (Mathewson et al., 2008; Pietschmann et al., 2008; Nieuwenhuis et al., 2002), whereas others found no such differences (Eppinger et al., 2008). As for children, the FRN difference after gains and losses did not differ reliably from the FRN difference observed in younger adults (Eppinger et al., 2009). Differences in experimental designs and procedures between the studies may have contributed to the mixed picture in the present findings. For instance, in some assessments the FRN was based on the difference wave between gain and loss trials (Pietschmann et al., 2008; Nieuwenhuis et al., 2002), whereas in others it was defined separately for gains and losses (Eppinger et al., 2008; Mathewson et al., 2008; Pietschmann et al., 2008). The studies also differ in terms of whether the peak FRN amplitude was chosen relative to a prestimulus baseline (Mathewson et al., 2008; Pietschmann et al., 2008; Nieuwenhuis et al., 2002) or relative to the preceding positive peak (Eppinger et al., 2008). Moreover, recent evidence also shows that the extent of learning affects a positivity onto which the FRN is superimposed in the EEG responses to gains (Eppinger et al., 2008). Taking into account the effects of these different factors, a peak-to-peak measure of the FRN defined separately for each condition thus appears to be the more appropriate measure for comparing the response of the monitoring system to gains and losses across the life span.
Finally, given the well-documented life span changes in dopaminergic neuromodulation at subcortical levels (for reviews, see Bäckman, Lindenberger, Li, & Nyberg, 2010; Bäckman et al., 2006; Benes, 2001), one may wonder how lower levels of dopaminergic modulation in children and older adults would affect outcome monitoring. The effects of DA dips in the midbrain during negative feedback (Schultz, 2002) are assumed to be amplified when tonic levels of DA are low (Frank & Kong, 2008; Frank, Seeberger, & O'Reilly, 2004). Specifically, as DA dips are thought to contribute to learning from negative outcomes through the striatal DA D2 receptors, an amplification of these dips in individuals with lower baseline levels of dopaminergic modulation is assumed to result in a greater reliance on negative feedbacks (Frank & Kong, 2008; Frank, 2005; Frank et al., 2004). In line with this view, older adults and Parkinson patients, whose tonic DA levels are lower, show greater sensitivity to negative outcomes (Frank & Kong, 2008; Frank et al., 2004). There is also more direct evidence on DA's effect on the FRN: DA agonists that presumably reduce phasic DA hamper reinforcement learning from positive outcomes and affect the monitoring of positive outcomes, as indicated by the FRN (Santesso et al., 2009; Pizzagalli et al., 2008).
Aims and Hypotheses of the Present Study
The goal of the present study is to investigate life span age differences in electrophysiological correlates of outcome monitoring during probabilistic reinforcement learning. The monitoring responses after gains or losses as measured with the FRN are assumed to reflect the perceived saliency of positive and negative events for pursuing task-relevant goals and regulating future actions. Smaller differences between the FRN after losses and the FRN after gains are assumed to indicate that the monitoring system is less capable of differentiating between favorable and unfavorable outcomes.
In light of the available evidence on life span anatomical and neuromodulatory differences in brain networks supporting reward processing, we hypothesize that the behavioral and electrophysiological consequences of gains and losses differ less from each other in children and older adults than in adolescents and younger adults. Furthermore, given evidence showing that DA levels in subcortical structures reach their peak only in adolescence and adulthood and start to decline thereafter (for reviews, see Bäckman et al., 2006, 2010; Li, Lindenberger, Nyberg, Heekeren, & Bäckman, 2009; Li, Lindenberger, & Sikström, 2001), we also explored whether children and older adults adjust their choices more frequently after negative than after positive feedback as compared with younger adults and adolescents.
The study sample included 179 participants covering four age groups: 44 children (21 girls, 9–11 years), 45 adolescents (21 girls, 13–14 years), 46 younger adults (22 women, 20–30 years), and 44 older adults (21 women, 65–75 years). Informed consent was obtained from each participant or parent of the participant before testing. The Ethics Committee of the Max Planck Institute for Human Development approved the study. Participants were paid €10 for the first and €7 for every following hour of the experiment. On the basis of earlier life span studies of memory development (Shing, Werkle-Bergner, Li, & Lindenberger, 2008; Brehmer, Li, Müller, von Oertzen, & Lindenberger, 2007) and development of reinforcement learning (e.g., Crone, Jennings, et al., 2004), the age range of children chosen here reflects cognitive development in middle childhood well. The adolescent age range was chosen according to evidence on the development of executive functions (Luciana, Conklin, Hooper, & Yarger, 2005) and reinforcement learning (e.g., Galvan et al., 2006). Furthermore, given that mechanisms influencing outcome monitoring, such as stimulus-response monitoring and attention, develop more rapidly from childhood to early adulthood than from early to late adulthood (e.g., Waszak, Li, & Hommel, 2010; Li, Hämmerer, Müller, Hommel, & Lindenberger, 2009; Li et al., 2004), the age ranges of the groups (i.e., differences between the minimum and the maximum ages for a given group) were smaller in the children and adolescents than in younger and older adults (children: mean age = 9.88 years, SD = 0.57, range = 2.17 years; adolescents: mean age = 14.03 years, SD = 0.58, range = 2.25 years; younger adults: mean age = 24.12 years, SD = 1.86, range = 6.92 years; older adults: mean age = 70.33 years, SD = 2.84, range = 9.92 years). One child, one younger adult, and one older adult were excluded from the analyses because they did not reach the minimum learning criteria (see below).
Also, one adolescent as well as two older adults were excluded because of technical problems during EEG recordings. The educational level of the sample was comparatively high. The majority of the children were still attending elementary school (79%), and the rest were already attending the gymnasium, the college preparatory track of high school. Most of the adolescents were attending the gymnasium (78%), the majority of the younger adults were enrolled at a university (91%), and most of the older adults held either an academic high school diploma (46%) or a vocational school diploma (43%). All subjects were right-handed (Oldfield Questionnaire: LQ > 80; Oldfield, 1971).
In addition to the above demographic variables, we also collected data to characterize the sample's general cognitive abilities. In a separate test session prior to the EEG experiment, two main aspects of intelligence, fluid and crystallized intelligence (Horn, 1989), were assessed, respectively, with the Digit Symbol Substitution test (Wechsler, 1981) as a marker of perceptual speed and a modified version of the Spot-a-Word test (Lehrl, 1977; see also Lindenberger, Mayr, & Kliegl, 1993) as a marker of verbal knowledge. According to the two-component theories of life span cognition (Lindenberger, 2001; Baltes, 1987), fluid intelligence relies relatively more on basic cognitive mechanics, whereas crystallized intelligence depends relatively more on experience and acquired knowledge. Thus, it can be expected that the life span trajectories of marker tests of fluid intelligence match the maturation- and senescence-related age trajectories of brain development more closely than do the life span trajectories of measures of crystallized intelligence. In our sample, Digit Symbol Substitution performance increased from childhood to early adulthood and decreased from early to late adulthood (planned contrast: t = 11.6, p < .01, d = 1.84). Performance on the Spot-a-Word test increased with age, χ²(3, n = 182) = 125.4, p < .01. The observed dissociation between life span age gradients of these two tests in our sample is in agreement with well-established empirical evidence on the development of crystallized and fluid intelligence obtained in larger and more representative samples (e.g., Li et al., 2004).
The Reinforcement Learning Task and Experimental Procedure
Participants were seated in a comfortable manner in front of a computer in an electrically and acoustically shielded room. The distance to the computer screen was 80 cm. Each session started with a relaxation phase of 3 minutes (1.5 minutes with eyes closed and 1.5 minutes with eyes open). Participants were then asked to work on a modified probabilistic reinforcement learning task (after Frank et al., 2004). During the task, participants were presented with different pairs of Japanese characters that were each associated with probabilistic gains and losses. Choosing one of the two symbols resulted in the participant either gaining or losing 10 points. However, within each pair, one symbol had a higher probability of resulting in a gain than did the other symbol.
Distinctiveness of Choice Pairs Defined by Reward Probability
There were three types of choice pairs that differed with respect to the difference in gain and loss probabilities assigned to the two symbols. The pair with the highest distinctiveness in reward probability had an 85% probability of yielding a gain if the symbol with the higher reward likelihood (i.e., the “good” option) was chosen and, conversely, a 15% probability of yielding a gain if the symbol with the lower reward likelihood (i.e., the “bad” option) was chosen. The pair with the medium distinctiveness in reward probability had a 75% probability of gains when choosing the good option and a 25% probability of gains when choosing the bad option. The pair with the lowest distinctiveness had a 65% probability of gains when choosing the good option and a 35% probability of gains when choosing the bad option.
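The reward schedule described above can be illustrated with a minimal Python sketch. It assumes that each choice is an independent draw with the stated gain probability; the names `GAIN_PROBABILITY`, `draw_feedback`, and `expected_points` are hypothetical and are not taken from the original task software.

```python
import random

# Gain probabilities for the (good, bad) option of each pair type, as stated in the text.
GAIN_PROBABILITY = {
    "high": (0.85, 0.15),
    "medium": (0.75, 0.25),
    "low": (0.65, 0.35),
}

def gain_probability(pair_type, chose_good):
    """Probability that the chosen option of the given pair type yields a gain."""
    p_good, p_bad = GAIN_PROBABILITY[pair_type]
    return p_good if chose_good else p_bad

def draw_feedback(pair_type, chose_good, rng):
    """Return +10 (gain) or -10 (loss) for one choice.

    Assumption: every trial is an independent random draw.
    """
    return 10 if rng.random() < gain_probability(pair_type, chose_good) else -10

def expected_points(pair_type, chose_good):
    """Expected points per trial for one option: 10 * p(gain) - 10 * p(loss)."""
    p_gain = gain_probability(pair_type, chose_good)
    return 10 * p_gain - 10 * (1 - p_gain)
```

Under these assumptions, the expected payoff of the good option shrinks from 7 points per trial in the high-distinctive pair to 3 points per trial in the low-distinctive pair, which is one way to see why the low-distinctive pair is harder to learn.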
Participants were instructed to try to collect as many points as possible and to identify the good option within each pair. Within a block of 60 trials, trials of each of the three different types of pairs were presented in a mixed order. Hence, within one block of trials, 20 trials of each of the three pair types were presented. The sequence according to which the different pairs were presented was pseudorandomized, such that each pair was presented twice before the next pair appeared. Pilot studies showed that this pseudorandomization resulted in more suitable difficulty levels for the participants of all four age groups in light of age differences in task-set switching and working memory (e.g., Kray, Eber, & Karbach, 2008; Eppinger, Kray, Mecklinger, & John, 2007; Kray & Lindenberger, 2000; Babcock & Salthouse, 1990). After each block of 60 trials, the proportion of good choices (i.e., when the good option within the pair was selected by the participants) was assessed. Learning criteria were based on the number of good choices made for each type of pair within the block. For the high-distinctive pair, participants had to make good choices 75% of the time; for the medium-distinctive pair type, 70%; and for the low-distinctive pair, 65%. After participants had fulfilled the learning criteria for all three types of choice pairs, a new set of three choice pairs was provided, for which the participants once again had to identify the good option to maximize gains. A maximum of three different sets of the three types of choice pair could be learned. This procedure resulted in participants completing different numbers of blocks depending on how quickly they managed to reach the learning criteria, with slower learning resulting in more learning blocks. The minimum number of blocks was thus three blocks, whereas the maximum number of blocks was 12, regardless of whether the learning criterion had been attained. 
This approach was chosen to ensure that despite the expected age differences in the speed of learning, the behavioral and electrophysiological data collected during the task reflected learning from negative and positive feedbacks in all age groups. Participants who had not managed to fulfill the learning criteria for at least one set of three pairs were excluded from further analyses. This was the case for one child, one younger adult, and one older adult.
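The block-wise learning criteria can be sketched as follows. The sketch assumes that a criterion counts as met when the percentage of good choices is at least the stated threshold (the text does not say whether the comparison is strict); `CRITERION_PERCENT` and `block_meets_criteria` are hypothetical names.

```python
# Minimum percentage of good choices per pair type within one block (from the text).
CRITERION_PERCENT = {"high": 75, "medium": 70, "low": 65}

def block_meets_criteria(choices):
    """Check whether all three pair types reached their criterion in one block.

    choices maps each pair type to a list of booleans, one per trial,
    where True means the good option was chosen.
    Assumption: "at least" the threshold percentage counts as reaching it.
    """
    for pair_type, threshold in CRITERION_PERCENT.items():
        trials = choices[pair_type]
        if not trials:
            return False
        # Integer arithmetic avoids floating-point issues at the threshold.
        if sum(trials) * 100 < threshold * len(trials):
            return False
    return True
```

With 20 trials per pair type in a block, these thresholds correspond to at least 15, 14, and 13 good choices for the high-, medium-, and low-distinctive pairs, respectively.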
Participants had 20 practice trials before starting with the task. The complete task took approximately 20 to 60 minutes, depending on the number of blocks needed. After each block, the participants were allowed to take a short break, deciding themselves how long they felt they needed before moving on to the next block. During this break, they also received summary feedback informing them how many points they had lost and gained in the preceding block as well as how many points they had collected altogether up to that point. If they had reached the learning criteria for all three types of pairs in the preceding block, they were also informed that a new set of three pairs would be presented in the following block.
Trial Time Course
At the beginning of each trial, a pair of Japanese characters (symbols) was shown on the screen until one of the symbols was chosen. The symbols used in the task were 1.07° × 1.07° and were presented close to the center of the screen. The symbol chosen then disappeared 500 msec after the choice had been made and was replaced by a feedback: in the case of a gain, a green “+10” was presented, and in the case of a loss, a red “−10” was presented. The feedback was 0.71° × 1.07° and remained on the screen for 1000 msec. After this, a fixation cross (0.35° × 0.35°) was presented for 1000 msec before the next trial started. The total trial duration thus amounted to 2500 msec plus the participant's RT to the pair of Japanese characters.
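As a small worked example of the timing arithmetic above, the following Python sketch adds the three fixed intervals to the response time. The constant and function names are hypothetical.

```python
# Fixed trial intervals in milliseconds, as described in the text.
CHOICE_TO_FEEDBACK_MS = 500   # chosen symbol stays on screen after the choice
FEEDBACK_MS = 1000            # "+10" or "-10" feedback display
FIXATION_MS = 1000            # fixation cross before the next trial

def trial_duration_ms(rt_ms):
    """Total trial duration: response time plus the 2500 msec of fixed intervals."""
    return rt_ms + CHOICE_TO_FEEDBACK_MS + FEEDBACK_MS + FIXATION_MS
```

For instance, a response time of 800 msec would give a total trial duration of 3300 msec.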
EEG Recordings and Analyses
EEG was recorded from 64 Ag/AgCl electrodes placed according to the 10-10 system in an elastic cap (Braincap, BrainVision), using BrainVision Recorder. The sampling rate was 1000 Hz, with a band-pass filter applied in the range of 0.01 to 250 Hz. EEG recordings were referenced on-line to the right mastoid. The ground was positioned above the forehead. Impedances were kept below 5 kΩ. Vertical and horizontal EOGs were recorded next to each eye and below the left eye.
Using BrainVision Analyzer (Gilching, Germany), the recorded data were rereferenced to the linked mastoid reference. Further EEG analyses were conducted using the Fieldtrip software (http://www.ru.nl/fcdonders/fieldtrip) supplemented by in-house written code and EEGLAB (Delorme & Makeig, 2004). The data were segmented into epochs of 2 sec before and 2.5 sec after the onset of the feedback. Epochs or channels with severe muscular artifacts or saturated recordings were manually excluded. An average of 7.3% of the trials had to be removed from the EEG data (children = 9.2%, adolescents = 5.8%, younger adults = 6.5%, and older adults = 5.1%). The number of rejected trials per condition was included as a covariate in the repeated measures MANOVA. All the main and interaction effects in the EEG data reported below proved to be robust with respect to individual differences in the number of rejected trials because of artifacts and differences in the percentage of loss trials. Because of age differences in the number of blocks required to reach the learning criteria, the average number of trials available for subsequent analyses differed across the age groups. Groups with slower learning performed more trials in this case. For choices resulting in a loss, the mean (SD) numbers of trials for the four groups were as follows: children = 123 (69) trials, adolescents = 96 (65) trials, younger adults = 83 (63) trials, and older adults = 178 (79) trials. For choices resulting in a gain, the mean (SD) numbers of trials were as follows: children = 200 (86) trials, adolescents = 174 (85) trials, younger adults = 157 (73) trials, and older adults = 285 (104) trials. Despite these age differences in the mean number of valid trials, the measurement reliability of ERPs did not differ between groups. In a related study, the stability of the measurement of the ERP amplitudes reported in the present study was assessed by adding a retest session two weeks later. 
Multigroup analyses showed that the 2-week test–retest stability did not differ significantly across the age groups (Hämmerer, Li, Völkle, Müller, & Lindenberger, in preparation).
The preprocessed data were subjected to an ICA decomposition using EEGLAB (Delorme & Makeig, 2004). ICA components representing ocular and muscular artifacts were then removed from the data. The recombined data were band-pass filtered in the range of 0.5 to 25 Hz and epoched from 100 msec before to 1000 msec after the feedback onset. Baseline corrections were applied on the epoched data with respect to the 100-msec prestimulus baseline. ERPs were obtained by first averaging across trials for each electrode and condition for each participant and then across participants within each age group. Latencies and amplitudes of the P2 and N2 components after the feedbacks were defined for each condition as the most positive (P2) or most negative (N2) peaks in the individual averages in the time windows 100–250 and 250–350 msec, respectively (for comparable time windows in developmental studies of EEG components to positive and negative feedback, see Eppinger et al., 2008; Nieuwenhuis et al., 2002). Following Yeung and Sanfey (2004), the FRN was defined as the difference in amplitude between the P2 and the N2 peaks. To compare the enhancement of the FRN after losses in relation to the FRN after gains independent of age differences in the FRN base amplitude, a ratio score, defined as (loss FRN − gain FRN) / gain FRN, was calculated for each participant. In addition, difference waves were calculated by subtracting the ERP after gains from the ERP after losses. The most negative peak at electrode Fz in a time window 200 to 400 msec after the feedback was taken as the peak of the difference wave. For the difference waves as well, a ratio score, defined as peak difference wave / gain FRN, was calculated for each participant to compare the difference in reactions to gains and losses independent of age differences in the FRN base amplitude.
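The peak-to-peak FRN measure and the ratio score can be illustrated with a minimal Python sketch operating on a single averaged waveform. It is not the authors' Fieldtrip/EEGLAB code. It assumes that both time windows include their end points and that the FRN is expressed as P2 amplitude minus N2 amplitude (so larger values mean a larger FRN); all function names are hypothetical.

```python
def peak_in_window(times_ms, amplitudes_uv, start_ms, end_ms, polarity):
    """Return (latency, amplitude) of the most positive or most negative sample.

    Assumption: the window includes both end points.
    """
    window = [(t, a) for t, a in zip(times_ms, amplitudes_uv)
              if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples in the requested time window")
    pick = max if polarity == "positive" else min
    return pick(window, key=lambda sample: sample[1])

def peak_to_peak_frn(times_ms, amplitudes_uv):
    """FRN as the amplitude difference between the P2 and N2 peaks.

    P2: most positive peak in 100-250 msec; N2: most negative peak in 250-350 msec.
    Assumption: sign convention P2 minus N2.
    """
    _, p2 = peak_in_window(times_ms, amplitudes_uv, 100, 250, "positive")
    _, n2 = peak_in_window(times_ms, amplitudes_uv, 250, 350, "negative")
    return p2 - n2

def frn_ratio(loss_frn, gain_frn):
    """Ratio score from the text: (loss FRN - gain FRN) / gain FRN."""
    return (loss_frn - gain_frn) / gain_frn
```

Dividing by the gain FRN expresses the loss-related enhancement relative to each participant's own base amplitude, which is what makes the score comparable across age groups with very different overall FRN sizes.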
The data were analyzed using SPSS (Release 15.0.0, September 6, 2006; SPSS Inc., Chicago, IL) and SAS (SAS 9.1.3, Windows Version 5.2.3790; Cary, NC). Deviations from normality were corrected by transforming the data using arcsin transformation. In the case of unequal variances, tests that allow for unequal variances, such as the SAS PROC MIXED procedure, were used. Nonparametric tests (Kruskal–Wallis test and Mann–Whitney test) were used when transformations failed to establish the normality of the data in all four age groups.
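The text does not specify which variant of the arcsine transformation was applied. As an illustration only, the following Python sketch shows the common arcsine square-root transformation for proportions; treating this as the variant used here is an assumption, and `arcsine_sqrt` is a hypothetical name.

```python
import math

def arcsine_sqrt(proportion):
    """Arcsine square-root transform of a proportion in [0, 1].

    Assumption: this common variant stands in for the unspecified
    "arcsin transformation" mentioned in the text.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(proportion))
```

The transform stretches values near 0 and 1, which tends to stabilize the variance of proportion data such as the proportion of good choices or of switches.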
As previous studies showed clear age differences in EEG scalp distributions across the life span (e.g., Müller, Brehmer, von Oertzen, Li, & Lindenberger, 2008), we analyzed the data at the single electrode level rather than clustering the electrodes. MANOVA analyses were performed for each ERP component and experimental condition on 25 leads (F7, F3, Fz, F4, F8, FT7, FC3, FCz, FC4, FT8, T7, C3, Cz, C4, T8, TP7, CP3, CPz, CP4, TP8, P7, P3, Pz, P4, and P8), including age group as the between-subject factor (children, adolescents, younger adults, and older adults) and laterality (five levels: left, medium-left, midsagittal, medium-right, and right) as well as anterior-posterior (five levels: frontal, fronto-central, central, centro-parietal, and parietal) as the within-subject factors. To identify the electrodes with the maximal effects in each age group, further MANOVA analyses were performed separately for each age group and condition.
The EEG responses from the electrodes with the maximal effects within each age group were then compared across the four age groups for the experimental conditions outlined in the Results section using multivariate repeated measures analyses of variance. To further characterize the patterns of differences between age groups, the reliable main effects of age group were followed up by two planned contrasts. The first contrast tested for a linear pattern across the age groups and the second for a curvilinear pattern across the age groups. Significant interaction effects of age group and condition were further analyzed using paired samples t tests to assess the difference between the levels of the condition factor within each age group. Regarding effect size measures, the intraclass correlation coefficient rI was calculated as the effect size indicator for MANOVA, Cohen's d was calculated as the effect size indicator for planned contrasts and pairwise comparisons, and Pearson's r was computed as an effect size for correlational analyses. Effect sizes for the Mann–Whitney test were calculated by converting the test statistic into a z score and dividing it by the square root of the number of total observations.
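The effect size for the Mann–Whitney test described above (a z score divided by the square root of the total number of observations) can be sketched as follows. How the z score was obtained is not specified in the text; the sketch assumes the standard normal approximation without tie or continuity correction, and the function names and data are hypothetical.

```python
import math


def mann_whitney_u(sample_x, sample_y):
    """U statistic for sample_x: count of pairs with x > y, ties counted as 0.5."""
    u = 0.0
    for x in sample_x:
        for y in sample_y:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def effect_size_r(u, n_x, n_y):
    """Return (z, r), where r = |z| / sqrt(N) and N = n_x + n_y.

    Assumption: z comes from the normal approximation of U with no
    tie or continuity correction.
    """
    mean_u = n_x * n_y / 2.0
    sd_u = math.sqrt(n_x * n_y * (n_x + n_y + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return z, abs(z) / math.sqrt(n_x + n_y)


# Made-up ratio scores for two small groups.
group_a = [0.1, 0.2, 0.3, 0.4]
group_b = [0.35, 0.5, 0.6, 0.7, 0.8]

u_stat = mann_whitney_u(group_a, group_b)
z_score, r_effect = effect_size_r(u_stat, len(group_a), len(group_b))
print(u_stat, z_score, r_effect)
```

Because r is scaled by the total sample size, it can be compared across pairwise contrasts that involve groups of different sizes.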
In a second analysis of the behavioral data, we investigated the effect of outcome valence on choice behavior by analyzing the proportion of switches to the other option within a pair after gain or loss feedback (Figure 1B; for details, cf. Frank, Moustafa, Haughey, Curran, & Hutchison, 2007). On the basis of the lower levels of subcortical dopaminergic modulation during childhood (e.g., see Benes, 2001 for review) and old age (e.g., for reviews, see Bäckman et al., 2010; Li, Lindenberger, et al., 2009) and the greater impact of loss-related dips during lower DA levels (cf. Frank et al., 2004, 2007), we hypothesized that children and older adults should be relatively more sensitive to losses than to gains as compared with adolescents and younger adults. As expected, participants switched more frequently to the other option after losses than after gains, F(1, 136) = 1632.05, p < .01, rI = .92, for arcsine-transformed data (see Figure 1B). Furthermore, there was also a reliable main effect of age group, F(3, 72) = 11.09, p < .01, rI = .32, which was reflected in a curvilinear life span pattern, t = −5.49, p < .01, d = .91. These results indicated that overall, irrespective of gains or losses, children and older adults switched more frequently to the other option than adolescents and younger adults. Of particular interest, we also observed a significant interaction between age group and outcome valence, F(3, 62.5) = 4.82, p < .01, rI = .19. This interaction was also reflected in the significant, albeit smaller, curvilinear contrast effect for the differences between the age groups for switching after losses, t = −3.42, p < .01, d = .57, than for switching after gains, t = −6.91, p < .01, d = 1.16 (Figure 1B).
Hence, the switching behavior of children and older adults differed more from adolescents and younger adults after gains than after losses (cf. Frank & Kong, 2008; Frank et al., 2004), suggesting that gain outcomes affected future choices less than loss outcomes in children and older adults as compared with adolescents and younger adults.
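The switching measure used in this analysis can be sketched as follows for the presentations of a single stimulus pair. This is an illustration under stated assumptions rather than the original scoring procedure: the function name, the coding of choices as "A"/"B", and the coding of outcomes as "gain"/"loss" are hypothetical, and the trial sequence is made up.

```python
def switch_proportions(choices, outcomes):
    """Proportion of switches to the other option after gains and after losses.

    choices:  option chosen on each presentation of one pair, in order.
    outcomes: "gain" or "loss" received on each of those presentations.
    A switch is scored when the choice on the next presentation differs
    from the current one; it is attributed to the current outcome.
    """
    counts = {"gain": [0, 0], "loss": [0, 0]}  # [switches, opportunities]
    for prev_choice, prev_outcome, next_choice in zip(choices, outcomes, choices[1:]):
        counts[prev_outcome][1] += 1
        if next_choice != prev_choice:
            counts[prev_outcome][0] += 1
    return {
        outcome: (switches / total if total else float("nan"))
        for outcome, (switches, total) in counts.items()
    }


# Made-up sequence of six presentations of one pair.
choices = ["A", "A", "B", "B", "A", "A"]
outcomes = ["gain", "loss", "loss", "gain", "gain", "loss"]
print(switch_proportions(choices, outcomes))
```

In this toy sequence, one of three gains and one of two losses is followed by a switch. The outcome of the final presentation is ignored because no later choice exists to score it against.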
Confirming prior evidence on the life span development of the N2 to feedback stimuli, we found a linear decrease in FRN amplitude with increasing age, t = 14.94, p < .01, d = 3.34. Furthermore, in line with the assumption that the monitoring system reacts more strongly to events that are aversive for the current task goal, the FRN amplitude was larger after losses than after gains; main effect for feedback type, F(1, 120) = 117.46, p < .01, rI = .70. This difference in reaction to gains and losses differed across the life span, as indicated by a significant interaction effect of age group and feedback type, F(3, 81.7) = 15.04, p < .01, rI = .60. Separate comparisons of the two feedback conditions conducted for each age group showed that the difference between the FRN amplitudes after gains and losses was smallest in older adults: children, t = 6.79, p < .01, d = 2.07; adolescents, t = 5.21, p < .01, d = 1.57; younger adults, t = 7.45, p < .01, d = 2.20; and older adults, t = 2.52, p = .02, d = .77. As can be seen in Figure 2, a similar pattern is observed in the difference wave measures of the loss minus the gain ERPs. Here, older adults show the smallest absolute value in the difference wave (children = −4.2 μV, adolescents = −4.7 μV, younger adults = −4.5 μV, and older adults = −2.17 μV; main effect for age group, F(3, 177) = 5.93, p < .01, rI = .33).
To compare the difference in monitoring gains and losses across the age groups, a ratio score was calculated (see Methods). This ratio score reflects the enhancement of the FRN after losses compared with the FRN after gains, taking into account the baseline amplitude of the FRN for each age group. The four age groups differed significantly in this ratio score, χ2(3, n = 176) = 19.21, p < .01. Pairwise comparisons with the Mann–Whitney test revealed that younger adults showed a significantly larger relative enhancement in FRN after losses than children, U = 609, p = .01, r = .27, and older adults, U = 428, p < .01, r = .42, as well as a trend of a larger FRN after losses relative to that after gains in adolescents, U = 702, p = .08, r = .19.
Again, a similar pattern is apparent in the difference wave between losses and gains when taking into account the baseline amplitude of the FRN after gains. The four age groups differed significantly in this ratio score, χ2(3, n = 176) = 20.17, p < .01. Also, in line with the results on the basis of the difference between the peak-to-peak measures, the ratio score of the difference wave measure was largest in younger adults as compared with the other three age groups (children vs. younger adults: U = 408, p < .01, r = .45; adolescents vs. younger adults: U = 563, p = .01, r = .30; and older adults vs. younger adults: U = 569, p = .02, r = .26).
Finally, it was of interest to investigate whether more differentiated monitoring reactions after gains and losses are related to better learning from performance feedback. To this end, for each age group, we examined the correlations between interindividual differences in the ratio score on the basis of the peak-to-peak measure, indicating the relative salience of losses versus gains, and interindividual differences in the mean frequency of choosing the good option. In the hardest condition (65–35% difference in reward probability), older adults with more differentiated FRN responses to gains and losses chose the good option more frequently, hence exhibiting superior learning from positive and negative feedback (r = .40, p = .01). This relation was not reliable in the other age groups: children, r = −.04, p = .83; adolescents, r = −.16, p = .31; younger adults, r = .01, p = .96. The correlation in the sample of older adults was robust to controlling for digit symbol scores (see Methods) and age differences within age group.
This study investigated life span changes in monitoring positive and negative outcomes during probabilistic reinforcement learning. ERPs were recorded to assess processing differences as a function of outcome valence and the degree of differences in reward probability between choice options. Two sets of findings were observed. First, the amplitude of the FRN after gains or losses was found to decrease monotonically from childhood to old age. Second, relative to adolescents and younger adults, children and older adults (a) showed smaller differences between the FRN after losses and the FRN after gains, (b) needed more trials to learn from choice outcomes, particularly when differences in reward likelihood between the choices were small, and (c) showed relatively less trial-to-trial learning from gains than from losses. In the following, each of these findings is addressed in turn.
FRN Decreases from Childhood to Early Adulthood and Old Age
The amplitude of the FRN after losses or gains decreased with increasing age in a monotonic fashion. This finding supports previous evidence of a larger FRN in reaction to positive and negative feedback in children (Eppinger et al., 2009). Furthermore, the larger monitoring reaction to outcomes in children dovetails with evidence suggesting a greater orientation toward external feedback in behavioral control as compared with internal control processes during childhood. Specifically, in a cued go/no-go task, children show stronger reactions to cue stimuli and imperative stimuli and weaker indices of internal motor control such as ERP components reflecting response preparation or response inhibition (cf. Hämmerer, Li, Müller, & Lindenberger, submitted; Jonkman, 2006). This stronger reaction to external feedback in children as compared with adult age groups is assumed to be compensatory and related to a not-yet-fully-developed ability of children to exert internal motor control (Luna & Sweeney, 2004). In line with these considerations, the stronger reaction to feedback or cue stimuli in children has been attributed to a greater sensitivity to external as compared with internal feedback (Crone, Somsen, Zanolie, & van der Molen, 2006).
The observed decrement in FRN amplitude among older adults is also consistent with earlier evidence (Eppinger et al., 2008; Pietschmann et al., 2008; Nieuwenhuis et al., 2002) and may point to a reduced signaling of prediction errors because of a less reactive dopaminergic system (Nieuwenhuis et al., 2002) or a weaker attentional focus (Hämmerer, Li, Müller, et al., submitted) in the elderly.
Less Differentiated Gain/Loss Distinction in Children and Older Adults
On the basis of the work of Holroyd et al. (2004, 2006), we assumed that the FRN does not represent feedback information as such but indicates whether the information from a feedback is beneficial for a predefined goal. Hence, we computed a ratio score to assess the differences between reactions to gains and losses independently of the aforementioned age differences in the overall amplitude of the FRN. This ratio score indicates the relative increase in the FRN after losses relative to the FRN after gains. Differences in the processing of positive or negative feedback were assumed to result in greater differences between electrophysiological signals for gains and losses. As expected, the relative increase of the FRN after a loss was reliably larger in younger adults than in children and older adults. Hence, the amplitude of the FRN differentiated less well between gains and losses in both children and older adults than in younger adults, although children showed the largest FRN response in general. This pattern suggests that despite the apparent stronger sensitivity to external feedback in children (Eppinger et al., 2009; Crone et al., 2006), the focus of the maturing outcome monitoring system does not yet yield a differentiated classification of the outcomes with respect to the individual task goals.
Interestingly, a similar pattern of results has recently been observed in studies investigating heart rate changes after positive and negative feedback during decision-making (Crone, Jennings, et al., 2004; Crone, Somsen, van Beek, & van der Molen, 2004; Crone et al., 2003). A slowing of the heart rate was found to follow informative negative but not positive feedbacks (Crone et al., 2003). The relative slowing was less pronounced in children than in adults (Crone, Jennings, et al., 2004), mirroring the less differentiated EEG signals after gains or losses observed in the present study. Furthermore, upon presentation of noninformative positive and negative feedbacks, children slowed down more after negative feedbacks, whereas adults did not differentiate negative from positive feedback when the feedbacks themselves were not informative (Crone, Jennings, et al., 2004). Presumably, children's reactions to feedbacks are less related to actually using the information provided by the feedback to adapt future actions. The results by Crone, Jennings, et al. (2004) match nicely with the high but less differentiated EEG signals for feedbacks in children observed in the present study.
With respect to older adults, the less differentiated reactions to gains and losses are consistent with results from a study by Mathewson et al. (2008), who found that source activations in ACC for loss-related and gain-related feedback were more similar to each other in older adults than in younger adults. In addition, we recently found that older adults who are better able to maintain an attentional focus also show stronger monitoring reactions (Hämmerer, Li, Müller, et al., submitted). The weak monitoring reaction to performance feedback in older adults observed in this study might thus reflect a weaker attentional processing of the outcomes. Furthermore, the less differentiated reaction to positive and negative outcomes suggests a reduced focus of the monitoring system in classifying the outcomes according to task-specific goals in normal aging. In this context, it is worth noting that older adults with less differentiated reactions to gain and loss outcomes also learned less from performance feedback (see below).
Age Differences in Reinforcement Learning and Its Relation to Outcome Monitoring
In line with previous findings, children and older adults needed more trials than younger adults and adolescents when using probabilistic feedback to identify the option that is more likely to be rewarded (cf. Eppinger et al., 2008, 2009; Marschner et al., 2005; Mell et al., 2005). This age difference in learning increased with decreasing differences in reward likelihood between the two choices. Hence, we conclude that learning from probabilistic feedback is especially difficult for children and older adults when the differences in reward probabilities between choice options are small.
In older adults, differences in the FRN after gains and losses as reflected in the ratio score were related to the performance on the reinforcement learning task. More specifically, older adults with a lower ratio score (i.e., older adults with less distinctive gain- and loss-related FRNs) chose the good option less frequently on the pairs with the less distinctive reward probabilities of the two options. This suggests that the dedifferentiated monitoring reaction to feedback in older adults is accompanied by a decreasing ability to learn from feedback. Again, this finding is in line with a recent study showing that a larger FRN after negative feedback is related to lower error rates in the elderly and not in the younger adults (Mathewson et al., 2008).
We note that children and older adults, who showed less learning from probabilistic feedbacks, also showed the least differentiated reactions to gains and losses as indexed by the FRN ratio score. It is hard to tell whether lower learning rates are due to less differentiated monitoring signals or whether the lack of differentiation between reactions to gains and losses is an epiphenomenon of lower performance. Recent findings are in favor of the former interpretation, showing that the FRN amplitude does not change in the course of learning and is thus independent of how much is known about the probable outcomes of an option (Eppinger et al., 2008). According to this, a monitoring system capable of differentiating between responses to gains and losses is a prerequisite rather than a symptom of efficient learning. Nevertheless, other neural processes or cognitive functions may influence the operation of monitoring mechanisms and thus be reflected in both behavioral performance during reinforcement learning and monitoring signals. For instance, we recently concluded that attentional impairments contribute to a weakening of monitoring signals in old age (Hämmerer, Li, Müller, et al., submitted).
Relative to Younger Adults, Children and Older Adults Choose Less Well after Gains Than after Losses
In addition to how frequently the good option was chosen, the choices made after gains and losses were analyzed to investigate how strongly recent gains and losses influenced choice behavior in the different age groups. In these analyses, the frequency of switching to the other option within a pair after a loss on the preceding presentation of the pair was compared with the frequency of switching after a gain (for a prior report on these measures, see Frank et al., 2007). The results show that, overall, children and older adults switched to the other option more frequently, irrespective of whether the prior feedback indicated a gain or a loss. Hence, independent of the feedback received, children and older adults alternated to a greater extent between the two options of a pair than adolescents and younger adults, thus demonstrating choice behavior that was more random and feedback unrelated. Furthermore, this more frequent switching in children and older adults—as compared with adolescents and younger adults—was more pronounced after gains than after losses. This result suggests that gain outcomes affected subsequent choice behavior less strongly in children and older adults than in younger adults and adolescents. Thus, the sensitivity to gains and losses differed across the age groups, with children and older adults being comparatively less sensitive to gains than to losses.
Recently, it has been shown that an increased sensitivity to negative feedback is related to a larger difference in FRN in response to losses and gains (Frank et al., 2005). These findings would suggest that, compared with adolescents and younger adults, children and older adults should show an increased difference in the reaction to negative feedback in comparison with positive feedback. Furthermore, those participants within each age group who seem, on the basis of their switching behavior after gains and losses, to be more sensitive to negative rather than to positive feedback should also show a larger difference in their monitoring reaction to losses and gains. Such a negativity bias effect, however, could not be observed in our data, perhaps reflecting insufficient statistical power to detect the effect (cf. Frank et al., 2005, who report rI = .28, p = .05, in a sample of 65 participants). Also, in the case of the children and the older adults, the relatively stronger effect of a reduction in the distinctiveness of the monitoring reaction might compromise the detection of the negativity bias effect.
We suggest that the greater sensitivity of children and older adults to loss outcomes may result from lower levels of DA in the maturing and aging brain (Bäckman et al., 2006; Diamond et al., 2004; Diamond, 2002). Lower levels of DA in Parkinson's patients off medication have been shown to be related to better learning from negative feedback, an effect that is assumed to result from the relative accentuation of phasic DA responses to negative feedback when the baseline DA level is low (Frank & Kong, 2008; Frank et al., 2004, 2007). Our findings thus agree with earlier findings of a greater sensitivity to losses in older seniors as compared with younger seniors (Frank & Kong, 2008) and show that this relationship between low DA levels and greater sensitivity to negative outcomes also holds in children. The latter finding corroborates evidence of stronger activations during negative feedback processing in children as compared with adults (van Leijenhorst, Crone, & Bunge, 2006).
This study investigated life span age differences in the monitoring reaction to positive and negative feedback during a probabilistic reinforcement learning task. Despite the fact that children show stronger monitoring signals to feedback than younger adults, the ability of their monitoring system to focus on a classification of the feedback according to task-relevant goals appeared to be not yet fully developed. In contrast, in older adults, weaker monitoring signals were observed as well as less differentiated reactions to gain and loss outcomes. Given that older adults with less differentiated monitoring signals also had more difficulties in learning from performance feedback, a weaker attentional processing of outcomes or reduced focus of the monitoring system might be present. In addition to these life span differences in general aspects of feedback processing, our study showed that the sensitivity to the valence of a feedback differs across age groups. When compared with adolescents and younger adults, age groups with lower levels of dopaminergic modulation (children and older adults) appeared to be less sensitive to positive than to negative feedback.
However, it needs to be noted that the few existing studies on life span differences in outcome sensitivity have generated mixed results, especially for children, where fMRI data suggest that children show greater activation than younger adults in dorsolateral prefrontal and orbito-frontal areas in response to negative feedback (van Leijenhorst et al., 2006) but also in response to positive feedback (van Duijvenvoorde, Zanolie, Rombouts, Raijmakers, & Crone, 2008; Galvan et al., 2006). Furthermore, children tend to show greater activations in dorsolateral prefrontal and orbito-frontal areas than in medial frontal areas during feedback processing as compared with younger adults. Recent evidence shows that outcomes resulting from one's own actions are processed in medial prefrontal areas, whereas outcomes that are the result of instructed actions elicit stronger activity in orbito-frontal areas (Walton et al., 2004). One might thus suggest that children are less aware of the contingencies between the feedbacks and their own choices. Switching away from negative feedback could then reflect different processes in children than in younger adults, with children merely reacting to negative events, whereas younger adults would show more adaptive behavior by reacting to negative performance feedback. Future studies should address this issue by systematically varying the action contingency of feedback in a life span sample.
The assessment of EEG responses to positive and negative feedback provided some initial insights into how changes in the development of feedback processing across the life span might relate to age differences in behavioral correlates of reinforcement learning. Future studies should aim at clarifying which cognitive functions contribute to the observed age differences in the processing of gains and losses (most importantly attention and working memory) and also investigate in more detail (e.g., by assessing individual differences in dopaminergic modulation in genetic polymorphisms) to what extent differences in dopaminergic modulation influence reinforcement learning across the life span.
This work was supported by the German Research Foundation's grant for the research group on Conflicts as Signals (DFG FOR 778). The authors gratefully thank Beate Czerwon, Viola Störmer, Katja Zschenderlein, Minh Tam Luong, Natalie Trumpp, and Katja Breitenbach as well as all other research assistants for their valuable support during data collection. The authors appreciate Bernd Wischnewski, Markus Bauer, and Markus Werkle-Bergner for their technical support on EEG recordings and data analyses. Parts of this article were written while UL was a fellow at the Center for Advanced Study in the Behavioral Sciences at Stanford. The authors thank Guido Biele, Lea Krugel, Julius Verrel, and members of the “Neuromodulation of Attention and Perception” project for fruitful discussions.
Reprint requests should be sent to Dorothea Hämmerer or Shu-Chen Li, Center for Lifespan Psychology, Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin, Germany, or via e-mail: firstname.lastname@example.org; email@example.com.