According to the reinforcement learning account of the error-related negativity (ERN), the ERN is a manifestation of a signal generated in ACC as a consequence of a phasic decrease in the activity of the mesencephalic dopamine system occurring when the monitoring system evaluates events as worse than expected. This signal is also hypothesized to be used to modify behavior to ensure that future events will have better outcomes. It is therefore expected that this signal be correlated with learning outcomes. We report a study designed to examine the extent to which the ERN is related to learning outcomes within a paired-associates learning task. The feedback-related negativity (FRN) elicited by stimuli that indicated to the participants whether their response was correct or not was examined both according to the degree to which the associates were learned in the session and according to whether participants recalled the associations on the next day. The results of the spatio-temporal PCA indicate that, whereas the process giving rise to the negative feedback elicited an FRN whose amplitude was not correlated with long-term learning outcomes, positive feedback was associated with an FRN-like activity, which was correlated with the learning outcomes. Another ERP component that follows the FRN temporally and shares its spatial distribution was found to be associated with long-term learning outcomes. Our findings shed light on the functional significance of the feedback-related ERP components and are discussed within the framework of the reinforcement learning ERN hypothesis.
Human behavior is shaped by the interaction between the self and the environment. In part, the environment shapes and modulates behavior as it is a source of feedback about the consequences of behavior; this feedback may correct, reinforce, or eliminate specific behaviors. There is evidence that performance feedback is crucial for goal setting and for improving performance (e.g., Sheen, 2006; Saxton, 2000; Erez, 1977). The feedback mechanisms have been illuminated with the discovery of a component of the ERP, which is elicited by erroneous responses and by negative feedback. This component, named the error-related negativity (ERN), is elicited when participants make an erroneous response in speeded RT tasks (e.g., Gehring, Goss, Coles, Meyer, & Donchin, 1993; Falkenstein, Hohnsbein, Hoormann, & Blanke, 1990). In such tasks, the fact that an error has been committed is evident when the response is executed. However, there are tasks in which an error cannot be detected on its commission. In such cases, the fact that a response was erroneous is communicated by a feedback event. It has been demonstrated that such feedback events elicit an ERN (e.g., Miltner, Braun, & Coles, 1997). This feedback ERN (fERN or FRN) is a fronto-central negativity that peaks at about 250–300 msec following the presentation of the feedback stimulus. Converging evidence points to ACC as the source of both the ERN and the FRN (e.g., Vlamings, Jonkman, Hoeksma, van Engeland, & Kemner, 2008; Ladouceur, Dahl, & Carter, 2007; Critchley, Tang, Glaser, Butterworth, & Dolana, 2005; Mars et al., 2005; Mathalon, Whitfield, & Ford, 2003; van Veen & Carter, 2002; Menon, Adleman, White, Glover, & Reiss, 2001; Kiehl, Liddle, & Hopfinger, 2000; Carter et al., 1998; Holroyd, Dien, & Coles, 1998; Dehaene, Posner, & Tucker, 1994). 
The involvement of ACC in the generation of the ERN led Holroyd and Coles (2002) to articulate the hypothesis that the FRN is generated in ACC as a consequence of a phasic decrease in the activity of the mesencephalic dopamine system occurring when the monitoring system evaluates events as worse than expected. Specifically, this hypothesis implies that the ERN and the FRN are products of the reinforcement learning (RL) system. Evidence from the study of response ERN is consistent with this theory as the ERN is elicited by incorrect responses and its amplitude is correlated with the importance of the error (e.g., Arbel & Donchin, 2009; Hajcak, Moser, Yeung, & Simons, 2005; Gehring et al., 1993). Furthermore, the amplitude of the ERN on a given trial is correlated with subsequent remedial actions (Gehring et al., 1993).
As the FRN can be elicited in a wide variety of feedback providing tasks, predictions derived from the RL theory can be tested under different experimental conditions. The FRN has been observed in gambling tasks in which participants guess on a trial-by-trial basis which stimulus will result in a rewarding outcome (e.g., Goyer, Woldorff, & Huettel, 2008; Hajcak, Moser, Holroyd, & Simons, 2007; Gehring & Willoughby, 2002). These tasks are particularly useful for testing the hypothesis that the FRN signal reflects the assessment that outcomes are worse than expected because it is fairly easy to manipulate and control the relative frequency and size of gains and losses within the task. The results vary substantially across studies, some reporting a clear effect of expectancy on the amplitude of the FRN (e.g., Bellebaum, Polezzi, & Daum, 2010; Holroyd & Krigolson, 2007; Holroyd, Nieuwenhuis, Yeung, & Cohen, 2003), which supports the “worse than expected” part of the theory, whereas others report that expectancy had no effect on the FRN (e.g., Larson, Kelly, Stigge-Kaufman, Schmalfuss, & Perlstein, 2007; Donkers, Nieuwenhuis, & Van Boxtel, 2005; Hajcak et al., 2005). It is important to note that many of the inconsistencies were later resolved by manipulating the reward probabilities (Donkers & van Boxtel, 2005) and the temporal proximity between predictions and outcomes (Hajcak et al., 2007). Holroyd, Krigolson, Baker, Lee, and Gibson (2009) demonstrated that the process giving rise to the FRN is indeed sensitive to the violation of reward expectancy but only when the optimal behavior is learnable. These latter results, coupled with the notion that the RL system is both a reward-seeking and information-seeking system (e.g., Bromberg-Martin & Hikosaka, 2011; Niv & Chan, 2011), emphasize the importance of studying the FRN in tasks that allow the use of feedback to modify behavior and subsequently optimize the outcomes. 
Indeed, there is evidence that the FRN is elicited in learning tasks in which feedback guides the learner through the trial-and-error learning experience (e.g., Sailer, Fischmeister, & Bauer, 2010; van der Helden, Boksem, & Blom, 2010; Eppinger, Mock, & Kray, 2009; Krigolson, Pierce, Holroyd, & Tanaka, 2009; Pietschmann, Simon, Endrass, & Kathmann, 2008; Holroyd & Coles, 2002), or affects decision making on subsequent trials (e.g., Chase, Swainson, Durham, Benham, & Cools, 2011; Frank, Woroch, & Curran, 2005). These studies add an important layer to the examination of the RL theory as they allow the evaluation of the FRN in relation to the utility of the feedback and to the consequence of feedback processing. Within the framework of the RL hypothesis, it is expected that the amplitude of the FRN will be correlated with learning outcomes.
Holroyd and Coles (2002) were the first to attempt to evaluate the FRN within a learning paradigm as they focused on a probabilistic learning task. Whereas some stimuli were mapped 100% to a specific response, others received random feedback (50% condition) or a particular feedback regardless of the participant's response (“always correct” and “always incorrect”). Although Holroyd and Coles' model predicted a gradual reduction in the amplitude of the FRN (defined as the difference between ERPs elicited by positive and negative feedback) and increase in the amplitude of the response ERN during learning associated with the learnable stimuli, the data reported by Holroyd and Coles (2002) display a somewhat different pattern. Although the amplitude of the response ERN increased with learning, the amplitude of the feedback ERN seems to have remained constant. The lack of an observed FRN reduction in the course of the 100% condition could be attributed to a very rapid learning during the first 10-trial bin or to the possible loss of observable effect because of an insufficient sensitivity when applying the difference wave technique. The data provided by Krigolson et al. (2009) support the former explanation. In their study, a more complex learning task that required categorization was utilized. A gradual reduction in FRN was observed in individuals who successfully completed the task (those who were classified as high learners). van der Helden et al. (2010) examined the FRN while participants performed a motor sequence learning task. The authors compared the FRN elicited by negative feedback that was followed by a different response on the next response opportunity (indicating, according to the investigators, a change of strategy and therefore learning from the feedback) with negative feedback that was followed by a repetition of the incorrect response (indicating that the participant did not use the feedback to adjust performance). 
The authors reported that negative feedback to incorrect responses, which were later modified, was associated with larger FRN than negative feedback provided to incorrect responses, which were later repeated. These results were interpreted to suggest that the FRN is sensitive to the extent to which negative feedback is being utilized to improve performance. Similar results are reported by Cohen and Ranganath (2007) who found that the FRN was associated with a change of response after a loss, such that its amplitude was larger on “loss” trials after which participants changed their response, in comparison with “loss” trials after which participants repeated the action that previously resulted in unfavorable outcomes.
The abovementioned studies support the RL theory and suggest that the FRN signal is used by ACC for the adaptive modification of behavior. However, it is yet to be determined whether the FRN is correlated with long-term learning outcomes related to the process during the practice of which this component is assumed to be generated. The study reported here was designed to address this question by examining the positive and negative feedback-related brain activity during a paired-associate learning paradigm and its association with performance on a subsequent recognition memory task. A similar paired-associate learning task was employed by Tricomi and Fiez (2012) in an fMRI study. Their results indicated a relationship between caudate activation during the delivery of positive feedback and long-term learning as measured by a subsequent memory task. Given that the BG have long been associated with the RL system (e.g., Pennartz et al., 2009; Haruno & Kawato, 2006; Schultz, 2006) and with performance feedback processing (e.g., Cincotta & Seger, 2007; Tricomi, Delgado, McCandliss, McClelland, & Fiez, 2006) and are assumed to play a role in the process that gives rise to the FRN (e.g., Foti, Weinberg, Dien, & Hajcak, 2011; Holroyd & Coles, 2002), we hypothesized that a similar relationship between positive feedback and learning would be found in the ERP data. Our hypothesis was predicated on previous reports (e.g., Kreussel et al., 2012; Baker & Holroyd, 2011; Foti et al., 2011; Holroyd, Pakzad-Vaezi, & Krigolson, 2008; Potts, Martin, Burton, & Montague, 2006), which suggested that the variance in FRN amplitude may stem from neural activity associated with positive rather than negative feedback. Spatial–temporal PCA (STPCA) as described by Spencer, Dien, and Donchin (2001) was used to allow the separation of potentially overlapping components and for an analytic reduction of the spatial dimensionality of the data.
This methodology is commonly used for the separation of ERP components (e.g., Simons, Graham, Miles, & Chen, 2001; Spencer et al., 2001; Spencer, Dien, & Donchin, 1999), including the ERN (e.g., Arbel & Donchin, 2009, 2011; Foti et al., 2011; Holroyd & Coles, 2008; Holroyd et al., 1998, 2008; Krigolson & Holroyd, 2006). As we focus on the behavior of different ERP components, it is preferable to use a methodology that parses the ERP into its components, allowing the measurement of each component in isolation from the other components.
Twenty-six undergraduate students from the University of South Florida participated in the experiment. The data of three of these participants were excluded from the analysis because of excessive artifacts. Analysis was performed on data from 23 participants (14 women, 9 men) aged 18–30 years (M = 20.47 years, SD = 3.1 years). Participants were right-handed with no history of learning disorders or neurological deficits. They had normal or corrected vision. English was the participants' primary language. Consent forms were signed by participants before the initiation of data collection.
Participants sat in front of a computer monitor in a quiet room and performed a learning task while their EEG was recorded. They were instructed to figure out and learn the names (nonwords) of 30 novel objects through trial and error. They were also informed that their memory for these associations would be tested on the following day. In each trial, participants were presented with a picture of a novel object accompanied by four possible names, one of which was defined as the “correct” name for the object. Feedback was provided after each trial to inform participants about the correctness of their choice. The experimental task consisted of 600 trials presented in five blocks of 120 trials. In each of the five blocks, a new set of six novel objects was presented (for a total of 30 novel objects). Each of the six novel objects was presented in a random order 20 times within a block. In each trial, a fixation point appeared on the screen for 500 msec, after which the picture of a novel object was presented on the screen for 1000 msec, followed by a presentation of the same novel object accompanied by four nonwords displayed in a row underneath the picture for 3000 msec. The same four nonwords were always presented with each novel object but in different locations within the row. For each novel object, the four nonwords were selected so that two shared the same initial consonant, two had the same middle vowel, and two had the same final consonant (an example is presented in Figure 1). One predetermined nonword name was the correct label for the novel object, and the remaining three choices were the correct names of other objects. Participants were allotted 3000 msec to respond. After each selection by the participant, a positive (a “W”) or a negative feedback (an “X”) was presented on the screen for 800 msec to indicate the correctness of the response.
Short breaks were provided after the completion of each block. A pair recognition test was given to each participant a day after ERP data collection. It consisted of a list of the 30 novel objects used in the experiment and a list of all the possible names. All 30 novel objects and all 30 names were presented at the same time. Participants were asked to match each of the 30 novel objects that had been presented during the experiment with its name. We will refer to this task as a recognition task, although it should not be confused with a task in which participants are instructed to indicate whether a particular stimulus was present or not during the experiment.
The novel objects were borrowed from Kroll and Potter (1984). Nonwords were produced from the ARC Nonword Database (Rastle, Harrington, & Coltheart, 2002). The nonwords were in three-letter CVC format and were phonologically legal in English. The number of neighbors was between 0 and 10 (M = 5.9); the number of phonological neighbors was between 5 and 20 (M = 14).
EEG Recording Parameters
The EGI System 200 was used to acquire and analyze dense-array EEG data. The EEG was recorded using 129-channel HydroCel Geodesic Sensor Nets from EGI (Eugene, OR). The EEG was continuously recorded at a 250-Hz sampling rate with a band pass of 0.1–100 Hz. The electrode impedances were kept below 50 kΩ. The continuous EEG data were filtered using an off-line 40-Hz low-pass filter. The filtered data were then segmented into 1200-msec-long epochs, each starting 200 msec before the feedback stimulus and ending 1000 msec after the feedback presentation. An algorithm developed by Gratton, Coles, and Donchin (1983) for off-line removal of ocular artifacts was used to correct for eye movements and blinks. Averages of the artifact-free epochs were calculated for each type of feedback. Feedback stimuli were categorized by accuracy (positive and negative feedback) and by subsequent recognition data (recognized, not recognized), yielding four categories of feedback segments:
Positive feedback, recognized (Pos-Fd Rec) – positive feedback provided after correct selection of name for an object that was matched correctly on the following day.
Positive feedback, not recognized (Pos-Fd NotRec) – positive feedback provided after correct selection of name for an object that was not matched correctly on the following day.
Negative feedback, recognized (Ng-Fd Rec) – negative feedback provided after incorrect selection of name for an object that was matched correctly on the following day.
Negative feedback, not recognized (Ng-Fd NotRec) – negative feedback provided after incorrect selection of name for an object that was not matched correctly on the following day.
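The segmentation step described above can be sketched in a few lines. This is a minimal illustration, not the EGI software's procedure: the function name `segment_epochs`, the array layout (channels by samples), and the choice to skip events too close to the recording edges are assumptions made for the example. At the reported 250-Hz sampling rate, a window running from 200 msec before to 1000 msec after feedback onset spans 300 samples.

```python
import numpy as np

def segment_epochs(eeg, event_samples, srate=250, pre_ms=200, post_ms=1000):
    """Cut feedback-locked epochs out of a continuous recording.

    eeg           : array (n_channels, n_samples), continuous filtered EEG
    event_samples : sample indices of feedback onsets (hypothetical input)
    Returns an array (n_epochs, n_channels, n_epoch_samples).
    """
    pre = int(round(pre_ms * srate / 1000))    # 200 msec -> 50 samples
    post = int(round(post_ms * srate / 1000))  # 1000 msec -> 250 samples
    epochs = []
    for s in event_samples:
        # Assumption: events without a full window are dropped, not padded.
        if s - pre < 0 or s + post > eeg.shape[1]:
            continue
        epochs.append(eeg[:, s - pre:s + post])
    if not epochs:
        return np.empty((0, eeg.shape[0], pre + post))
    return np.stack(epochs)
```

Averaging the returned epochs within each of the four feedback categories would then give one waveform per category, channel, and participant.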
The analysis was performed on positive and negative feedback provided to the participant before a learning criterion had been met. The learning criterion was satisfied by five consecutive correct selections for a particular novel object. It is important to note that only data associated with objects for which the learning criterion was achieved within the task were included in the primary analysis because of insufficient trials for pairs that were not learned within the task. On average, participants met the learning criterion for 85% of the items (SD = 14%), with some participants meeting the learning criterion for all (n = 3 participants) or almost all (one participant learned 29 of the 30 items; two additional participants learned 28 of the 30 items) of the items. A secondary analysis of the ERPs elicited in association with items that were neither learned within the task nor recognized a day later was conducted, yielding a fifth category of “negative feedback, not learned, not recognized” (Ng-Fd NotLrn NotRec). The six participants who learned more than 90% of the paired-associates were excluded from this secondary analysis because of no or very few trials in this category, leaving us with a sample of 17 participants. Naturally, there were too few trials in the “positive feedback, not learned, not recognized” category, leading to the exclusion of this category from the analysis.
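The learning criterion lends itself to a short sketch. The helper names below are hypothetical, and the input is assumed to be the ordered list of correct/incorrect outcomes for one novel object across its 20 presentations; whether the criterion trials themselves are counted in a given analysis is left to the caller.

```python
def criterion_trial(correct, run_length=5):
    """Return the index of the trial on which the criterion is satisfied
    (the last of `run_length` consecutive correct selections), or None
    if the item was never learned within the task."""
    streak = 0
    for i, ok in enumerate(correct):
        streak = streak + 1 if ok else 0  # any error resets the run
        if streak == run_length:
            return i
    return None

def errors_before_criterion(correct, run_length=5):
    """Count error trials up to the point the criterion is met.
    Returns None for items that never reached the criterion."""
    end = criterion_trial(correct, run_length)
    if end is None:
        return None
    return sum(1 for ok in correct[:end + 1] if not ok)
```

For example, an item answered wrong, right, wrong, and then right six times in a row reaches the criterion on its eighth presentation with two errors along the way.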
Analysis was done on 1000-msec-long epochs, starting at the onset of the feedback stimuli and ending 1000 msec following the feedback. To analyze the EEG data, a STPCA as described by Spencer et al. (2001) was utilized. This analysis reduces the dimensionality of a large data set and disentangles overlapping ERP components. First, a “spatial” PCA then a “temporal” PCA (Spencer et al., 2001) was performed. The spatial PCA was performed by computing the covariance among electrode sites across the time points of each of the feedback types and participants, yielding a set of spatial factors. In the next step of the analysis, the “factor scores” for each of the participants, feedback types, and electrodes were used to create “virtual ERPs” (Spencer et al., 2001), which were submitted to a temporal PCA, analyzing the covariance among time points for each of the spatial factors, feedback type, and participants. A separate temporal PCA was conducted for each of the spatial factors. The resulting temporal factor scores for each spatial factor were used to measure the activity in the ERP with the morphology and scalp distributions of interest. For both spatial and temporal PCAs, the factors that were required to account for 95% of the variance in the input data set were retained for Varimax rotation. The factor scores of the temporal and spatial factors of interest were used for statistical analysis.
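The two-step procedure can be sketched with NumPy as follows. This is an illustrative approximation under stated assumptions, not the implementation used in the study or in Spencer et al. (2001): the function names, the least-squares factor scores, and the input layout (one row per participant-by-feedback-type average, then time points, then electrodes) are choices made for the example.

```python
import numpy as np

def varimax(loadings, n_iter=100, tol=1e-6):
    """Orthogonal Varimax rotation of a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    last = 0.0
    for _ in range(n_iter):
        rot = loadings @ rotation
        grad = loadings.T @ (rot**3 - rot @ np.diag((rot**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt
        total = s.sum()
        if last != 0 and total / last < 1 + tol:
            break
        last = total
    return loadings @ rotation

def pca_varimax(x, var_kept=0.95):
    """Covariance PCA on x (observations x variables); keep the factors
    needed to reach `var_kept` of the variance, then rotate them."""
    xc = x - x.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(xc, rowvar=False))
    order = np.argsort(evals)[::-1]
    evals = np.clip(evals[order], 0.0, None)
    evecs = evecs[:, order]
    cum = np.cumsum(evals) / evals.sum()
    k = min(int(np.searchsorted(cum, var_kept)) + 1, evals.size)
    loadings = varimax(evecs[:, :k] * np.sqrt(evals[:k]))
    # Assumption: least-squares factor scores, so that xc ~ scores @ loadings.T
    scores = xc @ np.linalg.pinv(loadings).T
    return loadings, scores

def stpca(data, var_kept=0.95):
    """data: (n_obs, n_times, n_electrodes)."""
    n_obs, n_times, n_elec = data.shape
    # Spatial step: covariance among electrodes across time points and observations.
    spatial_loadings, s_scores = pca_varimax(data.reshape(-1, n_elec), var_kept)
    # Spatial factor scores arranged over time form the "virtual ERPs".
    virtual = s_scores.reshape(n_obs, n_times, -1)
    # Temporal step: a separate PCA among time points for each spatial factor.
    temporal = [pca_varimax(virtual[:, :, f], var_kept)
                for f in range(virtual.shape[2])]
    return spatial_loadings, virtual, temporal
```

Each entry of `temporal` holds the temporal loadings and the temporal factor scores for one spatial factor; in this sketch, those scores are the values that would be carried forward into the statistical analysis.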
On average, participants learned (achieved learning criterion of five consecutive correct responses) to match 85% of the 30 novel objects with their names (SD = 14%). On average, 46.6% of the 30 (SD = 4.2) pairs of novel objects and their names were matched on the following day (53.8% of the learned pair associations, SD = 1.9%). The average number of error trials before reaching the learning criterion for pairs that were later recognized was smaller (M = 3.1, SD = 0.86) than that of pairs that were not subsequently recognized (M = 3.9, SD = 0.94), F(1, 22) = 9.27, p = .006.
Figure 2 provides a comparison between ERP waveforms elicited by positive and negative feedback. The trials were sorted for averaging depending on whether the paired associate was later recognized or not recognized. The grand-averaged data as presented in Figure 2 suggest that differences between the ERPs associated with the nature of the feedback were observed in central, fronto-central, and frontal electrode sites. In addition, a visual inspection suggests that ERPs elicited by feedback provided to items that were later matched with their correct names are different from those elicited by feedback to items that were not correctly matched.
A more in-depth analysis is provided by the STPCA. Figure 3 presents a summary of this analysis.
Spatial Factor 1 (SF1, accounts for 30.56% of the variance) seems to capture the FRN activity as it has the fronto-central spatial distribution of a typical FRN, and the activity as indicated by the virtual ERPs (Figure 4) is similar to that observed in the grand-averaged data in the fronto-central electrode sites. SF3 (accounts for 9.81% of the variance) seems to represent the centro-parietal spatial distribution of a P300. We focus on these two spatial factors as they represent, through their spatial distribution and the patterns shown in their virtual ERPs, the common error-related ERP components.
The spatial PCA was followed by temporal PCA for each of the spatial factors of interest to reduce the temporal dimensionality of the data set.
The FRN component was identified as the component with a fronto-central distribution (SF1) and maximal amplitude about 300 msec following the presentation of the feedback (Temporal Factor 5 [TF5]; see Figure 5). A 2 × 2 repeated-measures analysis was conducted with a Feedback Valence factor (positive feedback and negative feedback) and a Long-term Learning Outcome factor (recognized and not recognized). A main effect of Valence was found, F(1, 22) = 37.8, p < .0001, suggesting that negative feedback was associated with a larger negativity than positive feedback. No effect of Learning Outcome was found, F = 0.41, p = .52, suggesting that learning outcomes per se were not correlated with a change in the amplitude of the FRN. However, an interaction between Valence and Learning Outcome was found, F = 4.21, p = .05. A post hoc paired comparison, which was conducted to examine the source of the interaction, revealed that positive feedback provided to correct pairs that were later recognized (Pos-Fd Rec) elicited a larger negativity than the positive feedback provided to pairs that were not subsequently recognized (Pos-Fd NotRec), F(1, 22) = 4.17, p = .05. The activity elicited in association with negative feedback did not differ as a function of subsequent recognition (i.e., Ng-Fd Rec was not found to differ from Ng-Fd NotRec), F(1, 22) = 0.23, p = .63.
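In a fully within-participant 2 × 2 design, each main effect and the interaction is a single-degree-of-freedom contrast, so its F statistic equals the squared one-sample t computed on the per-participant contrast scores, with degrees of freedom (1, n − 1). The sketch below illustrates that equivalence; the function name, the column order, and the contrast weights are assumptions for the example, and it does not reproduce the statistics reported above.

```python
import numpy as np

# Assumed column order of the factor-score matrix:
# [Pos-Fd Rec, Pos-Fd NotRec, Ng-Fd Rec, Ng-Fd NotRec]
VALENCE = (1, 1, -1, -1)
OUTCOME = (1, -1, 1, -1)
INTERACTION = (1, -1, -1, 1)

def contrast_F(scores, weights):
    """F test for a single-df within-participant contrast.

    scores  : array (n_participants, 4) of factor scores
    weights : one of the contrast tuples above
    Returns (F, (df_numerator, df_denominator)).
    """
    c = np.asarray(scores, dtype=float) @ np.asarray(weights, dtype=float)
    n = c.size
    t = c.mean() / (c.std(ddof=1) / np.sqrt(n))  # one-sample t on contrast scores
    return t**2, (1, n - 1)
```

With 23 participants this yields the (1, 22) degrees of freedom seen in the 2 × 2 analyses, and with the 17-participant subset it yields (1, 16).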
For a subset of our participants (n = 17), we were able to examine ERPs elicited by negative feedback stimuli provided to pairs that were neither learned within the task nor subsequently recognized. As was mentioned before, 6 of the total 23 participants were excluded from this secondary analysis because of the small number of trials associated with this category (Ng-Fd NotLrn NotRec). Repeated-measures ANOVA followed by paired comparisons revealed that negative feedback provided to pairs that were not learned and not recognized a day later was associated with a negative-going activity, which was different from that elicited by positive feedback (F(1, 16) = 6.61, p = .02 when compared with Pos-Fd Rec; F(1, 16) = 14.8, p = .001 when compared with Pos-Fd NotRec) but not different from that elicited by the other categories of negative feedback (F(1, 16) = 0.26, p = .61 when compared with Ng-Fd Rec; F(1, 16) = 0.63, p = .43 when compared with Ng-Fd NotRec).
SF1 (fronto-central) at TF2 (∼400 msec) seems to represent the activity that follows the FRN (Figure 6). A 2 × 2 repeated-measures analysis reveals a main effect for Feedback Valence, F(1, 22) = 4.21, p = .005, suggesting that positive feedback was associated with a more negative-going activity, whereas negative feedback was associated with a larger positivity. In addition, a main effect for Long-term Learning Outcome was found, F(1, 22) = 8, p = .01, indicating a larger positivity associated with recognized items and a larger negativity associated with items that were not recognized. No interaction effect was found, F(1, 22) = 1.8, p = .19.
We were concerned that the obtained temporal factors and the scores for each of the categories may have been affected by potential latency differences between the activity associated with positive feedback and that associated with negative feedback. We therefore performed a separate temporal analysis for positive and negative feedback. The analyses yielded the same pattern of results with a similar temporal factor at ∼400 msec with positive scores associated with Ng-Fd NotRec and negative scores associated with Pos-Fd Rec. This analysis rejected the possibility that the analysis was biased by the potential latency differences between the activities associated with positive and negative feedback stimuli.
The analysis of a subset of our data, which included the category of Ng-Fd NotLrn NotRec, indicated that the activity elicited in association with this category was similar to that elicited by Ng-Fd NotRec, F(1, 16) = 0.06, p = .8, suggesting that this component is not modulated by “local” learning at least in association with negative feedback.
P300 was elicited in association with all feedback categories (Figure 7). Neither a feedback valence effect, F(1, 22) = 1.32, p = .26, nor a learning outcome effect, F(1, 22) = 2.88, p = .1, was found.
Positive Feedback Stimuli: Examination of a Sequential Effect
As positive feedback was found associated with subsequent recognition, additional analysis was conducted to elucidate the role of positive feedback in the process of learning the paired associations. In the first step, a possible sequential effect was examined in relation to the positive feedback. The five consecutive positive feedback presentations that served as the learning criterion for each item were entered into a STPCA.
The STPCA revealed that the positive feedback elicited an FRN-like activity with a latency of approximately 300 msec. Repeated-measures analysis with the five consecutive positive feedback stimuli yielded no sequential effect, F(4, 88) = 1.6, p = .21.
A sequential effect was found in relation to the positivity that followed the FRN, F(4, 88) = 4.6, p = .008. Post hoc paired comparison revealed that this effect was driven by the activity associated with the first positive feedback, which was more positive-going than the other positive feedback events in the sequence (ps = .0007, .05, .0007, and .002, respectively). The other positive feedback stimuli in the sequence were not found different from each other.
The sequential effect was also examined in a spatio-temporal factor that seems to capture the P300 activity (see Figure 8). The analysis revealed a typical sequential effect of the P300 component, F(4, 88) = 9.7, p < .0001, Greenhouse–Geisser ɛ = 0.62, such that the more unexpected the stimulus was, the larger the amplitude of the P300 was. The first positive feedback, which indicates the first correct selection of name for an object, elicited the largest P300 activity. Post hoc paired comparison suggested a reduction in P300 amplitude from the first positive feedback in the sequence to the third, with no change in P300 amplitude from the third to the fifth feedback stimuli. The decrease in P300 amplitude suggests that positive feedback became less surprising with learning and that “learning” was still occurring at least up to the third positive feedback in the sequence.
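The Greenhouse–Geisser ɛ quoted above scales both degrees of freedom of the repeated-measures test, so ɛ = 0.62 applied to F(4, 88) corresponds to adjusted degrees of freedom of roughly 2.48 and 54.56. A minimal sketch of how ɛ can be estimated from the covariance matrix of the five repeated measures follows; the function names are hypothetical, and taking the covariance matrix directly as input is a simplification for the example.

```python
import numpy as np

def gg_epsilon(cov):
    """Greenhouse-Geisser epsilon from a (k x k) covariance matrix of the
    k repeated measures. Ranges from 1/(k-1) to 1 (sphericity holds)."""
    cov = np.asarray(cov, dtype=float)
    k = cov.shape[0]
    center = np.eye(k) - np.ones((k, k)) / k
    sc = center @ cov @ center          # double-centered covariance
    return np.trace(sc) ** 2 / ((k - 1) * np.sum(sc ** 2))

def adjusted_df(epsilon, k, n):
    """Epsilon-corrected degrees of freedom for k levels and n participants."""
    return epsilon * (k - 1), epsilon * (k - 1) * (n - 1)
```

In practice `cov` would be computed across participants from the factor scores of the five consecutive positive feedback events, for instance with `np.cov(scores, rowvar=False)`.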
Our data allowed the examination of the feedback-related ERP components and their relationship with long-term learning outcomes following a paired-associate declarative learning task. Three ERP components were identified as related to the feedback processing and learning: the FRN, a fronto-central ERP component that followed the FRN, and the P300. Negative feedback elicited a typical FRN whose amplitude was modulated by feedback valence, such that negative feedback elicited a larger negativity than positive feedback.
In accordance with fMRI data provided by Tricomi and Fiez (2012), the FRN associated with positive but not negative feedback stimuli was found to be related to long-term learning. More specifically, positive feedback provided to pairs that were later recognized elicited a relatively larger negativity than positive feedback stimuli provided to pairs that were not recognized. These data suggest that the neural activity giving rise to the FRN is associated with the utility of the positive feedback for long-term learning. This apparent selective sensitivity merits a discussion. The observed difference between the activity elicited by positive feedback provided in association with items that were later recognized and those that were not recognized will be discussed first. Items that were not subsequently recognized were more difficult to learn, as evidenced by a longer learning process (more errors before reaching the learning criterion). It is therefore possible that positive feedback provided in relation to these items was more surprising to the learner than positive feedback provided to items that may have been perceived as “easier” to learn. It is important to note that we refer to “difficult” and “easy” here as a subjective individualized descriptor based on performance and not based on actual level of difficulty. This suggestion is in line with previous reports of the reward positivity, that is, that unexpected rewards (positive feedback) elicit a larger positive deflection. One possible explanation for the absence of an FRN long-term learning effect associated with negative feedback is the difference in the frequency of negative feedback trials in each of the two categories (Ng-Fd Rec and Ng-Fd NotRec). Participants committed more errors before learning the pairs that were subsequently not recognized. It is, therefore, possible that amplitude differences are due to differences in the frequency of these feedback stimuli.
This explanation can be rejected for two reasons: first, the probability difference of these two negative feedback stimuli was minimal in our study (15% vs. 20%). This small difference is not likely to affect the amplitude of the FRN; second, if Ng-Fd Rec was associated with fewer trials, we would have expected it to elicit an FRN with larger amplitude. This would have enhanced (rather than diminished/cancelled) any effect of long-term learning on the FRN.
Another possible explanation for the selective association between long-term learning and positive feedback is that the positive feedback in this particular task was more useful in the learning process, and therefore, only the positive feedback was associated with modulations in amplitude related to the utility of the feedback for learning. Indeed, the positive feedback seems to have been more informative in this task than the negative feedback and therefore may have been more crucial for learning. Positive feedback informed the participant that the current choice is correct, and therefore, if the participant is to repeat the same correct response, he or she is certain to be correct. On the other hand, negative feedback informed the participant that the current choice is incorrect, and therefore, at least for the first negative feedback (or if the participant has not stored in memory the previous incorrect choices), the message to the participant is that three other choices are still possible as correct responses. The idea that the positive feedback was more useful is supported by participants' anecdotal reports that they found themselves focusing more on the positive feedback and learning more from it. It is also supported by the report by Tricomi and Fiez (2012) who compared caudate activation associated with negative and positive feedback provided during two paired-associate learning tasks, which varied in the extent to which positive and negative feedback were equally informative. In a two-choice task, positive and negative feedback provided information that was equally informative for task performance. In a four-choice task, similar to that used in our experiment, positive feedback was more informative than negative feedback.
The results of their study indicated that, although the caudate activity associated with the delivery of negative feedback was similar to that associated with positive feedback in the two-choice task, the activity associated with negative feedback was reduced in the four-choice learning task. These results suggested that the RL system is sensitive not only to the valence of the feedback but also to the amount of information the feedback carries. In an earlier study, Tricomi and Fiez (2008) divided their two-choice paired-associate task into three rounds, each of which presented the same pairs. In the first round, positive and negative feedback served to inform the participants about the correct association. In the second and third rounds, positive and negative feedback provided within the exact same task informed the participants about their correct or incorrect performance. Tricomi and Fiez (2008) reported that the activation of the caudate was stronger during the second and third rounds, suggesting that the “caudate can be engaged in feedback-based declarative memory tasks, but it is more strongly engaged when feedback is ‘earned’ by performance than when it is informative but not tied to goal achievement” (Tricomi & Fiez, 2008, p. 1). In light of these previous results and interpretations, it is possible that, in our experiment, the first and perhaps the second positive feedback stimuli were informative, but because the associations were yet to be learned, these feedback stimuli did not yet serve to validate the learned association. Within this framework, it is possible that all five positive feedback stimuli were informative, but the first was unexpected, the second was less surprising, and the third may have marked the “learning” point at which the positive feedback was “earned.” Support for this contention is provided by the P300 elicited by the five consecutive positive feedback stimuli, which showed a gradual decrease in amplitude.
The reduction in P300 amplitude may reflect the learning process during which the positive feedback becomes less unexpected. The sequential effects found in our study are also in line with the RL ERN theory, which predicts that the first positive feedback in a learning task, which is also the most unexpected feedback, should elicit the largest reward positivity. This hypothesis is confirmed by our data, as the positivity that follows the FRN was largest for the first positive feedback. The expectancy effect is also confirmed by the P300 data, which indicate that the first positive feedback was the most unexpected of the five.
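The informativeness argument above (positive feedback identifies the correct response, whereas a first negative feedback leaves three candidates) can be illustrated with a minimal Python sketch. It is an illustrative assumption rather than an analysis reported in the study: it treats all response alternatives as equally likely, assumes no memory for previously rejected choices, and uses Shannon information (in bits) as the measure of informativeness. The function names are hypothetical.

```python
import math


def feedback_information_bits(n_choices: int, n_remaining: int) -> float:
    """Uncertainty reduction (bits) when n_choices equally likely
    candidates are narrowed to n_remaining candidates.
    Assumption: uniform prior over candidates (not stated in the source)."""
    return math.log2(n_choices / n_remaining)


def information_by_valence(n_choices: int) -> tuple[float, float]:
    """Return (positive, negative) information for a single first guess.
    Positive feedback leaves 1 candidate; negative feedback leaves
    n_choices - 1 candidates."""
    positive = feedback_information_bits(n_choices, 1)
    negative = feedback_information_bits(n_choices, n_choices - 1)
    return positive, negative


if __name__ == "__main__":
    for n in (2, 4):
        pos, neg = information_by_valence(n)
        print(f"{n}-choice task: positive = {pos:.3f} bits, negative = {neg:.3f} bits")
```

Under these assumptions, the two feedback types carry the same information in a two-choice task (1 bit each), whereas in a four-choice task positive feedback carries 2 bits and a first negative feedback only about 0.415 bits, which is consistent with the asymmetry described for the four-choice design.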
The possibility that variance in FRN amplitude stems from neural activity associated with positive rather than negative feedback, which is supported by previous reports (e.g., Kreussel et al., 2012; Baker & Holroyd, 2011; Foti et al., 2011; Holroyd et al., 2008; Potts et al., 2006) and by our data, has been suggested as a modification to the RL ERN hypothesis (Holroyd, 2004). Holroyd et al. (2008) further suggest that the observed FRN is in fact an N200 elicited by unexpected events, which is reduced in amplitude when positive feedback is presented because of the superposition of the reward positivity. In most of the abovementioned reports, positive feedback is associated with a positive-going deflection in the time range of the FRN. This pattern is different from that found in the current study, as our results suggest that positive feedback elicits a negative-going activity, which is smaller in amplitude than the ERN elicited by negative feedback. Similar results have been described by Oliveira, McDonald, and Goodman (2007), who found that unexpected positive feedback elicited an FRN that was not different from that elicited by unexpected negative feedback. Although the paradigm used by Oliveira et al. (2007) is different from the one employed in the current study, the results support the notion that the process that gives rise to the FRN is not limited to error processing. Taking into account the N200 hypothesis proposed by Holroyd et al. (2008), it is possible that the variations in the amplitude of the positive feedback-related activity stem from the unique relationship between expectancy and feedback valence in each of the specific tasks. We suggest that the relationship may be even more complex and depend on the role the feedback plays in the specific circumstances and the type and load of information it carries. The boundaries between positive and negative feedback may be fuzzier than we think and may depend on the extent to which feedback is informative.
Our data suggest that performance feedback also elicited activity that followed the FRN. This ERP component was associated with both feedback valence and long-term learning. Our data suggest that the more useful the information provided by the feedback stimulus was, the less positive (or more negative) the amplitude of this component became. The results from the sequential effect examination of positive feedback can be viewed as supporting this hypothesis if one assumes that the first positive feedback, which elicited the most positive activity, is less informative than the following positive feedback presentations. This ERP component has not been observed or described in previous reports. One possible explanation for the novelty of our findings is the use of PCA, which allows the disentangling of temporally overlapping components. As the activity of this component (i.e., the antecedent conditions that affected its properties) differs from that of the FRN and the P300, we suggest that it is a unique component rather than an “artifact” of the P300- and FRN-related activity. Although it is too early to hypothesize about its functional significance, it appears from our analysis that this fronto-central component, which follows the FRN temporally, is related to the process that uses positive and negative feedback to facilitate learning.
Reprint requests should be sent to Yael Arbel, Department of Speech-Language Pathology and Audiology, Northeastern University, Boston, MA, or via e-mail: email@example.com; firstname.lastname@example.org.