Previous studies on the neurophysiological underpinnings of feedback processing almost exclusively used low-ambiguity feedback, which does not fully address the diversity of situations in everyday life. We therefore used a pseudo trial-and-error learning task to investigate ERPs of low- versus high-ambiguity feedback. Twenty-eight participants tried to deduce the rule governing visual feedback to their button presses in response to visual stimuli. In the blocked condition, the same two feedback words were presented across several consecutive trials, whereas in the random condition feedback was randomly drawn on each trial from sets of five positive and five negative words. The feedback-related negativity (FRN-D), a frontocentral ERP difference between negative and positive feedback, was significantly larger in the blocked condition, whereas the centroparietal late positive complex indicating controlled attention was enhanced for negative feedback irrespective of condition. Moreover, FRN-D in the blocked condition was due to increased reward positivity (Rew-P) for positive feedback, rather than increased (raw) FRN for negative feedback. Our findings strongly support recent lines of evidence that the FRN-D, one of the most widely studied signatures of reinforcement learning in the human brain, critically depends on feedback discriminability and is primarily driven by the Rew-P. A novel finding concerned larger frontocentral P2 for negative feedback in the random but not the blocked condition. Although Rew-P points to a positivity bias in feedback processing under conditions of low feedback ambiguity, P2 suggests a specific adaptation of information processing in case of highly ambiguous feedback, involving an early negativity bias. Generalizability of the P2 findings was demonstrated in a second experiment using explicit valence categorization of highly emotional positive and negative adjectives.
Individuals as actors in a dynamic environment must learn from the consequences of their behavior. Behaviors that are followed by positive consequences tend to be repeated in the future, whereas behaviors leading to negative consequences tend to be suppressed (Thorndike, 1911). Obviously, such reinforcement learning requires the ability to quickly discriminate between positive and negative events. Research during the last 20 years has indeed provided strong evidence for the existence of a set of flexible, powerful, and fast-acting brain mechanisms underlying the successful adaptation to the consequences of one's own behavior.
Much of this evidence comes from research using ERPs. In a seminal study, Miltner, Braun, and Coles (1997) observed larger relative frontocentral ERP negativity at around 250–300 msec for negative compared with positive feedback, which they termed the “feedback-related negativity” (FRN). Note that, for the sake of clarity, in the following we will refer to the larger relative frontocentral negativity for negative as compared with positive feedback as FRN-D (“D” for difference), whereas “FRN” and “reward positivity” (Rew-P; Baker & Holroyd, 2011) denote the raw ERP amplitudes for negative and positive feedback, respectively. Miltner et al. (1997), like many later authors, were able to localize the source of the FRN-D in ACC.
The reinforcement learning (RL) theory of FRN (Holroyd & Coles, 2002) proposes that worse-than-expected events such as receiving error feedback are detected by an adaptive critic located in the BG. This leads to a phasic decrease in midbrain dopaminergic activity, which disinhibits ACC and thereby produces an error signal seen in the FRN. On the other hand, in the prediction of response outcome (PRO) model by Alexander and Brown (2011), ACC is seen as an action outcome predictor. ACC activity as reflected in FRN-D is assumed to occur whenever there is a mismatch between the actual response outcome and the predicted outcome, irrespective of outcome valence. PRO theory can explain better than RL theory the findings of larger FRN-like ERP negativity for unexpected positive outcomes in certain tasks (e.g., Oliveira, McDonald, & Goodman, 2007). However, both PRO theory and RL theory in its original form do not account for an increasing number of recent findings from learning and gambling tasks: There is evidence that FRN-D may be mainly driven by a positive ERP deflection at around 300 msec for positive feedback (i.e., the Rew-P), rather than by a negative deflection for negative feedback (e.g., Kujawa, Smith, Luhmann, & Hajcak, 2013; Kreussel et al., 2012; Luque, Lopez, Marco-Pallares, Camara, & Rodriguez-Fornells, 2012; Baker & Holroyd, 2011; Foti, Weinberg, Dien, & Hajcak, 2011; Holroyd, Krigolson, & Lee, 2011; Hewig et al., 2007, 2010; Eppinger, Kray, Mock, & Mecklinger, 2008; Holroyd, Pakzad-Vaezi, & Krigolson, 2008; Cohen, Elger, & Ranganath, 2007; Potts, Martin, Burton, & Montague, 2006; Holroyd, 2004; Holroyd, Nieuwenhuis, Yeung, & Cohen, 2003). Eppinger et al. (2008), for example, used a probabilistic learning task where three stimuli each were mapped to two different responses. For one pair of stimuli feedback was 100% valid, whereas it was only 80% and 50% valid, respectively, for the other two pairs of stimuli. 
The authors observed larger FRN-D for low-validity (or, more surprising) feedback, but this effect was driven by larger Rew-P for positive feedback in the low-validity condition, whereas FRN for negative feedback did not differ between validity conditions. It should be noted, however, that, despite its original focus on negative events, RL theory (Holroyd & Coles, 2002) can easily incorporate the Rew-P as a “better-than-expected” signal. Accordingly, unexpected rewards lead to a phasic increase in midbrain dopaminergic activity, which produces the Rew-P either directly (Foti et al., 2011) or indirectly, via its inhibitory effect on ACC (Eppinger et al., 2008; Holroyd et al., 2003).
A notable limitation of the FRN literature is that the vast majority of studies used exactly two easily discriminable feedback stimuli such as “+” and “−” signs. This setting of low feedback ambiguity1 may, however, not adequately address all situations of feedback processing in everyday life, where the specific (e.g., verbal) stimuli indicating positive or negative feedback often cannot be fully predicted. Mars, De Bruijn, Hulstijn, Miltner, and Coles (2004) conducted a study with more complex feedback. Participants estimated 1-sec intervals and received either simple correct/incorrect feedback or feedback including additional graded information regarding under- and overestimation. FRN-D was larger in the former than in the latter condition. Liu and Gehring (2009) found FRN-D in a two-choice guessing task to be reduced if feedback valence depended on feature conjunctions (e.g., positive = red square or blue circle) as opposed to single features (e.g., positive = red irrespective of shape or square irrespective of color). In a subsequent study, Liu, Nelson, Bernat, and Gehring (2014) observed significantly larger FRN-D for perceptually dissimilar (S vs. T) than similar (E vs. F) positive and negative feedback stimuli. Finally, Pfabigan, Zeiler, Lamm, and Sailer (2014) presented two types of feedback in a time estimation task, symbolic (“+” and “−” signs) and facial feedback (a happy and an angry face). When the two feedback modalities were randomly distributed over trials, FRN-D was reduced compared with a presentation in separate blocks. Together, these studies indicate that FRN-D requires easily discriminable (i.e., low ambiguity) feedback. This notion has strong theoretical implications because it requires further specification of the concepts of “worse-than-expected events” (Holroyd & Coles, 2002) and “expectation mismatch” (Alexander & Brown, 2011).
Against this background, this study of feedback ERPs in a pseudo trial-and-error learning task has two aims: First, we want to investigate the effects of presenting the same two feedback words across several consecutive trials (blocked condition; low feedback ambiguity) versus presenting feedback that is randomly drawn on each trial from sets of positive and negative words (random condition; high feedback ambiguity). We expect that the FRN-D, reflecting greater frontocentral relative negativity for negative than positive feedback, will be more pronounced in the blocked condition than in the random condition. This prediction rests on the idea that in the blocked condition the positive–negative discrimination of the feedback words can be performed quickly enough to inform, in time, the fast-acting mental processes underlying FRN-D. In the random condition, in contrast, a fast perceptual classification of the feedback is effectively prevented, as words used for positive and negative feedback differ between subsequent trials. In that case, to determine its valence, feedback has to be analyzed at a semantic level, which may take too long to elicit the typical FRN-D at around 200–300 msec. FRN-D could then either be completely absent or merely delayed (cf. Liu & Gehring, 2009). Indeed, Baker and Holroyd (2011) found their Rew-P-driven FRN-D to be delayed by about 100 msec when the feedback included an additional cue on how to respond on the next trial. This study aims to distinguish between these two potential effects of feedback ambiguity (FRN-D delayed vs. completely suppressed).
If FRN-D is indeed absent in the random condition, we will aim to explore other ERP components that might be sensitive to feedback valence. Repeatedly, a modulation of the centroparietal/parietal P3b (or late positive complex, LPC) by feedback valence was reported, albeit with inconsistent direction of the effect: Larger LPC for positive than negative feedback was observed, for example, by Hajcak, Moser, Holroyd, and Simons (2007), Pfabigan, Alexopoulos, Bauer, and Sailer (2011), and Pfabigan et al. (2014), whereas others found larger LPC for negative than positive feedback (West, Bailey, Anderson, & Kieffaber, 2014; West, Bailey, Tiernan, Boonsuk, & Gilbert, 2012; Frank, Woroch, & Curran, 2005). LPC reflects the controlled attentional processing of motivationally relevant stimuli (e.g., Schacht & Sommer, 2009a; Fischler & Bradley, 2006; Schupp et al., 2000). In the present task, this involves standard verbal processing of the feedback words, which is possible in both the blocked and random conditions and provides explicit valence information that may affect the LPC. In view of the inconsistent literature regarding the polarity of the feedback LPC effect, however, we can only put forward a nondirectional hypothesis, predicting a modulation of the LPC by feedback valence. This effect should be similar in the blocked and random conditions.
Besides LPC and FRN-D, early frontocentral P2 was also sensitive to feedback classes in some studies (San Martin, Appelbaum, Pearson, Huettel, & Woldorff, 2013; Goyer, Woldorff, & Huettel, 2008). Although the effect concerned feedback magnitude rather than feedback valence, frontocentral P2 may be another candidate for blocked/random differences in feedback processing.
Our second aim is to further corroborate recent findings that FRN-D and experimental variations of FRN-D were mainly driven by positive feedback (e.g., Holroyd et al., 2008). The present design provides an excellent opportunity to further test this idea: By comparing ERPs between the blocked and random conditions separately for positive and negative feedback, we can determine whether increased FRN-D in the blocked condition is due to larger FRN for negative feedback, larger Rew-P for positive feedback, or both. Because most studies that explicitly addressed this issue found FRN-D effects to result from variations in Rew-P rather than FRN, we predict the following: If FRN-D is larger in the blocked condition than in the random condition, this effect will be due to larger Rew-P in the blocked condition than in the random condition, whereas FRN will not differ between conditions.
Thirty right-handed participants (17 women) were recruited from introductory courses in psychology at the University of Goettingen. Age ranged from 19 to 36 years (M = 22.8 ± 3.7). Because of distorted EEG recordings, two participants were excluded from all analyses. Participants received partial course credit or were paid € 7.50/hr. An additional € 0.01 was awarded for each “correct” response. Because 60% of the 400 trials involved positive feedback, this bonus amounted to exactly € 2.40 for all participants. All had normal or corrected-to-normal vision and were naive as to the purpose of the experiment. The study was approved by the local ethics committee.
Apparatus and Stimuli
Stimuli were presented on a 17-in. SVGA monitor. The capital letters from A to I and the digits from 1 to 9 served as imperative stimuli and were presented in red, blue, green, yellow, pink, or purple in NRC7bit font (size 36). There were two sets of feedback words: five negative words, Falsch! [wrong], Schlecht! [poor], Nein! [no], Daneben! [miss], and Irrtum! [error], and five positive words, Genau! [precise], Korrekt! [correct], Richtig! [right], Exakt! [accurate], and Treffer! [hit]. For the German words, negative (6.2 ± 1.5) and positive feedbacks (6.2 ± 1.1) had identical mean word length and did not differ in mean word frequency (t < 1, p > .40). Feedback was shown white-on-black in NRC7bit font (size 24) under horizontal and vertical visual angles of 3° and 1°, respectively, at a viewing distance of 60 cm. The experiment was run under ERTS software (Beringer, 1996), and responses were recorded using a custom-made response pad.
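The reported word-length statistics can be checked directly from the listed feedback words. The following is a minimal sketch, assuming that word length was counted in letters without the exclamation mark and that the value after "±" is the sample standard deviation; both are assumptions, not statements from the text.

```python
from statistics import mean, stdev

# Feedback words as listed in the text; the trailing "!" is left out
# (assumption: word length was counted in letters only).
NEGATIVE = ["Falsch", "Schlecht", "Nein", "Daneben", "Irrtum"]
POSITIVE = ["Genau", "Korrekt", "Richtig", "Exakt", "Treffer"]


def length_stats(words):
    """Return (mean, sample SD) of word length in letters."""
    lengths = [len(w) for w in words]
    return mean(lengths), stdev(lengths)


neg_mean, neg_sd = length_stats(NEGATIVE)
pos_mean, pos_sd = length_stats(POSITIVE)
print(f"negative: {neg_mean:.1f} ± {neg_sd:.1f}")  # negative: 6.2 ± 1.5
print(f"positive: {pos_mean:.1f} ± {pos_sd:.1f}")  # positive: 6.2 ± 1.1
```

Under these assumptions the sketch reproduces the values given above.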
A trial started with a white fixation cross (1.5 × 1.5 cm) presented for 1200 msec in the center of the screen. It was replaced by the imperative stimulus, which remained on the screen until a response was made, but for no longer than 3000 msec. At 800 msec after the response, the feedback word was presented for 1000 msec (see Figure 1). Immediately thereafter, the next trial started. In the rare cases where no response was given within 3000 msec, the words Zu langsam! [Too slow] were shown for 800 msec, before the next trial started.
Overall, 400 trials were run. The 200 trials of the blocked condition comprised five blocks of 40 trials each. Within each block, 24 positive (60%) and 16 negative feedbacks (40%) were presented in random order, but always using the same two words that were randomly drawn without replacement from the sets at the beginning of each block. The 200 trials of the random condition likewise comprised five 40-trial blocks (24 positive, 16 negative feedbacks), but the feedback word was randomly drawn without replacement from the positive or negative word sets on each trial. In total, each of the five positive and negative feedbacks was presented 24 and 16 times, respectively, to match the blocked condition. The 60/40 ratio of positive and negative feedbacks was chosen so as not to overly frustrate the participants, who were unable to learn any of the alleged rules.
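The feedback schedule described above can be sketched as follows. This is an illustrative reconstruction, not the original ERTS script: the function names are hypothetical, and drawing from pre-shuffled pools of 120 positive and 80 negative word tokens is only one possible reading of "drawn without replacement" in the random condition.

```python
import random

NEGATIVE = ["Falsch!", "Schlecht!", "Nein!", "Daneben!", "Irrtum!"]
POSITIVE = ["Genau!", "Korrekt!", "Richtig!", "Exakt!", "Treffer!"]
N_BLOCKS, N_POS, N_NEG = 5, 24, 16  # per condition; 40 trials per block


def block_valences(rng):
    """Shuffled valence sequence for one block: 24 positive, 16 negative."""
    seq = ["pos"] * N_POS + ["neg"] * N_NEG
    rng.shuffle(seq)
    return seq


def blocked_condition(rng):
    """One fixed word pair per block; each word serves in exactly one block."""
    pos_order = rng.sample(POSITIVE, N_BLOCKS)
    neg_order = rng.sample(NEGATIVE, N_BLOCKS)
    return [
        [pos_word if v == "pos" else neg_word for v in block_valences(rng)]
        for pos_word, neg_word in zip(pos_order, neg_order)
    ]


def random_condition(rng):
    """Word varies from trial to trial; each word occurs equally often overall."""
    pos_pool = POSITIVE * N_POS  # 120 tokens, 24 per positive word
    neg_pool = NEGATIVE * N_NEG  # 80 tokens, 16 per negative word
    rng.shuffle(pos_pool)
    rng.shuffle(neg_pool)
    return [
        [pos_pool.pop() if v == "pos" else neg_pool.pop()
         for v in block_valences(rng)]
        for _ in range(N_BLOCKS)
    ]


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
blocked = blocked_condition(rng)
randomized = random_condition(rng)
```

In both conditions the valence sequence is independent of the participant's response, which is what makes the task a pseudo trial-and-error task.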
Participants were instructed to try and find out the rule underlying positive and negative feedback, by pressing the left or right button within 3000 msec after onset of the colored stimuli. It was mentioned that different positive and negative words could be used as feedback and that the rule might be related to stimulus color, stimulus type (digit vs. letter), digit parity, letter type (vowel vs. consonant), or combinations thereof. After each 40-trial block, there was a 1-min break during which participants were informed that now a new rule would be in effect. Again, they should try and find out this rule by trial-and-error learning. Actually, however, there was no rule governing feedback in any of the blocks.
Upon arrival, participants were informed about the course of the EEG study and gave written informed consent. After preparation for EEG, instructions were displayed. Then, the experiment started with the first 40-trial block, without preceding practice trials. Of the final 28 participants, 14 started with the blocked condition and 14 with the random condition. The experiment was conducted in a sound-attenuated, electrically shielded, and dimly lit recording booth. Before leaving, participants were paid and debriefed. They learned that in fact there was no rule in any of the blocks because a certain feedback had to be presented equally often to all participants to allow for a systematic analysis.
EEG Recording and Preprocessing
EEG was recorded from 64 scalp electrodes of the 10% system (Chatrian, Lettich, & Nelson, 1988) using electrode caps (EasyCap, Herrsching, Germany) with sintered Ag/AgCl electrodes. Frontopolar (FP) to occipital (O) and outer left (e.g., T7) to outer right (e.g., T8) areas were covered, plus two mastoid electrodes. Ground was AFz; the right mastoid (TP10) served as the reference. The EOG was monitored from electrodes below and above the right eye, and from outer left and right canthi. EEG was recorded continuously using a digital 64-channel BrainAmp amplifier and VisionRecorder software (Brain Products GmbH, Gilching, Germany). Sampling rate was 500 Hz; bandpass was 0.1–70 Hz.
Offline, EEG was re-referenced against averaged mastoids, subjected to ICA-based removal of blink artifacts (Jung et al., 2000), and segmented into (−100, 1000 msec) epochs relative to feedback onset. Only trials with a response within 3000 msec after onset of the imperative stimulus were included. Baseline correction involved the (−100, 0 msec) interval. Data were screened for artifacts (i.e., amplitudes exceeding ±120 μV), and contaminated trials were rejected (less than 10% per participant). Minimum numbers of trials retained were 72 and 73 (of 80) in the conditions blocked/negative feedback and random/negative feedback, respectively, and 109 and 106 (of 120) in the blocked/positive and random/positive conditions, respectively, with no significant blocked–random differences (both t < 0.2, p > .85). Artifact-free epochs were averaged separately for the four cells of the design.
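The epoch-level steps described above (baseline correction, amplitude-based rejection, averaging) can be illustrated for a single channel. This is a simplified sketch in plain Python, not the software actually used; re-referencing and ICA are omitted, and all function names are hypothetical.

```python
SFREQ = 500            # sampling rate in Hz, as reported
EPOCH_START_MS = -100  # epochs span -100 to 1000 msec around feedback onset
REJECT_UV = 120.0      # epochs with any |amplitude| above this are rejected


def ms_to_index(t_ms):
    """Sample index within the epoch for a latency relative to feedback onset."""
    return round((t_ms - EPOCH_START_MS) * SFREQ / 1000)


def baseline_correct(epoch):
    """Subtract the mean of the (-100, 0 msec) interval from every sample."""
    n_base = ms_to_index(0)
    baseline = sum(epoch[:n_base]) / n_base
    return [v - baseline for v in epoch]


def is_clean(epoch):
    """True if no sample exceeds the +/-120 µV criterion."""
    return all(abs(v) <= REJECT_UV for v in epoch)


def average_clean_epochs(epochs):
    """Baseline-correct, drop contaminated epochs, average the rest pointwise."""
    kept = [e for e in map(baseline_correct, epochs) if is_clean(e)]
    if not kept:
        raise ValueError("no artifact-free epochs left")
    n = len(kept)
    return [sum(samples) / n for samples in zip(*kept)], n
```

In the actual analysis this averaging would be carried out per electrode and separately for each of the four Condition × Feedback cells.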
Analyses were run under SPSS 22 (IBM Corp., Armonk, NY) and STATISTICA 5 (StatSoft Inc., Tulsa, OK). Mean RT was analyzed using 2 × 2 ANOVA with repeated measures on Condition (blocked, random) and Feedback (positive, negative). This analysis served to ensure that ERP differences between the blocked and random conditions were not due to a different amount of time elapsing between the imperative stimulus and the feedback. Regarding ERPs, visual inspection revealed three effects: Frontocentral P2 at around 200 msec was larger for negative than positive feedback in the random but not the blocked condition. In the FRN-D time range at around 300 msec, ERPs were less positive for negative than positive feedback over frontocentral midline electrodes in the blocked condition but not the random condition. At around 650 msec, centroparietal LPC was increased for negative compared with positive feedback in both conditions (see Figures 2 and 3).
P2, FRN/Rew-P, and LPC were quantified as mean ERP amplitudes in the 190–240, 280–360, and 500–800 msec time windows, respectively, and subjected to 5 × 5 × 2 × 2 ANOVAs with repeated measures on Caudality (anterior, frontocentral, central, centroparietal, posterior), Laterality (outer left, inner left, medial, inner right, outer right), Condition (blocked, random), and Feedback (negative, positive). The 25 clusters of electrodes were composed as follows (see Figure 2): anterior/outer left (AOL): AF7, F7, F5; anterior/inner left (AIL): AF3, F3, F1; anterior/medial (AM): FP1, FPz, FP2, Fz; frontocentral/outer left (FCOL): FT7, FC5; frontocentral/inner left (FCIL): FC3, FC1; frontocentral/medial (FCM): FCz; central/outer left (COL): T7, C5; central/inner left (CIL): C3, C1; central/medial (CM): Cz; centroparietal/outer left (CPOL): TP7, CP5; centroparietal/inner left (CPIL): CP3, CP1; centroparietal/medial (CPM): CPz; posterior/outer left (POL): PO7, P7, P5; posterior/inner left (PIL): PO3, P3, P1; posterior/medial (PM): Pz, POz, O1, Oz, O2. Inner and outer right clusters were homologous to inner and outer left clusters.
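The quantification of component amplitudes can be sketched as follows: a latency window is mapped to sample indices (500 Hz, epoch starting at −100 msec), the averaged ERP is averaged within that window, and the result is averaged across the electrodes of a cluster. The helper names are hypothetical, only a subset of the 25 clusters is spelled out, and treating the window end as inclusive is an assumption.

```python
SFREQ = 500            # Hz
EPOCH_START_MS = -100  # first sample of each epoch relative to feedback onset

WINDOWS_MS = {"P2": (190, 240), "FRN/Rew-P": (280, 360), "LPC": (500, 800)}

# A few of the 25 clusters, copied from the text.
CLUSTERS = {
    "AM": ["FP1", "FPz", "FP2", "Fz"],
    "FCIL": ["FC3", "FC1"],
    "FCM": ["FCz"],
    "CM": ["Cz"],
    "CPM": ["CPz"],
}


def window_slice(start_ms, end_ms):
    """Slice of sample indices covering a latency window (end inclusive)."""
    first = round((start_ms - EPOCH_START_MS) * SFREQ / 1000)
    last = round((end_ms - EPOCH_START_MS) * SFREQ / 1000)
    return slice(first, last + 1)


def mean_amplitude(erp, component):
    """Mean amplitude of one averaged ERP within a component's time window."""
    samples = erp[window_slice(*WINDOWS_MS[component])]
    return sum(samples) / len(samples)


def cluster_mean_amplitude(erps, cluster, component):
    """erps maps electrode name -> averaged ERP (list of samples in µV)."""
    values = [mean_amplitude(erps[ch], component) for ch in CLUSTERS[cluster]]
    return sum(values) / len(values)
```

One such value per participant, cluster, Condition, and Feedback cell would then enter the 5 × 5 × 2 × 2 ANOVA.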
Only effects of Condition and Feedback are reported. Significant effects in the ANOVAs were followed up using Scheffé's test. Greenhouse–Geisser correction was applied if appropriate. For control purposes, Condition and Feedback effects on mean vertical EOG (vEOG) amplitude in the P2, FRN-D, and LPC time windows were investigated by means of t tests.
Participants on average did not respond within 3000 msec after onset of the imperative stimulus in 0.46 trials (SEM = 0.19) of the blocked condition and in 0.43 trials (SEM = 0.24) of the random condition. This difference was not significant, t(27) = 0.24, p = .81.
The 2 × 2 repeated-measures ANOVA with factors Condition (blocked, random) and Feedback (positive, negative) revealed a main effect of Feedback, F(1, 27) = 9.3, p = .005, ηp2 = 0.26. Responses preceding positive feedback were significantly slower than responses preceding negative feedback. Importantly, however, neither the Condition main effect nor the Condition × Feedback interaction was significant (both Fs < 1). Note that the Feedback main effect on the RT of trial n can only be related to the feedback on trial n − 1. This is because feedback was drawn randomly, yet with the restriction of 60% positive and 40% negative feedback. Thus, the probability that negative feedback on trial n − 1 was followed by positive feedback on trial n was about twice as high as the probability for another negative feedback on trial n. Indeed, of the 160 trials following negative feedback, on average 105 involved positive feedback and only 55 another negative feedback. Because RT typically slows down after errors or negative feedback (posterror slowing; Rabbitt, 1966), in this study the slow responses after negative feedback were often followed by positive feedback. Indeed, we observed significantly longer RT on trials following negative compared with positive feedback, t(27) = 2.2, p = .035, which can explain the longer RT preceding positive than negative feedback.
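For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F value and its degrees of freedom using the standard identity ηp2 = F · df1 / (F · df1 + df2). A minimal sketch (the function name is hypothetical):

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from F and its (uncorrected) degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Feedback main effect on RT: F(1, 27) = 9.3
print(round(partial_eta_squared(9.3, 1, 27), 2))  # 0.26
```

Because the published F values are rounded, values recomputed in this way may differ from the reported ones in the last digit.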
In the 5 × 5 × 2 × 2 repeated-measures ANOVA for mean ERP amplitude between 190 and 240 msec, employing factors Caudality, Laterality, Condition, and Feedback, the main effect of Feedback was significant, F(1, 27) = 6.1, p = .021, ηp2 = 0.19. P2 was larger for negative than positive feedback. Significant interactions Caudality × Feedback, F(4, 108) = 17.0, p < .001, ε = .47, ηp2 = 0.39, and Laterality × Feedback, F(4, 108) = 3.3, p = .032, ε = .71, ηp2 = 0.11, were further qualified by a significant interaction Caudality × Laterality × Feedback, F(16, 432) = 3.0, p = .002, ε = .49, ηp2 = 0.10. According to Scheffé's test, P2 was larger for negative than positive feedback at the clusters AM (p < .001), FCIL (p = .002), FCM (p = .006), and CM (p = .008); for the remaining clusters, p > .10. Most notably, there was a significant interaction Condition × Feedback, F(1, 27) = 10.2, p = .004, ηp2 = 0.27. P2 was significantly increased for negative compared with positive feedback in the random condition (p = .002; see Figure 2) but not in the blocked condition (p = .995). Finally, the interaction Caudality × Condition × Feedback was significant, F(4, 108) = 3.5, p = .047, ε = .43, ηp2 = 0.11. At anterior and frontocentral clusters, P2 was larger for random/negative compared with both random/positive and blocked/negative (both ps < .001). At central leads, P2 was enhanced for random/negative compared with random/positive (p = .003; all other ps > .12). No P2 effects were found at centroparietal and posterior clusters (all ps > .16). The four-way interaction was not significant, F(16, 432) = 1.3, p = .263. Amplitude of the vEOG did not differ between the four combinations of Condition and Feedback, all ts < 1.2, all ps > .23, suggesting that the P2 effect was unrelated to blink artifacts.
FRN-D Time Range
In the 5 × 5 × 2 × 2 repeated-measures ANOVA for mean amplitude between 280 and 360 msec,2 employing factors Caudality, Laterality, Condition, and Feedback, the Condition main effect was significant, F(1, 27) = 19.7, p < .001, ηp2 = 0.42, with more positive ERPs in the blocked condition than in the random condition. The significant Feedback main effect, F(1, 27) = 9.0, p = .006, ηp2 = 0.25, resulted from greater relative negativity for negative compared with positive feedback.
The significant interaction Condition × Feedback, F(1, 27) = 17.1, p < .001, ηp2 = 0.39, was due to significant FRN-D in the blocked condition (p < .001) but not the random condition (p = .878; see Figure 2). Figure 3 shows the ERP difference maps (negative minus positive feedback) separately for conditions, suggesting that FRN-D in the blocked condition was largest over frontocentral areas and hence not caused by feedback effects on parietal P300. Most importantly, ERPs in the FRN time range were more positive after positive feedback in the blocked condition, compared with positive feedback in the random condition (p < .001), whereas the Condition effect was not significant for negative feedback (p = .132). There were significant interactions Laterality × Feedback, F(4, 108) = 7.7, p < .001, ε = .55, ηp2 = 0.22, Caudality × Feedback, F(4, 108) = 6.5, p = .003, ε = .50, ηp2 = 0.19, Laterality × Condition, F(4, 108) = 8.7, p = .001, ε = .41, ηp2 = 0.24, Caudality × Laterality × Feedback, F(16, 432) = 4.4, p < .001, ε = .42, ηp2 = 0.14, Caudality × Laterality × Condition, F(16, 432) = 4.9, p < .001, ε = .35, ηp2 = 0.15, and Laterality × Condition × Feedback, F(4, 108) = 6.4, p = .002, ε = .57, ηp2 = 0.19.
All these interactions were further qualified by a significant four-way interaction, F(16, 432) = 2.3, p = .022, ε = .57, ηp2 = 0.08. Whereas in a follow-up ANOVA for the random condition the three-way interaction Caudality × Laterality × Feedback was clearly nonsignificant, F(16, 432) = 0.9, p = .552, it was significant for the blocked condition, F(16, 432) = 5.6, p < .001, ε = .49, ηp2 = 0.17. Scheffé's test revealed significant FRN-D in the blocked condition for the nine clusters from FCIL to CPIR (all ps < .001). For all other clusters, p > .10. In the random condition, feedback effects were not significant at any of the 25 clusters (all ps > .97). Moreover, ERPs in the FRN-D time range were significantly more positive for blocked/positive compared with random/positive at the nine clusters from FCIL to CPIR (all ps < .001). This markedly contrasts with the nonsignificant Condition effect for negative feedback (p > .10 at all 25 clusters). Results therefore suggest that FRN-D in the blocked condition was due to ERP positivity for positive feedback (i.e., the Rew-P; see Figure 4). Mean vEOG amplitude again did not significantly differ between the four combinations of Condition and Feedback, all ts < 1, all ps > .40.
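The logic of attributing the blocked-versus-random difference in FRN-D to the Rew-P can be made explicit with a small sketch. The Condition × Feedback interaction contrast equals the condition difference for negative feedback (raw FRN) minus the condition difference for positive feedback (Rew-P). The cell means below are invented purely for illustration and are not the values observed in this study; all names are hypothetical.

```python
def frn_d(neg, pos):
    """FRN-D as a difference score: negative minus positive feedback (µV)."""
    return neg - pos


def decompose(cells):
    """Split the Condition x Feedback contrast into FRN and Rew-P shifts.

    cells maps (condition, feedback) -> mean amplitude in µV.
    """
    interaction = (frn_d(cells["blocked", "neg"], cells["blocked", "pos"])
                   - frn_d(cells["random", "neg"], cells["random", "pos"]))
    frn_shift = cells["blocked", "neg"] - cells["random", "neg"]
    rewp_shift = cells["blocked", "pos"] - cells["random", "pos"]
    return interaction, frn_shift, rewp_shift


# Hypothetical cell means (NOT data from the study), chosen so that only
# the ERP to positive feedback differs markedly between conditions.
example = {
    ("blocked", "pos"): 8.0, ("blocked", "neg"): 5.0,
    ("random", "pos"): 5.5, ("random", "neg"): 5.25,
}
interaction, frn_shift, rewp_shift = decompose(example)
print(interaction, frn_shift, rewp_shift)  # -2.75 -0.25 2.5
```

In this made-up example, nearly all of the interaction stems from the positive-feedback shift, which is the pattern the follow-up tests above are designed to detect.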
In the 5 × 5 × 2 × 2 ANOVA for mean ERP amplitude between 500 and 800 msec with repeated measures on Caudality, Laterality, Condition, and Feedback, the Feedback main effect was significant, F(1, 27) = 14.6, p < .001, ηp2 = 0.35, with larger LPC after negative than positive feedback. The main effect of Condition was clearly nonsignificant, F(1, 27) = 0.005, p = .94. The interactions Condition × Feedback, Caudality × Condition × Feedback and Laterality × Condition × Feedback were not significant (all Fs < 1.8, p > .19), suggesting similar feedback LPC effects in the two conditions. Significant interactions Caudality × Feedback, F(4, 108) = 13.0, p < .001, ε = .49, ηp2 = 0.33, and Laterality × Feedback, F(4, 108) = 8.6, p < .001, ε = .58, ηp2 = 0.24, were further qualified by a significant interaction Caudality × Laterality × Feedback, F(16, 432) = 2.8, p = .007, ε = .47, ηp2 = 0.10. LPC was significantly larger after negative than positive feedback at all CP and P clusters (all ps < .001), but not at A clusters (all ps > .09). Of the FC clusters, only FCIL and FCM (both ps < .01) showed a significant feedback LPC effect (all other ps > .21). Of the C clusters, the feedback LPC effect was significant for CIL, CM (both ps < .001), and CIR (p = .033; all other ps > .98). The four-way interaction was not significant, F(16, 432) = 1.5, p = .144, ε = .57. A targeted test at CPz where LPC was largest indicated similarly-sized LPC effects of feedback valence in the blocked and random conditions, t(27) = 1.0, p = .325. Also in the LPC time range, vEOG amplitude did not significantly differ between the four combinations of Condition and Feedback, all ts < 1.4, all ps > .17.
Additional P2 Analysis
Because anterior P2 increase for negative feedback in the random condition was a novel finding, we ran an additional analysis to allow for a better understanding of this effect. Briefly, we assumed that P2 modulation reflects a specific adaptation of information processing, that is, a negativity bias in case of high feedback ambiguity (see General Discussion). If so, the P2 effect should be reduced in participants who started with the blocked condition. This is because they were already familiar with the different feedback stimuli when the random phase began, which should have reduced feedback ambiguity. To test this idea, we computed a 2 × 2 × 2 ANOVA with a between-subject factor Order of Conditions (blocked first, n = 14 vs. random first, n = 14) and two within-subject factors Condition and Feedback for mean P2 amplitude (190–240 msec) at FCz. The main effect of Feedback, F(1, 26) = 7.7, p = .010, ηp2 = 0.23, and the Condition × Feedback interaction, F(1, 26) = 16.7, p < .001, ηp2 = 0.39, were significant, confirming the results reported earlier. The main effect of Order of Conditions was not significant, F(1, 26) = 0.9, p = .36. Most notably, however, there was a significant three-way interaction, F(1, 26) = 4.6, p = .042, ηp2 = 0.15. Scheffé's test indicated significantly larger P2 for negative than positive feedback in the random condition of the random-first group (p = .004; see Figure 5), but neither in the blocked condition of this group (p = .997), nor in the random condition of the random-last group (p = .682).3
Additional Analyses Related to Cognitive Load
One could argue that, in the random condition, not only feedback ambiguity was higher, but also (1) affective variability of positive and negative feedback words, (2) entropy/complexity of the feedback space, and (3) cognitive load. Although Experiment 2, below, among other purposes, addresses the first point and the second point may be mainly a matter of nomenclature (see General Discussion), the cognitive load aspect requires further analysis. That is, altered feedback ERPs could result from a general state of higher cognitive load throughout the whole random condition and/or from the expectation of higher cognitive load, rather than from higher cognitive load/ambiguity because of each single feedback. Two additional analyses rule out these ideas: First, mean RT to the imperative stimulus was not slower in the random condition (798 msec) than in the blocked condition [817 msec; t(27) = 0.61, p = .55]. Second, to test for differences with regard to the participants' expectation of (or preparation for) the feedback, we analyzed the prefeedback ERP baseline (−100, 0 msec), that is, the (700, 800 msec) interval relative to the response. Baseline correction involved the (−100, 0 msec) interval preceding the response. In the 61 × 2 repeated-measures ANOVA with factors Electrode (61) and Condition (blocked, random), both the Condition main effect, F(1, 27) = 1.0, p = .32, and the interaction, F(60, 1620) = 0.58, p = .76, ε = .11, were not significant, as was a targeted comparison at the most relevant electrode FCz, t(27) = 0.53, p = .60. Thus, neither RT nor feedback-preceding ERPs supported the assumption that altered feedback ERPs in the random condition were due to a general state of increased cognitive load, or to the expectation of higher cognitive load.
Experiment 2 aimed to confirm our suggestion that the P2 increase for negative feedback in the random condition reflects a negativity bias under high ambiguity. If so, the effect should generalize to tasks other than feedback processing. In Experiment 2, therefore, participants performed explicit valence discrimination on visually presented words. New sets of carefully matched positive and negative words were used. Similar to Experiment 1, in the blocked condition always the same two words, one positive and one negative, appeared across several consecutive trials, whereas in the random condition all 10 words (five positive, five negative) were presented randomly. We expected a larger frontocentral P2 at around 200 msec for negative than positive words in the random but not the blocked condition.
Sixteen right-handed participants (13 women) not enrolled in Experiment 1 were recruited at the University of Bonn. Age ranged from 18 to 41 years (M = 23.8 ± 5.2). Participants received partial course credit or were paid € 7.50/hr. An additional € 0.01 was awarded for each correct response. All had normal vision and were naive as to the purpose of the experiment.
Apparatus and Stimuli
Stimuli were presented on a 23-in. TFT monitor. Two sets, each containing 10 highly emotional adjectives, five positive and five negative, were taken from Schwibbe, Räder, Schwibbe, Borchardt, and Geiken-Pophanken (1994), including scores on a 1–7 Likert scale for valence, arousal, emotionality, imagery, concreteness, and meaning. In both sets, there were comparable differences in mean valence of the positive and negative subsets [Set 1: 5.10 vs. 2.81; t(8) = 19.2, p < .0000001; Set 2: 5.20 vs. 2.88; t(8) = 18.5, p < .0000001], whereas arousal (both ps > .34) and emotionality (both ps > .37) did not differ. Variability of the emotionality scores within the four subsets was generally low (<0.4 on a 1–7 Likert scale) and did not differ between subsets (all ps > .21). Moreover, in neither set were there positive–negative differences with regard to imagery, concreteness, and meaning (all ps > .61). This also applied to mean word frequency according to wortschatz.uni-leipzig.de (both ps > .43). Word length varied from 4 to 8 letters, with each of the four subsets including each word length once. Within a given set, all 10 words had different initials. Words were shown white-on-black in Arial font (size 30) under horizontal and vertical visual angles of 3° and 1°, respectively, at a viewing distance of 60 cm. The experiment was run under Presentation (Neurobehavioral Systems, Berkeley, CA). Responses were recorded using the left and right CTRL keys of a standard computer keyboard.
A trial started with a white fixation cross (1.5 × 1.5 cm) shown for a random duration between 1500 and 2000 msec in the center of the screen. The subsequent word remained on the screen until a response was made, but for no longer than 1500 msec. After correct responses, the next trial started immediately. After wrong responses, error feedback (Falsch! [wrong]) was provided for 2000 msec, before the next trial started. In the rare cases (<0.2%) where no response was given within 1500 msec, the words Zu langsam! [Too slow] were shown for 800 msec, before the next trial started.
A total of 200 trials were run. The 100 trials of the blocked condition comprised five blocks of 20 trials each. Within each block, one constant positive and one constant negative word were presented on 10 trials each, in random order. The two words were randomly drawn without replacement from one of the word sets at the beginning of each block. The 100 trials of the random condition were likewise subdivided into five 20-trial blocks (10 positive, 10 negative words), but words were randomly drawn without replacement on each trial from the word set not used in the blocked condition. Order of conditions and assignment of word set to condition were balanced across participants. Participants had to indicate as quickly and accurately as possible whether they saw a positive or a negative word by pressing the left or right CTRL key. After each block, there was a 1-min break during which participants learned about their current score (1 ct for each correct response within 1500 msec). Assignment of response side to valence was counterbalanced across participants. Upon arrival, participants were informed about the course of the EEG study and gave written informed consent. After preparation for EEG, instructions were displayed. Then, the experiment started with the first 20-trial block. The experiment was conducted in a sound-attenuated, electrically shielded, and dimly lit recording booth and took approximately 10 min.
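The block structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Presentation script used in the experiment: the function names and the placeholder word labels are hypothetical (the actual adjectives are not listed here), and the sketch assumes that each of the 10 words occurred exactly twice per random-condition block, which the text does not state explicitly.

```python
import random

# Placeholder labels; the actual adjectives are not listed in the text.
WORD_SET_A = {"positive": [f"posA{i}" for i in range(1, 6)],
              "negative": [f"negA{i}" for i in range(1, 6)]}
WORD_SET_B = {"positive": [f"posB{i}" for i in range(1, 6)],
              "negative": [f"negB{i}" for i in range(1, 6)]}


def blocked_trials(word_set, rng, n_blocks=5, reps=10):
    """Five blocks of 20 trials; each block uses one constant positive and
    one constant negative word, drawn without replacement across blocks."""
    positives = rng.sample(word_set["positive"], n_blocks)
    negatives = rng.sample(word_set["negative"], n_blocks)
    blocks = []
    for pos, neg in zip(positives, negatives):
        block = [(pos, "positive")] * reps + [(neg, "negative")] * reps
        rng.shuffle(block)  # random order within the block
        blocks.append(block)
    return blocks


def random_trials(word_set, rng, n_blocks=5, reps_per_word=2):
    """Five blocks of 20 trials with all 10 words intermixed.
    Assumption: every word occurs twice per block."""
    blocks = []
    for _ in range(n_blocks):
        block = ([(w, "positive") for w in word_set["positive"]] +
                 [(w, "negative") for w in word_set["negative"]]) * reps_per_word
        rng.shuffle(block)
        blocks.append(block)
    return blocks


rng = random.Random(1)  # fixed seed only to make the sketch reproducible
blocked = blocked_trials(WORD_SET_A, rng)
randomized = random_trials(WORD_SET_B, rng)
```

In this sketch, every blocked-condition block contains exactly two distinct words, whereas every random-condition block contains all 10 words of the other set, which is the contrast that defines low versus high ambiguity in the design.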
EEG Recording and Preprocessing
EEG was recorded from a set of 17 electrodes of the 10% system (Chatrian et al., 1988; Fp1, Fp2, AFz, F1, Fz, F2, FC1, FCz, FC2, C1, Cz, C2, TP9, CPz, TP10, Pz, and Oz), chosen to cover FCz and its neighbors plus midline anterior and posterior sites. A 64-channel ACTICAP system (Brain Products, Gilching, Germany) was used, with impedances kept below 10 kΩ. The ground was located near AFz. FCz served as the reference, and bipolar vEOG was monitored from an electrode below the right eye and Fp2. For EEG recording hardware, software, and parameters, see Experiment 1. Offline, EEG was filtered (0.1–15 Hz), re-referenced against averaged mastoids (TP9, TP10), subjected to eyeblink correction (Gratton, Coles, & Donchin, 1983), and segmented into (−100, 1000 msec) epochs relative to word onset. Baseline correction involved the (−100, 0 msec) interval. Trials with artifacts (i.e., amplitudes exceeding ±100 μV) were rejected (less than 6% per participant, with no difference between positive and negative words in both the blocked and random conditions, t < 0.7, p > .49). Artifact-free epochs were averaged separately for the four cells of the 2 × 2 design (Condition: blocked, random; Valence: positive, negative).
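The last three preprocessing steps (baseline correction, amplitude-based artifact rejection, and averaging) can be illustrated with a minimal single-channel sketch. It is not the software used in the study: the function names, the 2-msec sampling step, the toy data, and the choice to apply the ±100 μV criterion after baseline correction are assumptions made for illustration. A real pipeline would also check all channels of an epoch, not just one.

```python
def baseline_correct(epoch, times_ms, window=(-100, 0)):
    """Subtract the mean of the prestimulus window from every sample."""
    base = [v for v, t in zip(epoch, times_ms) if window[0] <= t < window[1]]
    offset = sum(base) / len(base)
    return [v - offset for v in epoch]


def has_artifact(epoch, limit_uv=100.0):
    """True if any sample exceeds +/- limit_uv (in microvolts)."""
    return any(abs(v) > limit_uv for v in epoch)


def average_clean_epochs(epochs, times_ms):
    """Baseline-correct, drop epochs with artifacts, and average the rest."""
    corrected = [baseline_correct(e, times_ms) for e in epochs]
    clean = [e for e in corrected if not has_artifact(e)]
    if not clean:
        raise ValueError("no artifact-free epochs left")
    erp = [sum(samples) / len(clean) for samples in zip(*clean)]
    return erp, len(clean)


# Toy data; the 2-msec sampling step is an assumption for illustration only.
times_ms = list(range(-100, 1000, 2))
flat = [5.0] * len(times_ms)                            # constant 5-uV offset
spike = [150.0 if t == 300 else 0.0 for t in times_ms]  # exceeds the limit
erp, n_kept = average_clean_epochs([flat, spike], times_ms)
```

With these toy epochs, the constant offset is removed by the baseline correction, the epoch containing the 150-μV sample is rejected, and the average is computed over the single remaining epoch.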
The software used was identical to that in Experiment 1. Mean RT and error percentage were analyzed using 2 × 2 ANOVAs with repeated measures on Condition (blocked, random) and Valence (positive, negative). ERP analyses focused on the frontocentral P2 peaking at 210 msec. P2 was quantified as mean amplitude between 180 and 230 msec at FCz and subjected to a 2 × 2 ANOVA analogous to that for RT. Our hypotheses regarding larger P2 for negative than positive words in the random condition but not in the blocked condition were tested with Bonferroni-corrected one-tailed t tests. Using two-tailed t tests, we controlled for Valence effects on mean vEOG amplitude in the P2 time range.
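Quantifying a component as mean amplitude in a fixed latency window amounts to averaging the samples of the averaged waveform that fall inside that window. The following sketch shows the computation for a single channel. The function name, the 2-msec sampling step, the inclusive window bounds, and the toy waveform are assumptions for illustration and are not taken from the analysis software.

```python
def mean_amplitude(erp, times_ms, window=(180, 230)):
    """Mean voltage of an averaged waveform within window (inclusive, msec)."""
    samples = [v for v, t in zip(erp, times_ms) if window[0] <= t <= window[1]]
    if not samples:
        raise ValueError("window lies outside the epoch")
    return sum(samples) / len(samples)


# Toy waveform whose voltage equals its latency, so the expected mean
# amplitude is simply the midpoint of the chosen window.
times_ms = list(range(-100, 1000, 2))
ramp = [float(t) for t in times_ms]
p2_exp2 = mean_amplitude(ramp, times_ms)              # 180-230 msec window
p2_exp1 = mean_amplitude(ramp, times_ms, (190, 240))  # 190-240 msec window
```

Applied to each participant's four cell averages at FCz, such a function would yield the values entering the 2 × 2 ANOVA. The second call uses the 190–240 msec window reported for the P2 analysis of Experiment 1.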
The 2 × 2 ANOVA for mean RT revealed a significant main effect of Condition, F(1, 15) = 70.9, p < .001, ηp2 = 0.83. RT was longer in the random condition (538 msec, SEM = 10.3 msec) than in the blocked condition (464 msec, SEM = 11.2 msec). Neither the Valence main effect, F(1, 15) = 2.54, p = .13, ηp2 = 0.15, nor the interaction, F(1, 15) = 0.04, p = .85, was significant.
In the 2 × 2 ANOVA for error percentage, the Condition effect was marginally significant, F(1, 15) = 3.95, p = .066, ηp2 = 0.21. Percentage of errors tended to be higher in the blocked condition (2.9%) than in the random condition (2.3%), which may be a side effect of the faster responses in the blocked condition (speed–accuracy trade-off). Neither the Valence main effect, F(1, 15) = 0.20, p = .67, nor the interaction, F(1, 15) = 0.35, p = .56, was significant.
P2 (180–230 msec)
The 2 × 2 ANOVA yielded significant main effects of Condition, F(1, 15) = 6.9, p = .019, ηp2 = 0.32, and Valence, F(1, 15) = 11.8, p = .004, ηp2 = 0.44, on P2 amplitude at FCz. P2 was larger in the random compared with the blocked condition and for negative compared with positive words. Most importantly, however, the interaction was also significant, F(1, 15) = 5.5, p = .034, ηp2 = 0.27. Bonferroni-corrected planned comparisons indicated larger P2 for negative than positive words in the random condition, t(15) = 3.4, p = .004 (one-tailed), d = 0.85, but not in the blocked condition, t(15) = 0.07, p = .94 (see Figure 6). Neither in the blocked condition nor in the random condition was there a significant Valence effect on vEOG amplitude, both t(15) < 0.35, p > .73.
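For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F value and its degrees of freedom as F × df_effect / (F × df_effect + df_error). The sketch below also includes one common convention for a paired-samples effect size, d = t / √n. That this second formula was the one used here is an assumption, although it reproduces the value reported for the random condition. The function names are illustrative.

```python
import math


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_dz(t_value, n):
    """Paired-samples effect size d_z = t / sqrt(n) (one common convention)."""
    return t_value / math.sqrt(n)


# Values reported above for P2 amplitude at FCz, df = (1, 15), n = 16.
eta_condition = partial_eta_squared(6.9, 1, 15)
eta_valence = partial_eta_squared(11.8, 1, 15)
eta_interaction = partial_eta_squared(5.5, 1, 15)
d_random = cohens_dz(3.4, 16)
```

Rounded to two decimals, these calls give 0.32, 0.44, and 0.27 for the three ANOVA effects and 0.85 for the planned comparison in the random condition, matching the values in the text.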
Findings in the FRN-D Time Range: The Rew-P and Feedback Ambiguity
Experiment 1 of this study investigated feedback-related ERPs and used a pseudo trial-and-error learning task with blocked versus randomized presentation of multiple positive and negative feedback words. Results are clear-cut and support our hypotheses. The FRN-D, peaking at FCz, was significantly larger in the blocked than in the random condition. This finding was independent from eye movements and posterior P300. Recently, Pfabigan et al. (2014) reported larger FRN-D for blocked versus randomized presentation of symbolic and facial feedback, but FRN-D in the random condition was still significant. Our approach was somewhat different, as feedback varied within the verbal modality and a novel finding concerned the complete absence of FRN-D in the random condition [FCz, +0.1 μV, t(27) = 0.1, p = .921, d = 0.005]. Given the relatively large sample size (n = 28) and the small-to-medium effect in the blocked condition [FCz, −2.3 μV, t(27) = 4.9, p < .001, d = 0.33], low statistical power should be no issue here. Also, lack of an FRN-D in the random condition was not merely due to a delayed relative ERP negativity for negative feedback: As Figure 2 (right) shows, frontocentral ERPs were more positive for negative than positive feedback in the whole interval from 300 to 1000 msec.
Scrutiny of the findings indicated that FRN-D in the blocked condition resulted from larger Rew-P for positive feedback. The blocked–random difference for negative feedback was clearly nonsignificant. Again, the latter finding cannot be attributed to low statistical power: A medium-sized blocked–random effect for positive feedback [FCz, 2.8 μV, t(27) = 5.6, p < .001, d = 0.40] contrasts with a nonsignificant difference for negative feedback [FCz, 0.5 μV, t(27) = 0.8, p = .437, d = 0.07]. Note that the polarity of the latter difference was exactly opposite to what one would expect if larger FRN-D in the blocked condition were due to increased FRN.
Our data therefore suggest that instrumental learning from the consequences of one's behavior is strongly driven by a rewarding process triggered by positive feedback, at least under conditions of low feedback ambiguity as in the present blocked condition. According to the literature, positive feedback activates the midbrain dopaminergic reward system (Schultz, Dayan, & Montague, 1997), which is seen in a mediofrontal Rew-P (Baker & Holroyd, 2011; Holroyd et al., 2008, 2011; Martin, Potts, Burton, & Montague, 2009; Eppinger et al., 2008; Potts et al., 2006) of possibly striatal origin (e.g., Foti et al., 2011; Martin et al., 2009; Nieuwenhuis, Slagter, Von Geusau, Heslenfeld, & Holroyd, 2005). On the other hand, according to the original RL theory of FRN (Holroyd & Coles, 2002), negative feedback leads to a phasic decrease in midbrain dopaminergic activity, which disinhibits ACC and thereby produces a worse-than-expected signal, the FRN. Our study, however, adds to the recent evidence that negative feedback may not always directly contribute to FRN-D and instead is mainly characterized by the absence of the Rew-P. For example, Kreussel et al. (2012) observed a significant magnitude (win/loss of 10 vs. 40 ct) by valence interaction, with larger FRN-D for high- versus low-magnitude trials (see also Wu & Zhou, 2009; Goyer et al., 2008). This effect was caused by larger Rew-P for high- vs. low-magnitude win trials, instead of larger FRN for high- versus low-magnitude loss trials. As outlined in the Introduction, RL theory (Holroyd & Coles, 2002) can easily incorporate the Rew-P as a “better-than-expected signal,” which is consistent with the general understanding of midbrain reward processing (Schultz, 2006). However, more research is necessary to further investigate the conditions under which FRN-D and experimental effects on it are driven, at least partly, by negative feedback.
The present approach and findings can be distinguished from and neatly extend related earlier work. Mars et al. (2004) found smaller FRN-D if negative feedback contained additional (graded) information about over- and underestimation of a 1-sec interval. However, the authors did not refer to reduced discriminability of the feedback in the graded conditions, but instead emphasized the lower importance of the correct/incorrect information compared with the more useful information about over- and underestimation. Thus, Mars et al.'s (2004) study differs from this study in which feedback was equally informative for rule learning in the blocked and random conditions. Liu and Gehring (2009), on the other hand, found smaller FRN-D if feedback valence depended on feature conjunctions rather than single features. Subsequently, Liu et al. (2014) observed larger FRN-D for perceptually dissimilar (S vs. T) as compared with similar (E vs. F) positive and negative feedback stimuli. On the basis of these findings, the authors called for an extension of the RL theory (Holroyd & Coles, 2002), such that “the FRN is elicited only when the reward properties of the feedback are indicated by single, distinctive features” (Liu & Gehring, 2009, p. 635). The present findings lend valuable support to this notion, though we shall elaborate on some differences between findings/interpretations.
First, in Liu et al. (2014) FRN-D was still significant (p < .01) when perceptual discriminability was low (E vs. F), whereas it was completely absent in the present random condition. It seems likely that several trials are simply needed for tuning one's perception toward an efficient E–F discrimination, that is, to establish two sufficiently distinct perceptual templates. Once this is achieved, feedback can again be discriminated in the FRN time range, which explains the still-significant FRN-D in Liu et al. (2014). In contrast, in our random condition, feedback words are, in principle, easily perceptually discriminable, but a one-to-one mapping of two distinct perceptual templates onto positive and negative valence cannot be established. This ambiguity seems to impose more fundamental and long-lasting constraints on the Rew-P, can account for the complete absence of FRN-D in our random condition, and also explains the difference from Baker and Holroyd's (2011) study. These authors found their Rew-P-driven FRN-D to be delayed by about 100 msec, rather than completely suppressed, if the feedback contained an additional cue on how to respond on the next trial (repeat response vs. switch). We suggest that the parallel processing of the two types of information (current reward vs. cue for next trial) caused an attentional conflict that slowed down the processing of reward information and, hence, delayed the Rew-P. However, feedback valence could still be determined on a perceptual level in that study ($$$ vs. $ symbols), which explains why the Rew-P was not completely suppressed.
Second, Liu et al. (2014) did not investigate whether discriminability effects on FRN-D were driven by positive or negative feedback. Rather, they followed the FRN interpretation of FRN-D and proposed that for negative feedback there is a mismatch between the actual perceptual input and a perceptual template for positive feedback. It was argued that if perceptual discriminability of the feedback is low, such a template does not exist or is only vaguely defined, which reduces the mismatch and, hence, the FRN. However, this study clearly showed that FRN-D was due to greater Rew-P in the blocked condition involving easily discriminable feedback. In analogy to Liu et al.'s (2014) idea of a perceptual mismatch underlying the FRN, we therefore suggest that the reduced FRN-D for high-ambiguity feedback is due to the fact that, in the absence of a clearly defined perceptual template for positive feedback, the latter cannot result in a perceptual match, which disables the Rew-P.
Regarding our use of the term “ambiguity,” we hasten to admit that other labels might be used, as long as they adequately describe ecologically valid scenarios in which the valence of (feedback) stimuli cannot be determined based on a purely perceptual discrimination. We do, however, suggest that “ambiguity” is more appropriate than, for example, “complexity,” to denote a difficulty accessing the meaning of a stimulus. Complexity can refer to the amount of information contained in a feedback beyond the mere concept of success versus failure, such as graded performance feedback (Mars et al., 2004) or instructions on how to respond on the next trial (Baker & Holroyd, 2011). Note, however, that those manipulations of feedback complexity resulted in a smaller (Mars et al., 2004) or delayed (Baker & Holroyd, 2011) FRN-D, but not in a complete suppression of FRN-D as in this study.
Centroparietal LPC and the Controlled Semantic Processing of Feedback
Unlike the Rew-P, the later centroparietal LPC (500–800 msec) effectively distinguished positive from negative feedback in both the blocked and random conditions, with larger LPC for negative feedback. In general, LPC effects of emotional words are often driven by arousal, but valence effects have also been reported (for a review, see Citron, 2012). However, the literature is rather inconsistent, including reports of larger LPC both for positive than negative and/or neutral words (e.g., Herbert, Junghöfer, & Kissler, 2008) and for negative than positive and/or neutral words (e.g., Schacht & Sommer, 2009b; Kanske & Kotz, 2007). As these findings may have limited significance for this study where the words (allegedly) indicated performance feedback, prior studies on the effects of feedback valence on LPC also have to be considered. Again, some studies reported larger P3b/LPC for positive than negative feedback (e.g., Pfabigan et al., 2011, 2014; Hajcak et al., 2007) and others reported larger P3b/LPC for negative than positive feedback (West et al., 2012, 2014; Frank et al., 2005). In view of the inconsistent literature, we can only refer to a widely agreed conceptualization of LPC in the affective domain, that is, as an index of the controlled attentional processing of motivationally relevant stimuli (Schacht & Sommer, 2009a; Fischler & Bradley, 2006; Schupp et al., 2000). Thus, larger LPC in this study may reflect the greater motivational relevance of negative feedback, which, unlike positive feedback, signals the need for behavioral adjustment (i.e., trying a different rule vs. staying with the currently tested rule). This idea fits well with the study by San Martin et al. (2013), which highlighted the role of the LPC for behavioral adjustment: In a probabilistic gambling task, differences in LPC amplitude (416–796 msec) between high- and low-magnitude feedback predicted the individual scores of gain maximization and loss minimization, whereas FRN-D was no predictor.
Frontocentral P2 and the Negativity Bias under High (Feedback) Ambiguity
A surprising finding concerned larger frontocentral P2 for negative than positive feedback in the random but not the blocked condition. Because P2 preceded the FRN-D, we can conclude that ERP effects of feedback valence actually occurred earlier for the hard than for the easy discrimination, which can only be explained in terms of strategic adaptations (i.e., top–down tuning) of information processing. To be specific, anterior P2 has been shown to be sensitive to the task relevance (Potts, 2004; Potts, Patel, & Azzam, 2004) and motivational relevance (Potts et al., 2006; Carretié, Hinojosa, Martín-Loeches, Mercado, & Tapia, 2004) of a stimulus, as well as to top–down attentional selection (Anllo-Vento, Luck, & Hillyard, 1998) and motivated perception (Amodio, 2010). Accordingly, anterior P2 is increased for stimuli that match a frontally represented template of the target stimulus in a given task or motivational state. This is in line with San Martin et al. (2013), who interpreted their increased frontocentral P2 for high-magnitude compared with low-magnitude gains and losses in terms of an “implicit appreciation of the task relevance of a stimulus [which] can be biased” (p. 7018). Thus, larger P2 for negative feedback in the random condition may reflect top–down attentional selection, which is necessary because of high feedback ambiguity in combination with the higher relevance of negative feedback (see above). This interpretation fits nicely with the fact that P2 modulation was restricted to the random-first group, for whom feedback ambiguity and, hence, the need for an attentional bias toward the behaviorally more relevant negative feedback should be particularly high. In the blocked condition, on the other hand, only two words have to be discriminated for a series of trials, and a valence-based attentional selection as reflected in P2 should not be required.
Note that effects of task-relevant valence information on anterior P2 are quite plausible, even for emotional words. In several studies, the valence of written words already affected posterior P1 (e.g., Hofmann, Kuchinke, Tamm, Võ, & Jacobs, 2009; Scott, O'Donnell, Leuthold, & Sereno, 2009). Thus, valence is represented in the brain not later than 100 msec after word onset and, hence, can feed into a P2 mechanism of attentional selection in a similar way as has been suggested for nonaffective perceptual features (see above). Of course, further research on this topic is necessary.
At the very least, the feedback P2 effect allows for two conclusions: First, there are ERP effects of feedback valence prior to the FRN-D time range. These should be investigated in more detail in future studies, all the more because the eliciting conditions of high feedback ambiguity are ecologically valid but have not been routinely tested until now. Second, anterior P2 and Rew-P have to be carefully distinguished. This notion is supported both by the present findings of larger P2 for negative feedback as opposed to larger Rew-P for positive feedback and by Foti et al.'s (2011) study. Using spatiotemporal PCA, these authors separated a frontocentral P2 peaking at 200 msec from a frontocentral Rew-P peaking at 300 msec. Note that in Foti et al. (2011) feedback valence had effects on Rew-P, but not on P2. This, however, may have been due to the low-ambiguity feedback used, which provided conditions similar to the present blocked condition where there was no P2 modulation either.
From a more general perspective, increased Rew-P for positive feedback in the blocked condition and increased P2 for negative feedback in the random condition can be understood as manifestations of positivity and negativity biases, respectively. These biases are assumed to reflect adaptive strategies in response to situational demands (e.g., Scholer, Stroessner, & Higgins, 2008; for a review, see Peeters & Czapinski, 1990). For example, Sarinopoulos et al. (2010) found that under conditions of uncertainty the likelihood of negative events was strongly overestimated and selective neural responsivity to aversive stimuli increased. Interestingly, Xu et al. (2011) observed, in a gambling task, larger frontocentral feedback P2 after an uncertain cue than after a certain cue, which was interpreted in terms of a negativity bias. Thus, under uncertainty, as with the present random condition, negative feedback might be processed particularly intensely, whereas in case of certainty, as with the present blocked condition, positive feedback is prioritized, perhaps reflecting “well-established approach acts executed in a safe environment familiar to the actor” (Peeters & Czapinski, 1990, p. 37). The idea of a link between (fronto-)central P2 amplitude and an attentional negativity bias is well in line with two further studies (Weymar, Bradley, Hamm, & Lang, 2013; Huang & Luo, 2006). For example, in Weymar et al. (2013) words elicited a larger P2 when presented in a font color signaling threat of electric shock.
One might argue that, if the observed P2 effect indeed reflected a negativity bias under conditions of high ambiguity, this should be a general pattern not restricted to feedback (cf. Ito, Larsen, Smith, & Cacioppo, 1998). This notion was neatly confirmed by Experiment 2: During speeded explicit valence discrimination of highly emotional adjectives, negative words elicited larger P2 than positive words, but only in the high-ambiguity condition in which words were presented randomly. In the blocked condition, in contrast, always the same two words had to be discriminated for a series of trials, and P2 amplitude was virtually identical for positive and negative adjectives. Note that this conceptual replication of the ambiguity-sensitive valence effect on P2 used sets of affective words that were very carefully matched for various parameters such as emotionality, variability of emotionality within the sets, frequency of occurrence, and concreteness (see Experiment 2, Method). This strongly suggests that the P2 effect in the random condition of Experiment 1 was not an artifact of differences of this type between positive and negative words, which might somehow have enabled a more effective attentional selection of negative words. Rather, the observed P2 modulation obviously reflects the systematic influence of ambiguity on affective processing, with negative information being prioritized early on in an ambiguous context. It seems plausible that the mental representation of feedback, which represents a particularly informative, behaviorally relevant type of affective information, is prone to such a bias in equal measure. More studies using, for example, different stimulus materials and/or different implementations of ambiguity are needed to further support this claim.
Role of Probability Effects
In Experiment 1, we presented negative feedback on only 40% of the trials, as opposed to 60% of positive feedback, to reduce potential frustration of our participants that might result from their inability to learn any of the alleged rules. However, FRN-D (Hajcak et al., 2007; Hewig et al., 2007; Holroyd & Krigolson, 2007), Rew-P (Kreussel et al., 2012; Eppinger et al., 2008; Holroyd et al., 2008), and P3b (Von Borries, Verkes, Bulten, Cools, & de Bruijn, 2013; Hajcak et al., 2007) are all known to be larger for low-probability feedback. It is therefore important to point out that our findings on P2 and Rew-P cannot be due to probability effects, because they concern differences between the blocked and random conditions to which the 60:40 ratio applied equally. On the other hand, LPC was larger for negative feedback in both conditions; therefore, we cannot completely rule out a probability effect for the LPC. Note, however, that probability effects are rather unlikely in this late ERP time range (500–800 msec). Consistently, the P3b that showed probability effects in Hajcak et al. (2007) and Von Borries et al. (2013) peaked much earlier, that is, right before 400 msec.
To sum up, in line with recent work (Liu et al., 2014; Pfabigan et al., 2014; Liu & Gehring, 2009) the present data suggest that FRN-D requires feedback that can be discriminated on a perceptual level. Although one conclusion may be that future research should employ only easily discriminable feedback (Liu & Gehring, 2009) or blocked rather than randomized presentation if multiple positive and negative feedbacks are used (Pfabigan et al., 2014), we advocate a somewhat different view: Feedback in everyday social contexts will often be of the more ambiguous type where identity of the specific stimuli indicating praise and blame cannot be fully predicted. This then requires a controlled semantic analysis, which additionally may benefit from an attentional bias operating on quickly available, implicit valence information. This study suggests centroparietal LPC and anterior P2, respectively, as potential ERP indices of these two mechanisms. At any rate, future studies should investigate situations of more ambiguous feedback in more detail, to get a broader understanding of human feedback processing. Our findings strongly support the notion that FRN-D and experimental effects on it are primarily due to the Rew-P elicited by positive feedback (e.g., Eppinger et al., 2008; Holroyd et al., 2008; Cohen et al., 2007). Finally, larger P2 for negative feedback in the random condition provides first evidence for a negativity bias under conditions of high feedback ambiguity, whereas the dominant Rew-P in the blocked condition suggests a positivity bias under conditions of low feedback ambiguity.
The authors thank Cynthia Bengs, Kristin Katschak, Hannah Kirsten, Jan Romppel, Rafaela Warkentin, and Christina Wittinghofer for help with data acquisition and Anja Leue for helpful comments on an earlier draft of this article.
Reprint requests should be sent to Henning Gibbons, Department of Psychology, University of Bonn, Kaiser-Karl-Ring 9, D-53111 Bonn, Germany, or via e-mail: firstname.lastname@example.org.
Although one might think of other labels (see General Discussion), the term “ambiguity” is used in the present article to refer to situations where no simple one-to-one mapping of perceptual features of the feedback onto feedback valence exists.
Note that the significant, condition-dependent effect of feedback valence on amplitude of the preceding P2 component precludes a meaningful peak-to-peak analysis of the FRN.
For reasons of completeness, analogous 2 × 2 × 2 ANOVAs were also computed for the Rew-P (280–360 msec at FCz) and the LPC (500–800 msec at CPz). Unlike with the P2 component, however, the effects of Feedback and Condition were not further modulated by Order of Conditions [Rew-P: interaction Order × Feedback, F(1, 26) = 1.3, p = .265; interaction Order × Condition × Feedback, F(1, 26) = 0.01, p = .921; LPC: interaction Order × Feedback, F(1, 26) = 0.4, p = .537; interaction Order × Condition × Feedback, F(1, 26) = 0.002, p = .966].