The differences between erroneous actions that are consciously perceived as errors and those that go unnoticed have recently become an issue in the field of performance monitoring. In EEG studies, error awareness has been suggested to influence the error positivity (Pe) of the response-locked event-related brain potential, a positive voltage deflection prominent approximately 300 msec after error commission, whereas the preceding error-related negativity (ERN) seemed to be unaffected by error awareness. Erroneous actions, in general, have been shown to promote several changes in ongoing autonomic nervous system (ANS) activity, yet such investigations have only rarely taken into account the question of subjective error awareness. In the first part of this study, heart rate, pupillometry, and EEG were recorded during an antisaccade task to measure autonomic arousal and activity of the CNS separately for perceived and unperceived errors. Contrary to our expectations, we observed differences in both Pe and ERN with respect to subjective error awareness. This was replicated in a second experiment, using a modified version of the same task. In line with our predictions, only perceived errors provoke the previously established post-error heart rate deceleration. Also, pupil size yields a more prominent dilatory effect after an erroneous saccade, which is also significantly larger for perceived than unperceived errors. On the basis of the ERP and ANS results as well as brain–behavior correlations, we suggest a novel interpretation of the implementation and emergence of error awareness in the brain. In our framework, several systems generate input signals (e.g., ERN, sensory input, proprioception) that influence the emergence of error awareness, which is then accumulated and presumably reflected in later potentials, such as the Pe.
Error monitoring is a fundamental part of everyday life and is implemented in an extensive network throughout the brain. Yet monitoring of actions and outcomes on a subconscious neuronal level is probably not sufficient to fully adapt one's behavior and attitudes to ameliorate a formerly imperfect outcome: Some degree of subjective insight into the defectiveness of one's own actions seems necessary.
In recent decades, cognitive neuroscience has identified areas in the brain that form a so-called “performance monitoring network,” which evaluates and supervises ongoing actions, matches intended against actual outcomes, and initiates a cascade of remedial actions if necessary. The cortical portion of this network is, in its major parts, situated in the posterior medial (pMFC; Ridderinkhof, Ullsperger, Crone, & Nieuwenhuis, 2004) and lateral parts (Kerns et al., 2004; Miller & Cohen, 2001) of the frontal lobe. Outside the frontal cortex, the insular cortex (IC) is also commonly observed as a part of this system (Magno, Foxe, Molholm, Robertson, & Garavan, 2006; Ullsperger & von Cramon, 2004).
In electrophysiological research, two functional correlates of error processing have mainly been reported: the error-related negativity (Ne/ERN; Gehring, Goss, Coles, Meyer, & Donchin, 1993; Falkenstein, Hohnsbein, Hoormann, & Blanke, 1990), a negative voltage deflection peaking at fronto-central electrode sites approximately 50–100 msec after the commission of an error, and the more parietally distributed error positivity (Pe; Overbeek, Nieuwenhuis, & Ridderinkhof, 2005; Nieuwenhuis, Ridderinkhof, Blom, Band, & Kok, 2001; Falkenstein, Hoormann, Christ, & Hohnsbein, 2000), which emerges at about 200 msec after an error. Consequently, those potentials are commonly thought to reflect activity of the areas implicated in error monitoring. Converging evidence suggests the ERN to be generated in the rostral cingulate zone of the pMFC (Debener et al., 2005; Ullsperger & von Cramon, 2001; van Veen, Cohen, Botvinick, Stenger, & Carter, 2001; Dehaene, Posner, & Tucker, 1994). Attempts to find the source of Pe have generated rather heterogeneous results (Vocat, Pourtois, & Vuilleumier, 2008; O'Connell et al., 2007; Van Boxtel, Van Der Molen, & Jennings, 2005; Herrmann, Rommler, Ehlis, Heidrich, & Fallgatter, 2004; Brázdil et al., 2002; van Veen & Carter, 2002).
Previous work has demonstrated that errors also elicit a strong response of the autonomic nervous system (ANS). Most prominently, relative heart rate deceleration is consistently reported to occur subsequent to errors (Fiehler, Ullsperger, Grigutsch, & von Cramon, 2004; van der Veen, Nieuwenhuis, Crone, & van der Molen, 2004; Crone et al., 2003; Hajcak, McDonald, & Simons, 2003). Also, other indices of autonomic arousal, such as changes in skin conductance and pupil diameter, are associated with error commission (O'Connell et al., 2007; Critchley, Tang, Glaser, Butterworth, & Dolan, 2005).
Although the differential properties of these systems, which monitor and respond to errors on a neuronal or physiological level, are fairly well understood and have motivated a great deal of research, disproportionately little consideration has been given to the issue of subjective error perception. One of the first electrophysiological studies that explicitly addressed the consequences of whether subjects are aware of the erroneous nature of their own actions or not was conducted by Nieuwenhuis et al. (2001). They employed an eye movement paradigm (antisaccade task, AST), which yielded a sufficient ratio of perceived and unperceived errors, and found the Pe amplitude to be sensitive to whether subjects recognized their erroneous saccades or not. The ERN, on the other hand, was unaffected by the subjects' subjective error awareness. Although this finding was in contradiction to an earlier study that found such a modulation in the ERN (Scheffers & Coles, 2000), the result was later replicated by Endrass, Reuter, and Kathmann (2007). In the same vein, a series of Go/NoGo experiments has been conducted to further investigate these effects in fMRI (Hester, Foxe, Molholm, Shpaner, & Garavan, 2005) and using combined recordings of EEG and electrodermal activity (skin conductance reaction, SCR; O'Connell et al., 2007; see also Shalgi, Barkan, & Deouell, 2009; Endrass, Franke, & Kathmann, 2005). These studies also did not find a significant difference in ERN/pMFC activity. Additionally, in O'Connell et al.'s study, SCR has been shown to be increased on perceived errors compared with correctly withheld responses while being severely diminished on unperceived errors.
Using an adapted AST from the original Nieuwenhuis et al. (2001) study in an fMRI experiment, Klein et al. (2007) found the left anterior inferior IC to be exclusively active on perceived errors compared with unperceived errors, suggesting a decisive role of this area in the generation of subjective error awareness (see also Craig, 2002). Interestingly, the insular regions have been shown to be related to both the performance of subjects in an interoceptive heartbeat detection task (Critchley et al., 2005; but see Khalsa, Rudrauf, Feinstein, & Tranel, 2009) and the arousal measured via pupillometry (Critchley, Tang, et al., 2005), suggesting a decisive role of differential responses of ANS in error awareness.
It remains an open question, though, which differential role the ANS plays during error processing and how this is influenced by (or is influencing) subjective error awareness. Heart rate deceleration and pupil dilation have been shown to differ with respect to erroneous and correct responses (Critchley, Tang, et al., 2005). None of these error-related effects of autonomic arousal have been investigated with respect to subjective error awareness.
The current study was designed to test the differences between the ANS activity evoked by response errors in comparison with correct responses, specifically with regard to subjective error awareness. We, therefore, employed the same AST that was used in earlier studies (Endrass et al., 2007; Klein et al., 2007; Nieuwenhuis et al., 2001) and directly compared the autonomic reactions (ECG and pupil diameter) with respect to perceived and unperceived errors.
Furthermore, as we observed deviating results from the aforementioned AST studies with regard to the ERN in the first experiment, we conducted a second experiment with a slightly different stimulus layout and timing, in which we show that the ERN amplitude is indeed sensitive toward the difference between perceived and unperceived errors. This is backed up by the additional finding that the amplitude of ERN varies significantly with the speed of the participants' rating of their accuracy: Larger ERN amplitudes lead to faster error signaling. On the basis of these data and previously established relations between pMFC and autonomic responses, we propose a central role for pMFC in the generation of subjective error awareness.
Nineteen neurologically and psychiatrically healthy subjects were recruited from the institute's database. One subject's data set had to be excluded because of technical problems (neither eye tracker nor EOG yielded a clean signal); another subject's data set showed an insufficient number of errors (fewer than 10 in both conditions) and was, therefore, also excluded from further analysis. This left a data set of 17 right-handed subjects (two women) with a mean age of 24.4 years (SD = 3.9 years, ranging from 20 to 33 years). All subjects had normal or corrected-to-normal vision. Participants gave written informed consent and received a payment of €10 per hour of the session.
We adopted the experimental paradigm from Klein et al. (2007), which itself is an adaptation of the original AST paradigm used by Nieuwenhuis et al. (2001). In this AST, each trial began with a random fixation period of 2050–3250 msec, in which two white dashed square outlines (subtending a visual angle of 1.3° × 1.3°) and a central fixation cross (0.3° of visual angle in diameter) were presented in the vertical center of an otherwise black screen. The centers of the squares were 22° apart from one another. After the fixation period, a white circle (diameter of 0.9°) was displayed in one of the two squares for 200 msec. Subjects were instructed to direct their gaze away from the central fixation cross upon onset of this stimulus into the square that did not contain the circle. After a period of 1500 msec, a query appeared, requiring the subjects to indicate (via a manual response box) whether they thought that they had made a mistake on the trial or not. A mistake was explained to the subjects as a trial on which they directed their gaze toward the square containing the imperative circle stimulus before directing their gaze toward the required location. They had 2000 msec to indicate the accuracy of their performance. In case no button was pressed, the next trial began, and the preceding trial was excluded from the analysis (as were trials in which more than one button press occurred as a response to the rating prompt).
In addition, to increase the likelihood of an erroneous saccade on 50% of the trials, “precues” were displayed 50 msec before stimulus onset; these consisted of the dashed outlines of the squares becoming solid for a period of 50 msec. Fifty percent of those precues were “incongruent” (i.e., the precue appeared at the square that would subsequently contain the imperative stimulus), and 50% were “congruent.” The total trial length varied between 5850 and 7050 msec. The experiment comprised 424 trials, evenly split over four experimental blocks (Figure 1).
EEG activity was recorded with Ag/AgCl sintered electrodes mounted in an elastic cap (Easycap, Herrsching, Germany) from 60 scalp sites of the extended 10–20 system. The ground electrode was positioned at F2. The vertical EOG was recorded from electrodes located above and below the left eye. The horizontal EOG was collected from electrodes positioned at the outer canthus of each eye. Data were on-line referenced to electrode CPz and re-referenced off-line by subtracting the average of all electrodes from each individual electrode signal.
Electrocardiographic data were collected using an additional electrode attached to the lower back of the subject, which was also referenced against CPz. All electrode impedances were kept below 5 kΩ.
EEG, ECG, and EOG were recorded continuously and were A–D converted with a 16-bit resolution at a sampling rate of 500 Hz using BrainAmp MR plus amplifiers (Brain Products, Gilching, Germany).
For pupil tracking, a monocular SMI iView X camera-based system (Sensomotoric Instruments, Berlin, Germany) with a temporal resolution of 1250 Hz and a spatial resolution of approximately 0.25–0.5° was used. Saccadic event information and continuous pupil diameter information were exported into ASCII format for further processing in Matlab. In case of an eye tracker malfunction, saccadic event information was extracted from the horizontal EOG data so that the EEG and ECG data could still be analyzed.
After preparation of the electrophysiological instruments, subjects were briefed about the task and then signed informed consent. Subjects sat in front of the eye tracker (placed 0.8 m away from the computer screen), and their head position was adjusted in order for the eye tracker to function properly while maintaining maximal seating comfort. The whole experimental setup was situated in an electrically shielded chamber. The light in the chamber was turned off after subject preparation. After calibration of the eye tracker, the subjects performed a test phase of 15 AST trials to get accustomed to the task. After the experiment, the subjects were debriefed and paid for their participation.
All data analyses, unless declared otherwise, were done using custom routines under Matlab 2007a (The MathWorks, Natick, MA) and the EEGlab 6.01b toolbox (Delorme & Makeig, 2004).
Saccadic onsets and offsets were extracted from both the eye tracker and horizontal EOG using a procedure similar to Marple-Horvat, Gilbey, and Hollands (1996), which includes computation of a derivative of the continuous signal and subsequent peak value detection from that derivative using a moving window rationale. Response markers were set to the onset of the first saccade following the stimulus that exceeded a threshold of 1.2° of visual angle. All saccades were visually checked for accurate classification by the automatic algorithm (misses and false alarms of the algorithm combined for less than 5% of all trials in all subjects and were discarded from further analysis).
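The general logic of such a derivative-based detection can be illustrated with a minimal sketch. It is not the routine used in this study: the function names, the velocity threshold of 30°/sec, and the five-sample smoothing window are hypothetical choices made for illustration only; the 1.2° amplitude criterion is the one stated above.

```python
def moving_average(values, window):
    """Smooth a sequence with a centered moving window (edges use fewer samples)."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def detect_saccades(position_deg, sample_rate_hz, velocity_threshold=30.0,
                    min_amplitude_deg=1.2, window=5):
    """Return (onset_index, offset_index, amplitude_deg) for each detected saccade.

    velocity_threshold and window are illustrative assumptions, not values
    taken from the study.
    """
    # First derivative of gaze position, in degrees per second.
    velocity = [(b - a) * sample_rate_hz
                for a, b in zip(position_deg, position_deg[1:])]
    speed = moving_average([abs(v) for v in velocity], window)

    saccades, start = [], None
    for i, s in enumerate(speed + [0.0]):  # trailing 0 closes an open stretch
        if s > velocity_threshold and start is None:
            start = i                      # speed rises above threshold: onset
        elif s <= velocity_threshold and start is not None:
            amplitude = abs(position_deg[i] - position_deg[start])
            if amplitude > min_amplitude_deg:
                saccades.append((start, i, amplitude))
            start = None                   # speed back below threshold: offset
    return saccades
```

In this sketch, the response marker of a trial would correspond to the onset index of the first returned saccade after stimulus onset.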
Saccades taking place in the first 80 msec poststimulus were treated as anticipatory (Fischer et al., 1993; Wenban-Smith & Findlay, 1991); such trials were dismissed from the data. Trials without saccades were treated as misses and were also excluded from further analysis.
On erroneous trials, corrective saccades were marked in cases where a saccade in the correct direction was made in the period between the response and the onset of the rating screen.
The recorded EEG data were filtered via a linear phase FIR filter to a frequency bandwidth of 0.5–50 Hz. The data were re-referenced to common average and sliced into stimulus-locked epochs of −100 to 1900 msec around the stimulus onset. The data epochs were checked for gross movement and EMG-related artifacts by means of visual inspection; epochs containing such artifacts were dismissed from the data set. The remaining epochs were decomposed into independent components using a temporal infomax independent component analysis algorithm implemented in EEGlab (Delorme, Sejnowski, & Makeig, 2007), preceded by a principal component analysis (PCA) that reduced the data dimensionality to 40 dimensions. The resulting independent components' topographies and time courses for each data set were visually inspected for components representing eye blinks, horizontal eye movements, and electrode artifacts. Such components were removed from the data before inverse matrix multiplication. The resulting artifact-corrected data were re-epoched relative to the response saccade (i.e., the first saccade after stimulus onset), covering a period from −100 to 700 msec. A baseline subtraction was performed with the time range of 100 msec preresponse to response onset as baseline period.
After computing the ERPs (in this case, ERN and Pe) through standard averaging procedures, ERN was defined via a trough-to-peak measurement, quantifying the ERN as the difference between the most negative voltage deflection at the electrode site FCz in the first 150-msec postresponse and the most positive voltage deflection between the response and the aforementioned negative peak. Pe was defined as the mean voltage amplitude in the time range spanning from 200 to 500 msec after the response.
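The two amplitude measures described above can be sketched as follows. This is an illustrative reading of the definitions, not the original analysis code: the function names are hypothetical, and the sign convention (trough minus preceding peak, so that a larger ERN yields a more negative value) is an assumption.

```python
def ern_trough_to_peak(erp_uv, times_ms, window_end_ms=150):
    """Trough-to-peak ERN at one electrode (e.g., FCz) of a response-locked ERP.

    Returns the most negative value in 0..window_end_ms minus the most
    positive value between the response (0 msec) and that trough.
    """
    window = [(t, v) for t, v in zip(times_ms, erp_uv)
              if 0 <= t <= window_end_ms]
    trough_time, trough_value = min(window, key=lambda tv: tv[1])
    # Most positive deflection between the response and the negative peak.
    peak_value = max(v for t, v in window if t <= trough_time)
    return trough_value - peak_value


def pe_mean_amplitude(erp_uv, times_ms, start_ms=200, end_ms=500):
    """Mean voltage in the Pe window (200 to 500 msec after the response)."""
    selected = [v for t, v in zip(times_ms, erp_uv) if start_ms <= t <= end_ms]
    return sum(selected) / len(selected)
```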
The ECG data were digitized at a sampling rate of 500 Hz. Afterwards, the R peaks of the ECG's QRS complex were identified using an algorithm provided as part of the FMRIB plug-in for EEGlab (provided by the University of Oxford Centre for Functional MRI of the Brain; Niazy, Beckmann, Iannetti, Brady, & Smith, 2005). The length of the interbeat interval was calculated as the difference between two successive R peaks. Heart rate values for the time in the interbeat interval itself were obtained by means of linear interpolation. The data were checked for artifacts via visual inspection and subsequently epoched into segments spanning from 500 msec preresponse to 5500 msec postresponse, with the mean amplitude of the 500 msec preresponse period serving as baseline. The epochs were then averaged separately for perceived and unperceived errors. Heart rate deceleration was quantified by extracting mean amplitudes of 0–500 msec, 500–1500 msec, 1500–2500 msec, and so on.
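A minimal sketch of this heart rate quantification is given below, assuming that R-peak latencies (in msec) are already available. The function names are hypothetical, and anchoring each interbeat interval's rate at the later of its two R peaks is an assumption of this sketch, as the text does not specify the anchoring.

```python
def heart_rate_series(r_peak_times_ms, step_ms=2):
    """Return (times_ms, bpm) on a regular grid, interpolated between beats."""
    if len(r_peak_times_ms) < 3:
        raise ValueError("need at least three R peaks")
    # One rate per interbeat interval, anchored at the later R peak (assumption).
    anchors = r_peak_times_ms[1:]
    rates = [60000.0 / (b - a)
             for a, b in zip(r_peak_times_ms, r_peak_times_ms[1:])]
    times, bpm, k = [], [], 0
    t = anchors[0]
    while t <= anchors[-1]:
        while anchors[k + 1] < t:          # advance to the enclosing interval
            k += 1
        frac = (t - anchors[k]) / (anchors[k + 1] - anchors[k])
        times.append(t)
        bpm.append(rates[k] + frac * (rates[k + 1] - rates[k]))
        t += step_ms
    return times, bpm


def window_mean(times, values, start_ms, end_ms):
    """Mean of all values whose time stamp lies in [start_ms, end_ms)."""
    selected = [v for t, v in zip(times, values) if start_ms <= t < end_ms]
    return sum(selected) / len(selected)


def deceleration_bins(times, bpm, event_ms,
                      bins=((0, 500), (500, 1500), (1500, 2500))):
    """Baseline-corrected mean heart rate per latency bin around one response."""
    baseline = window_mean(times, bpm, event_ms - 500, event_ms)
    return [window_mean(times, bpm, event_ms + a, event_ms + b) - baseline
            for a, b in bins]
```

Negative values in the returned list indicate heart rate deceleration relative to the 500 msec preresponse baseline.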
After importing into Matlab, vertical pupil diameter data were checked for artifacts stemming mainly from eye blinks, as the eye tracker temporarily loses visual contact with the pupil when the eyelid closes. Such eye-blink-related artifacts were corrected by means of linear interpolation, because dismissing contaminated epochs in no case left enough data to warrant reliable analyses of all trial types (details on the interpolation procedure can be found in the Supplementary Data).
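The exact interpolation procedure is described in the Supplementary Data; the sketch below only illustrates the general idea of bridging blink gaps linearly. It assumes that blink samples have already been flagged as missing (here: `None`) and that gaps at the edges of the recording are filled with the nearest valid value, both of which are assumptions of this example.

```python
def interpolate_blinks(pupil):
    """Fill missing pupil samples (None) by linear interpolation.

    Gaps at the start or end are filled with the nearest valid sample
    (an assumption of this sketch).
    """
    out = list(pupil)
    valid = [i for i, v in enumerate(out) if v is not None]
    if not valid:
        raise ValueError("no valid pupil samples")
    for i in range(valid[0]):                      # leading gap
        out[i] = out[valid[0]]
    for i in range(valid[-1] + 1, len(out)):       # trailing gap
        out[i] = out[valid[-1]]
    for left, right in zip(valid, valid[1:]):      # gaps between valid samples
        span = right - left
        for i in range(left + 1, right):
            out[i] = out[left] + (out[right] - out[left]) * (i - left) / span
    return out
```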
The continuous pupil diameter data were subsequently down-sampled to 500 Hz and epoched into 7000 msec long segments, ranging from 1000 msec preresponse to 6000 msec postresponse.
After applying baseline subtraction (with the 200 msec of stimulus presentation serving as baseline), the data were averaged separately for correct trials, perceived and unperceived errors.
Because of the nature of AST, measurements of pupil dilation will almost automatically be influenced by the pupillary light reaction: Because the imperative stimulus consists of a white circular expanse, the luminance of the visual field changes once the stimulus sets in. Additionally (and more severely), the type of the trial might systematically influence the size of the pupillary light reflex, because on error trials, the subjects initially move their gaze toward the light stimulus, whereas on correct trials, this is not the case. To counteract those effects and still be able to compare between trials, we performed a trial-and-sample-wise residualization of the pupil data, separately for erroneous trials (saccade toward the light stimulus) and correct trials (saccade away from the light stimulus). To do so, for each subject, we extracted two vectors of n (n = number of trials) values for single-trial saccadic RTs and saccade size. Also, we extracted 3500 vectors of n pupil diameter values (one for each of the 3500 samples of every epoch). Subsequently, we regressed each data vector onto saccadic RT, saccade size, their interaction, and their respective second-order polynomials (to account for nonlinear associations) for the whole epoch. After this regression (which accounts for most of the pupillary light reaction, as can be seen in Figure 5B), we retained only the residuals of each data vector and continued calculations with these data.
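The residualization can be sketched as an ordinary least-squares regression that is run once per sample of the epoch. The code below is a minimal illustration, not the original Matlab routine: all function names are hypothetical, the inclusion of an intercept and the z-scoring of the predictors (for numerical stability) are assumptions, and the normal equations are solved with plain Gaussian elimination to keep the example self-contained.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def zscore(values):
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / sd for v in values]


def design_matrix(rt, size):
    """Intercept, RT, saccade size, their interaction, and squared terms."""
    rt_z, size_z = zscore(rt), zscore(size)
    return [[1.0, r, s, r * s, r * r, s * s] for r, s in zip(rt_z, size_z)]


def residualize(y, x):
    """Residuals of y after least-squares regression on the columns of x."""
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, row))
            for row, yi in zip(x, y)]


def residualize_epochs(epochs, rt, size):
    """epochs: n trials x samples; returns residuals in the same layout."""
    x = design_matrix(rt, size)
    n_samples = len(epochs[0])
    # One regression per sample: the data vector holds that sample of all trials.
    columns = [residualize([trial[s] for trial in epochs], x)
               for s in range(n_samples)]
    return [[columns[s][t] for s in range(n_samples)]
            for t in range(len(epochs))]
```

In keeping with the description above, such a function would be called separately for the erroneous and the correct trials of each subject.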
Repeated measures analyses of variance with the factor trial type (correct trials, perceived errors, and unperceived errors) were used to test for global effects, unless otherwise specified. In case of a violation of the sphericity assumption (with ɛ > .7), the Greenhouse–Geisser correction was employed (uncorrected degrees of freedom are reported for reasons of clarity). In case of more severe violations (ɛ < .7), we used the Pillai–Bartlett trace for testing of significance (Mendoza, Toothaker, & Nicewander, 1974). These cases have been marked with an asterisk. All contrasts are corrected for cumulating type I error probabilities by means of the Bonferroni–Holm procedure. η2 denotes the partial eta squared coefficient as a measurement of effect size.
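For readers unfamiliar with these two quantities, the following sketch shows the Bonferroni–Holm step-down adjustment and the computation of partial eta squared from an F value and its degrees of freedom. The function names are hypothetical and the code is an illustration rather than the routine used for the reported analyses.

```python
def holm_adjust(p_values):
    """Bonferroni-Holm adjusted p values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)   # enforce monotonicity
        adjusted[i] = running_max
    return adjusted


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from F and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)
```

As a plausibility check, `partial_eta_squared(54.17, 2, 32)` yields approximately .77, which corresponds to the effect size reported for saccadic RT in Experiment 1.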
An rmANOVA with factor Trial Type revealed a significant main effect on saccadic RT (F(2, 32) = 54.17, p < .001, η2 = .77). Correct saccades (averaged subject medians = 293 msec, SD = 30 msec) were significantly slower than both perceived (averaged subject medians = 192 msec, SD = 42 msec) and unperceived errors (averaged subject medians = 186 msec, SD = 33 msec). Perceived and unperceived errors did not differ with respect to RT (p = .7).
Subjects displayed an average error rate of 22.5% (perceived errors: 14%, SD = 8.4%; unperceived errors: 8.5%, SD = 8.2%). Error rate was different neither for Error Type (F(1, 16) = 2.991, p > .1) nor for the interaction of Error Type and Type of Precue (F(2, 15)* = 1.467, p = .262). However, the type of precue had a significant effect on error rate (F(2, 15)* = 11.973, p < .01, η2 = .615). Whereas the error rates for no precue and congruent precue were identical (p = 1), the error rate was significantly increased in the incongruent condition (p < .001 for both comparisons).
Similar to Klein et al. (2007), we did not find significant post-error slowing on a group level, which we attribute to the very long trial durations that left enough time for remedial actions before the next trial starts (Jentzsch & Dudschig, 2009). Still, when compared directly, post-error RTs were slower for perceived than for unperceived errors (t(16) = 2.97, p < .01, d = .42; averaged subject medians for corrects = 279 msec, SD = 30 msec; averaged subject medians for perceived errors = 284 msec, SD = 33 msec; averaged subject medians for unperceived errors = 269 msec, SD = 35 msec).
With respect to saccade size, there was a significant main effect of Error Type (F(2, 32) = 47.33, p < .001, η2 = .75). Saccade sizes for unperceived errors were smaller than both saccades for perceived errors (p < .001) and for correct trials (p < .001). Correct trials and perceived errors did not differ with respect to saccade size (p > .9).
Error correction rate tended to be higher for unperceived errors (88.4%) than for perceived errors (81.1%, t(16) = 1.96, p = .068). Error correction latencies (time from offset of the erroneous response saccade to the onset of the following corrective saccade) also differed largely between the two conditions: Unperceived errors were corrected approximately after 160 msec (median), whereas perceived errors took longer to be corrected (386 msec, t(16) = 6.14, p < .0001).
Autonomic Nervous System
We observed the expected heart rate deceleration effect after errors (main effect of factor TRIALTYPE in the latency range from 500 to 1500 msec: F(2, 32) = 4.4, p < .05, η2 = .22; contrast corrects vs. perceived errors: t(16) = 3.12, p < .05). Importantly, this effect is absent in unperceived errors. In fact, the heart rate following unperceived errors is even less decelerated than following correct trials. This does not reach statistical significance, though (contrast corrects vs. unperceived errors: t(16) = 1.27, p = .052), yet parallels the SCR results of O'Connell et al. (2007).
As phasic heart rate changes are not a uniphasic phenomenon, this effect is also present on the subsequent heart rate reacceleration (latency range from 4500 to 5500 msec poststimulus, not shown in Figure 4: F(2, 32) = 4.92, p = .01, η2 = .235).
Because we could not make qualified a priori assumptions as to the exact latency range of the pupil diameter effects, we computed random permutation tests (perceived vs. unperceived errors) for the whole epoch by means of a Monte Carlo simulation with 10,000 iterations using custom Matlab routines. The extracted p values were subsequently corrected by means of a false discovery rate correction for multiple comparisons (α = .05, one-sided). Stretches of data where the difference between perceived and unperceived errors was significant are labeled with black markers on the x axis of Figure 5C. The pupil diameter was significantly wider for perceived than for unperceived errors in time ranges 200–300 msec, 1640–1740 msec, 1820–1910 msec, 2710–2800 msec, and after 2930 msec relative to stimulus onset.
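The logic of this analysis can be sketched as follows. The example is a simplified illustration and not the custom Matlab routine: it assumes a within-subject sign-flip permutation scheme applied to the per-subject difference (perceived minus unperceived) at a single sample, and it uses the Benjamini–Hochberg step-up rule as the false discovery rate procedure; neither detail is specified in the text above, and all function names are hypothetical.

```python
import random


def sign_flip_p_value(differences, n_iterations=10000, rng=None):
    """One-sided Monte Carlo permutation p value for mean(differences) > 0.

    differences: one value per subject (perceived minus unperceived errors)
    at a single sample of the epoch.
    """
    rng = rng or random.Random(0)
    n = len(differences)
    observed = sum(differences) / n
    count = 0
    for _ in range(n_iterations):
        # Randomly swap condition labels within subjects (sign flip).
        permuted = sum(d if rng.random() < 0.5 else -d
                       for d in differences) / n
        if permuted >= observed:
            count += 1
    return (count + 1) / (n_iterations + 1)


def fdr_significant(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up rule; returns one boolean per p value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank                  # largest rank meeting the criterion
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        significant[i] = rank <= cutoff
    return significant
```

Applied to the pupil data, the first function would be evaluated once per sample of the epoch, and the resulting vector of p values would be passed to the second function to obtain the significant stretches.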
Regarding the Pe, we observed the same effects reported in the aforementioned studies focusing on ERPs while assessing awareness of error processing. The Pe is enhanced for errors compared with corrects, indicated by a significant main effect of the factor TRIALTYPE (F(2, 32) = 9.92, p < .001, η2 = .42). Importantly, this enhancement is more pronounced for perceived errors as opposed to unperceived errors (t(32) = 2.14, p < .05), confirming the results previously reported for Pe.
Unexpectedly, ERN shares similar properties: It is largest for perceived errors (factor TRIALTYPE: F(2, 32) = 3.23, p = .05, η2 = .188), yet it is severely diminished for unperceived errors. A contrast between ERN to perceived errors and the negativity following correct trials (correct-related negativity, CRN) reveals a significantly larger ERN (t(32) = 2.0, p < .05). For the unperceived errors, this contrast is nonsignificant (p > .2), meaning the ERN after unperceived errors did not differ from the CRN. A contrast of ERN to perceived errors versus unperceived errors reveals that those ERPs differ significantly (t(32) = 2.2; p < .05), indicating that, in our experiment, the ERN is sensitive to error awareness.
Regarding the correlation of ERP amplitudes and speed of the accuracy rating, no associations were found in this experiment.
The results regarding the ANS will be discussed in the general discussion following Experiment 2. As for the CNS potentials, we found results that partially contradict the findings first reported by Nieuwenhuis et al. (2001) and replicated by Endrass et al. (2007). Whereas we find a significantly enlarged Pe for perceived as compared with unperceived errors and correct trials, which is in line with the previous findings, we also find a similar modulation of the ERN due to error awareness. As can be seen in Figure 6 (and also in Figure 7, electrode Cz), the ERN is significantly smaller for unperceived than for perceived errors. Because this was an unexpected outcome, judging from the previously published interpretations of the differential roles of ERN and Pe in (un-)conscious error processing in AST, we tried to shed further light on this result.
To do so, we conducted a second experiment, in which we again employed the AST but, this time, designed the trial timing and stimulus layout to match the original study from Nieuwenhuis et al. (2001). Because the first experiment was, with respect to the stimulus layout and trial timing, identical to an fMRI study published earlier by our group (Klein et al., 2007), the trial timing was considerably slower than in the previously reported EEG studies. Consequently, the biggest alteration from Experiment 1 to Experiment 2 is the faster trial timing, which is now between 2715 and 3915 msec instead of 5850–7050 msec in Experiment 1. This difference is mainly because of the shortened intertrial interval, yet there are also important alterations with regard to intratrial timings, namely stimulus display duration (117 msec instead of 200 msec) and response window (time between stimulus onset and rating screen: 1000 msec instead of 1500 msec).
Twenty neurologically and psychiatrically healthy subjects were recruited from the institute's database. Two subjects' data sets had to be removed from the sample because of an insufficient number of errors (fewer than 10 in any of the conditions). One subject had to be excluded because of an intolerable number of misses (no saccades present in either eye tracker or EOG data on more than one third of the trials), suggesting limited motivation to follow the task instructions. This left a data set of 17 right-handed subjects (14 women) with a mean age of 24.5 years (SD = 2.9 years, ranging from 20 to 33 years). All subjects had normal or corrected-to-normal vision. Participants gave written informed consent and received a payment of €10 per hour.
In this experiment, we adapted the stimulus configuration (trial timing and stimulus layout) of the study by Nieuwenhuis et al. (2001). Alterations to Experiment 1 were as follows: Regarding the trial timing, the initial fixation period was now between 1000 and 1400 msec long, followed by a 200-msec gap, which included an optional 50 msec precue with the same probabilities as in Experiment 1. The imperative stimulus was now displayed for 117 msec, followed by a 1000 msec response window and a rating screen that lasted between 400 and 1250 msec. As a consequence, trial durations now ranged from 2715 to 3915 msec, which was considerably shorter than in Experiment 1. The stimulus layout was changed from white dashed squares as target stimuli to yellow solid squares that subtended 3.4° of visual angle and whose centers were 19° apart from each other. The target stimulus was still a white circle, subtending 1.2° of visual angle; the central fixation dot subtended 0.6°. The optional precues now consisted of a brief thickening of the outlines of the squares.
The stimulus material was thus considerably larger than in Experiment 1. Because of the shorter trial timing, subjects were now presented with 800 stimuli, evenly spread out over eight experimental blocks.
The remaining differences between this paradigm and the study of Nieuwenhuis et al. (2001) concern the absence of vertical saccades in our version of the task (which were originally introduced by the authors to warrant more overall errors but omitted from further analyses), and the substitution of the white cross in the target box that prompted the subjects to signal (or not signal) their errors. Instead, we used the same rating screen as in Experiment 1, which prompted subjects to indicate whether they felt their saccade was either wrong or right (Figure 8).
Electrophysiology, Procedure, and Data Analysis
Preparation of EEG, behavioral data analysis, task procedure, and subsequent data analysis were identical to Experiment 1.
An rmANOVA with the factor TRIALTYPE revealed a significant main effect on saccadic RT (F(2, 32) = 85.8, p < .001, η2 = .84). Correct saccades (averaged subject medians = 254 msec, SD = 36 msec) were significantly slower than both perceived (averaged subject medians = 149 msec, SD = 28 msec) and unperceived errors (averaged subject medians = 164 msec, SD = 32 msec). Perceived and unperceived errors did not differ with respect to RT (p = .092).
The subjects displayed an average error rate of 16.7% (perceived errors = 11.4%, SD = 7.9%; unperceived errors = 5.3%, SD = 4.4%). In this version of the experiment, error rates did differ with respect to awareness, with subjects making more perceived than unperceived errors (F(1, 16) = 5.25, p < .05, η2 = .236). Also, contrary to Experiment 1, there was a significant Precue Type (incompatible, compatible, or none) × Error Type (perceived or unperceived) interaction (F(2, 32) = 7.27, p < .05, η2 = .3) with respect to error rate, revealing that more perceived errors were made on incompatible trials. Additionally, there was a significant influence of precue type (F(2, 32) = 74.4, p < .001, η2 = .814), with subjects making most errors on incompatible trials (p < .001 for both comparisons) while also making fewer errors on compatible trials compared with no-cue trials (p < .05).
As expected, the post-error slowing effects replicated the findings of Nieuwenhuis et al. (2001). There is significant post-error slowing (F(2, 32) = 7.26, p < .01, η2 = .31), yet only for trials following perceived errors (averaged subject medians = 258 msec, SD = 41 msec; averaged subject medians for postcorrect trials = 241 msec, SD = 49 msec). Trials following unperceived errors (averaged subject medians = 238 msec, SD = 52 msec) do not differ in RT from trials following corrects (p = .75; averaged subject medians for corrects = 240 msec, SD = 40 msec).
With respect to saccade size, there was a significant main effect of Error Type (F(2, 32) = 7.31, p < .01, η2 = .3). Saccade sizes for unperceived errors were smaller than both saccades for perceived errors (p = .017) and for correct trials (p = .024). Saccade sizes of correct trials and perceived errors did not differ (p > .9).
Error correction rate was significantly higher for unperceived errors (86.7%) than for perceived errors (70.5%, t(16) = 4.12, p < .001). Error correction latencies (time from offset of the erroneous response saccade to the onset of the following corrective saccade) also differed markedly between the two conditions: Unperceived errors were corrected after approximately 169 msec (median), whereas perceived errors took longer to be corrected (294 msec, t(16) = 4.8, p < .001).
As expected, the effects of Pe initially reported by Nieuwenhuis et al. (2001) were again observed in this experiment (see Experiment 1, global rmANOVA: F(2, 32) = 7.53, p < .01, η2 = .32; perceived vs. unperceived errors: t(32) = 2.07, p < .05, d = .52).
Identical to Experiment 1, though, the ERN was also sensitive to error awareness (global rmANOVA: F(2, 15)* = 4.19, p < .05, η2 = .36; contrast perceived vs. unperceived errors: t(32) = 2.2, p < .05). Also noteworthy was the relatively big CRN in this data set (CRN = −1.25 μV; perceived_ERN = −1.2 μV; unperceived_ERN = −0.5 μV; Figure 11).
Because of the significant mismatch between both error types with respect to both overall error rate and precue specific error rate (significant Precue Type × Error Type interaction; see Behavioral Results), we performed a matching procedure, randomly sampling identical amounts of trials, with respect to precue type, for all three types of responses, consequently eliminating possible effects of trial and precue number from the ERPs. This analysis did not change the results, except for the fact that the ERN for perceived errors was then numerically bigger than the CRN (see Supplementary Figure 1).
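The matching procedure described above can be sketched as follows. This is a minimal illustration and not the authors' actual analysis code: the trial representation (dictionaries with the hypothetical keys `response` and `precue`), the function name `match_trials`, and the toy trial counts are assumptions made for the example. Within each precue type, the smallest response-type cell sets the number of trials that is randomly drawn from every response type.

```python
import random
from collections import defaultdict


def match_trials(trials, seed=0):
    """Randomly subsample trials so that, within every precue type,
    each response type contributes the same number of trials.

    `trials` is a list of dicts with the (hypothetical) keys
    'response' and 'precue'.
    """
    rng = random.Random(seed)

    # Group trial indices by (precue, response) cell.
    cells = defaultdict(list)
    for idx, trial in enumerate(trials):
        cells[(trial["precue"], trial["response"])].append(idx)

    responses = sorted({t["response"] for t in trials})
    precues = sorted({t["precue"] for t in trials})

    kept = []
    for precue in precues:
        # The smallest cell for this precue type sets the quota.
        quota = min(len(cells[(precue, r)]) for r in responses)
        for r in responses:
            kept.extend(rng.sample(cells[(precue, r)], quota))

    # Return the retained trials in their original order.
    return [trials[i] for i in sorted(kept)]


def make(n, response, precue):
    """Build n toy trials for one cell (illustrative data only)."""
    return [{"response": response, "precue": precue} for _ in range(n)]


example = (
    make(6, "correct", "incompatible")
    + make(4, "perceived", "incompatible")
    + make(2, "unperceived", "incompatible")
    + make(5, "correct", "none")
    + make(3, "perceived", "none")
    + make(3, "unperceived", "none")
)
matched = match_trials(example)
```

In this toy data set, the incompatible cells are reduced to two trials each and the no-cue cells to three trials each, so that differences in trial and precue counts can no longer contribute to differences between the averaged ERPs.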
In addition to these results, we investigated the relationship between the size of the ERN and the time that the subjects take to rate the accuracy of their behavior; that is, the time between the onset of the rating screen and the depression of the button indicating whether the subjects felt they made an error or not (rating time). The assumption here was that the longer the subjects take to rate their behavior, the higher their uncertainty regarding the decision.
Two subjects had to be excluded from this analysis because they pressed the rating button during the response window and did not wait for the onset of the rating screen.
We observed a highly negative subject level correlation of the ERN and individual uncertainty as indicated by rating RTs. The amplitude of the ERN for unperceived errors correlates with the rating time after unperceived errors (r = −.57, p < .05, Pearson's correlation), whereas the amplitude of the ERN for perceived errors correlates with the rating time after perceived errors (r = −.63, p < .05, Pearson's correlation). A scatter plot of these correlations can be found in Figure 12.
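For illustration, a subject-level Pearson correlation of this kind can be computed as in the sketch below. The function name `pearson_r` and the input values are hypothetical and do not reproduce the data of this study. Whether the ERN enters as a signed voltage or as an absolute magnitude determines the sign of the coefficient, and that choice is an assumption the analyst has to make explicit.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-subject values: ERN magnitude (in microvolts) and
# median rating time (in msec). Illustrative numbers only.
ern_magnitude = [0.4, 0.9, 1.3, 1.8, 2.2]
rating_time = [910, 840, 800, 690, 650]
r = pearson_r(ern_magnitude, rating_time)
```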
To further investigate whether this relationship holds true at the intrasubject level and to connect it explicitly to subjective error awareness, we performed a median split of each individual subject's trials on the basis of the rating time for every trial type. The ERP waveforms can be seen in Figure 13 and revealed a marginally significant main effect of rating speed (F(1, 14)* = 3.5, p = .08, η2 = .21). This main effect is almost solely driven by the difference between the slow and fast rating trials for perceived errors (see Figure 13; t(14) = 2.65, p < .01, d = 1.16 for this contrast), whereas the ERN to unperceived errors does not differ with respect to rating time on an intrasubject level (t(14) = 0.45, p = .66).
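The median split can be sketched as follows for a single subject and a single trial type. This is a simplified illustration under stated assumptions: it operates on hypothetical single-trial amplitude values, whereas an ERP analysis would average the waveforms of each half before measuring the ERN. The function name `median_split` and the example numbers are not taken from the study.

```python
from statistics import mean, median


def median_split(trials):
    """Split one subject's trials of one type at the median rating time.

    `trials` is a list of (rating_time_msec, amplitude_uv) pairs.
    Trials at or below the median count as fast-rated; the rest are
    slow-rated. Returns the mean amplitude of each half as
    (fast_mean, slow_mean).
    """
    cutoff = median(rt for rt, _ in trials)
    fast = [amp for rt, amp in trials if rt <= cutoff]
    slow = [amp for rt, amp in trials if rt > cutoff]
    if not slow:
        # Happens when all rating times are identical.
        raise ValueError("median split produced an empty slow half")
    return mean(fast), mean(slow)


# Hypothetical perceived-error trials of one subject.
perceived = [(400, -2.0), (450, -1.5), (700, -0.5), (900, -1.0)]
fast_mean, slow_mean = median_split(perceived)
```

Splitting within each subject and trial type keeps between-subject differences in overall rating speed from contaminating the fast versus slow comparison.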
As in Experiment 1, we did not only find enhanced Pe potentials for perceived errors compared with correct trials and unperceived errors, but we observed the same result regarding the ERN amplitude's sensitivity to subjective error awareness as we did in Experiment 1.
In addition, in this experiment we found a strong (negative) correlation between the size of the ERN and the time the subjects take to assess their own behavior (indicated by their RT toward the rating screen). This measurement is an indicator of the participants' subjective certainty of their behavioral assessment and is, as shown in Figure 13, strongly dependent on the size of the preceding ERN: The bigger the ERN, the faster the detection of an erroneous response. A reason why we did not find this relation in Experiment 1 suggests itself when looking at the trial timing: Whereas in this version of the task subjects had only a fraction of a second before the rating screen to assess their behavior before the rating is demanded by the paradigm (1000 msec response window minus the actual RT), they had 0.5 sec more to come up with a preassessment (which adds up to more than a second of time between the response and the rating screen, on average) in Experiment 1. This explanation is backed up by recent findings, suggesting a decisive role of the length of the RSI for error-related trial-by-trial adjustments (Jentzsch & Dudschig, 2009). This, in combination with the overall slower trial timing, which could lead subjects to process the task in a more unconstrained manner than in the relatively fast second version (as also indicated by the faster average correct trial RT medians in Experiment 2 (t = 3.44, p < .01, d = 1.21)), could lead to an attenuation of these effects in Experiment 1.
We employed a previously established AST to investigate autonomic (ECG, pupil diameter) and ERP (ERN/Pe) reactions toward perceived and unperceived errors. In doing so, we found the response-locked Pe and ERN potentials to be sensitive to subjective error perception. We also showed that the established heart rate deceleration effect following erroneous actions is enlarged for perceived in comparison with unperceived errors. Furthermore, we showed that the previously demonstrated error-related pupil dilation is also significantly enlarged for perceived in comparison with unperceived errors.
Heart rate deceleration after errors has been reported as early as 1971 (Danev & de Winter, 1971) and has consistently been found in the context of performance monitoring (van der Veen, van der Molen, Crone, & Jennings, 2004; Crone et al., 2003; Hajcak et al., 2003; Somsen, Van der Molen, Jennings, & van Beek, 2000). Danev and de Winter already offered an interpretation of the post-error heart rate deceleration effect as a corollary of the orienting response (OR; Sokolov, 1960). The OR is a reflex-like reaction of the organism to improbable changes in its environment, especially those which are potentially motivationally relevant. It is accompanied by a cascade of central and ANS reactions, including heart rate, pupil diameter, and skin conductance changes associated with increased arousal. This interpretation along the OR has not explicitly been taken into consideration in later studies of post-error heart rate deceleration, yet accounts of linking error processing and post-error processes to the OR, in general, are just now being picked up on again (Notebaert et al., 2009). Interestingly, a lot of current discussion concerns the phenomenological similarities between the Pe and the P3a/P3b complex (e.g., Overbeek et al., 2005) and possible functional equivalencies between those potentials. Because parts of the P3a/P3b complex have been hypothesized as a possible ERP correlate of the OR (e.g., Ritter, Vaughan, & Costa, 1968), an interpretation of the heart rate deceleration effect along this vein is very tempting. Indeed, as both pupil size and SCR are also said to mirror the OR, the parallels between the heart rate and pupil diameter results reported here, and the SCR results reported by O'Connell et al. (2007), especially with regard to subjective error awareness, seem to support these interpretations. 
Perceived errors seem to provoke a stronger OR compared with unperceived errors, which then again poses the question of whether this is an effect of the subjective awareness or, in turn, whether the OR itself triggers processes that eventually lead to subjective error awareness. Most measurements of ANS activity, though, are too limited in their time resolution to allow for such causal judgments, although in our study, significant pupil diameter differences already evolve immediately after stimulus offset (see Figure 5C), that is, around (or even before) the time at which most of the responses occur (the effects reported by Critchley, Tang, et al., 2005, were in a similar latency range, immediately following stimulus presentation). Interestingly, in recent years, there have been theories that link pupil diameter (as an indirect index of locus coeruleus noradrenergic activity) directly to task performance, such as the adaptive gain theory (Gilzenrat, Nieuwenhuis, Jepma, & Cohen, 2010; Aston-Jones & Cohen, 2005). It would be interesting for future studies to find out how these effects might relate to emerging error awareness. Regarding the question of a possible causal order between ANS activity, OR, and error awareness, lesion studies, possibly involving IC and/or ACC, should be able to provide greater insights.
In contrast to the ANS findings, ERP measurements of cognitive control offer a quicker index of error-related physiological activity. In this domain, as mentioned before, our results partially contradict previous experiments using AST and ERPs (Endrass et al., 2007; Nieuwenhuis et al., 2001) as well as a previous fMRI study using AST (Klein et al., 2007). Whereas in these AST experiments, no significant ERN amplitude or pMFC activity differences between both error types were reported, we found them in both of our experiments. Also, we found that ERN amplitudes were clearly related to the time it took the subjects to rate their own behavior, which might be an indicator of the subjective certainty of their assessment. Differences in ERN amplitude seem to be explained by a subset of trials where subjects were rather certain that they had just committed an error. In those trials, they had a pronounced ERN and subsequently signaled their errors very quickly. Interestingly, there were no significant differences in ERN amplitude between slow- and fast-rated trials in unperceived errors. Neither were there differences between perceived error ERN amplitudes and unperceived error ERN amplitudes in the respective slow rating percentiles, indicating that in such high-uncertainty cases there might be other factors than the ERN that can “tip the scales” toward signaling or not signaling one's error.
The question remains why studies using similar or identical experimental AST layouts do not find a significant modulation of the ERN/pMFC by subjective error awareness. There are several possible reasons for this.
The finding of ERN differences between perceived and unperceived errors, although surprising in the context of the AST, is not completely unexpected when considering other paradigms. Whereas in Go-NoGo experiments, null effects with respect to ERN amplitude/pMFC activity have been reported (O'Connell et al., 2007; Hester et al., 2005), there have been studies using different experimental designs that did show significant differences in the ERN amplitudes depending on subjective error perception. In a flanker task, Maier, Steinhauser, and Hubner (2008) showed a severely diminished ERN for unperceived errors, in line with the results obtained in our experiment. Also, already before the first error awareness study using AST, there had been a study by Scheffers and Coles (2000), demonstrating a significant ERN amplitude effect with respect to subjectively perceived accuracy, also finding smaller ERNs for nonperceived errors. This has again been found in a recent experiment using a perceptual discrimination task (Steinhauser & Yeung, 2010). These findings led us to propose an accumulating evidence account of conscious error perception (Ullsperger, Harsay, Wessel, & Ridderinkhof, 2010). When insufficient information about an error accumulates, it is unlikely to be consciously perceived. In manual tasks described above, it appears that insufficient information about the intended action outcome is gathered, either because of reduced stimulus perception (Maier et al., 2008; Scheffers & Coles, 2000) or because of insufficient activation of complex task rules (Hester et al., 2005). In contrast, in rapidly corrected errors in the AST, the information about the erroneous eye movement may be insufficient and, thus, hamper conscious error perception. 
Whereas in manual motor tasks evidence about the executed response is gathered from the efference copy, proprioceptive, somatosensory, visual (seeing the finger move), and auditory (hearing the click sound) feedback, less such evidence is available for eye movement. Because of saccadic visual suppression, usually no visual percept is generated during eye movement. Thus, the visual percept after a rapidly corrected saccadic error indicates that the correct target has been reached, whereas the short erroneous saccade likely remains unnoticed. Thus, shorter prosaccades followed by rapid corrective antisaccades provide less error evidence and should be less often detected. This view was supported by our finding that, compared with unperceived errors, consciously perceived errors were reliably bigger in saccade size (which equaled the size of correct antisaccades) and associated with less and slower corrections. In addition, according to the conflict monitoring theory as well as the mismatch model, the larger erroneous prosaccade should result in a larger ERN amplitude for perceived errors (Yeung, Cohen, & Botvinick, 2004; Botvinick, Braver, Barch, Carter, & Cohen, 2001).
However, to our knowledge, the present study is the first demonstration of such effects in the AST. A comparison of our study with the studies finding null effects reveals several possible explanations for this apparent dissociation.
With the exception of the experiment of Nieuwenhuis et al. (2001), error-awareness-related AST studies (Endrass et al., 2007; Klein et al., 2007) generally showed numerically larger ERN amplitudes/pMFC activity for perceived compared with unperceived errors. The fact that in the original experiment of Nieuwenhuis et al. no such difference was found (the ERN amplitudes to unperceived errors were even numerically higher than the ERN amplitudes to perceived errors) possibly stems from a decisive difference in experimental design. Whereas in the current study (and both Klein et al.'s and Endrass et al.'s studies), subjects were prompted to push a button both when they thought they were correct and when they thought they had made an erroneous saccade, in Nieuwenhuis et al.'s original study, subjects were prompted to only indicate their errors via a button press. In effect, in their experiment, trials in which subjects were very unsure about the accuracy of their performance, that is, where they might hesitate when having to signal (or not signal) an error, are prone to be classified as unperceived errors (if there indeed was an erroneous saccade), because (a) the next trial might start and the rating period would be over and (b) pressing a button is more effortful than not pressing a button, introducing a possible response bias toward not signaling an error.
Because both Klein et al.'s and Endrass et al.'s studies, using the same rating rationale as the one used here, did find numerically different pMFC/ERN activity in the same direction as found in our study, one main reason for those null findings might relate to statistical power. Almost all previously reported studies (this one included) have rather small sample sizes, increasing the type II error probability. This is especially true given the relatively small amplitudes of the ERN in those paradigms (compared with other paradigms; the same holds true for the Go/NoGo studies). Also, if one reviews the fMRI results reported by our group earlier (Klein et al., 2007), the hemodynamic response function obtained from the pMFC is considerably bigger for perceived than unperceived errors by visual judgment (Figure 3A in Klein et al.); this, however, fails to reach statistical significance in the 13-subject sample presented there (p = .211). Thus, similar tasks need to be conducted with bigger samples in future studies to warrant sufficient power of statistical testing.
Adding to that, in the results presented here, we find an unusual number of subjects who do not yield a visible ERN on a single-subject basis (three subjects in Experiment 1 and four subjects in Experiment 2). This might have to do with the task itself, as this task is very different from other performance monitoring paradigms, especially with regard to the response domain. Whereas subjects are very well used to monitoring their hand and finger movements (e.g., button presses) in real life and psychological experiments, actively monitoring eye movements, with the eyes being the main motor effector here, is an unusual situation for most participants. Those non-ERN subjects in our analysis significantly decreased statistical power, yet we still included them in the analysis, because there was no behavioral justification for a removal. Unfortunately, there is currently no manual choice reaction time task available that yields a sufficient number of unperceived errors (there is, however, a Go/NoGo paradigm by Hester et al., 2005).
However, a different finding from the initial study of Nieuwenhuis et al. (2001) (and subsequent replications) could again be reported in our experiments, that is, the Pe being highly sensitive toward conscious error perception.
These results provide further support for the notion that accumulating evidence leads to the emergence of subjective error awareness (Ullsperger et al., 2010). Our results from the ERP analysis indicate that both ERN and Pe vary with subjective error perception and that the ANS reacts differentially toward consciously and not consciously perceived errors on a broad scale (both heart rate and pupil size). As discussed above, there was simply more perceptual and proprioceptive evidence for the cognitive system to detect an error on consciously perceived trials. As this information has to be encoded in the performance monitoring system at some point, we propose the ERN as the most likely correlate of the encoding of this rather “objective,” external error evidence. Evidence from sensory input and emerging response conflict (potentially coded in the ERN), along with several other sources of possibly available information, for example, the efference copy, proprioceptive cues, etc. (and maybe even early ANS modulations), can subsequently serve as hints for the cognitive system to evaluate its own performance in terms of accuracy, a process more likely reflected in the later Pe. A very interesting question, as stated before, would be whether the ANS activity changes resembling the OR are also part of the input to this evaluation process or rather part of its output, that is, an effect rather than a cause of emerging error awareness. However, our experiments do not allow any causal or chronological interpretation at this point, and it is also hard to disentangle the differential properties of ERN and Pe. Still, future experiments that might help in building an integrative framework of ERN, Pe, and ANS activity should not only focus on lesion experimentation but should also try to disentangle rather objective properties of the stimulus material/task context that influence error awareness, like error magnitude and error correction, from the subjective rating process.
This could, for instance, be done by means of introducing different forms of detection/response biases (as has been done in Steinhauser & Yeung, 2010).
Finally, factors that might influence error awareness can also be present already before an error, be it different neurochemical states as predicted by the adaptive gain theory (Aston-Jones & Cohen, 2005) or activity of the performance monitoring system itself. There is evidence that pMFC activity diminishes gradually before an error occurs (Eichele et al., 2008). Similar interpretations have been put forward for electrophysiological potentials such as the error preceding positivity (Ridderinkhof, Nieuwenhuis, & Bashore, 2003). It would be interesting to see whether these effects vary, depending on subsequently reported subjective error awareness.
Taken together, in the experiments presented here, we could not only demonstrate that both heart rate and pupil diameter are sensitive to subjective error awareness, but we also showed that the amplitude of the ERN, contrary to previous findings using AST experiments, covaries with subjective error awareness, as reflected both in the ERPs and in brain–behavior associations.
This work was supported by a grant from the Gertrud Reemtsma Foundation for Brain Research to J. R. W. We thank S. Doering for helping in data acquisition and Sander Nieuwenhuis and three anonymous reviewers for comments on an earlier version of this article.
Reprint requests should be sent to Jan R. Wessel, Max Planck Institute for Neurological Research, Cognitive Neurology, Gleueler Str. 50, 50931 Cologne, Germany, or via e-mail: email@example.com.