Abstract
Changes in error processing are observable in a range of anxiety-related disorders. Numerous studies, however, have reported contradictory and nonreplicating findings, thus the exact mapping of brain response to errors (i.e., error-related negativity [ERN]; error-related positivity [Pe]) onto specific anxiety symptoms remains unclear. In this study, we collected 16 self-reported scores of anxiety dimensions and obtained spatial features of EEG recordings from 171 individuals. We then used machine learning to (1) identify symptoms that are central to elevated ERN/Pe and (2) estimate the generalizability of traditional statistical approaches. ERN was associated with rumination, threat overestimation, and inhibitory intolerance of uncertainty. Pe was associated with rumination, prospective intolerance of uncertainty, and behavioral inhibition. Our findings emphasize that not only the amplitude of ERN but also other sources of brain signal variance encode information relevant to individual differences in error processing. The results of the generalizability check reveal the need for a change in result-validation methods to move toward robust findings that reflect stable individual differences and clinically useful biomarkers. Our study benefits from the use of machine learning to improve the generalizability of results.
INTRODUCTION
The primary goal of research in cognitive and affective neuroscience is to establish the relationship between the structure and activity of the brain and behavior. Although this brain–behavior association is the focus of most studies in cognitive neurophysiology, the reliability of reported results has recently come under criticism (Brederoo, Nieuwenstein, Cornelissen, & Lorist, 2019). Standard statistical analysis usually includes an examination of differences between either groups or conditions, and results are often described in terms of the mean, the standard deviation, and the confidence intervals for each group or condition (Calhoun, Lawrie, Mourao-Miranda, & Stephan, 2017). However, making inferences at the group level can lead to misleading conclusions. As suggested by Rouder and colleagues (2021) in their recent study, inferring from the mean makes sense only when all individuals in a population show an effect in the same direction. In such a case, an effect could be both explained and expressed as a function of the variables of interest; the same function can then be used to predict the effect for all individuals. The picture is more complicated when individuals in the population show effects in opposite directions or part of the population shows no effect. Such individual differences suggest that the underlying phenomenon is more complex than a group study would indicate; therefore, attempting to analyze and predict this phenomenon through group comparison or simple linear regression may yield misleading results.
Moreover, recent controversies about the level of replicability in behavioral research have attracted researchers' interest to the impact of various methodological and nonmethodological choices on obtained results (Algermissen et al., 2022; Pavlov et al., 2021). Studies differ, for example, in their specifications of neuronal biomarkers (Clayson, 2020), their specifications of experimental paradigms (Weinberg, Dieterich, & Riesel, 2015), or their sample characteristics (Ging-Jehli, Ratcliff, & Arnold, 2021). All these factors contribute to differences in the results of apparently similar studies and create almost insurmountable barriers to establishing robust and stable brain–behavior associations. These fundamental problems have begun to be addressed by means of preregistrations of studies and increased rigor in describing both methods and samples; nevertheless, they still contribute immensely to the number of contradictory findings. Along with acknowledging the impact of the aforementioned differences on results, researchers have started to pay attention to developing more efficient approaches to analyzing the results of experiments (Poldrack et al., 2017; Simmons, Nelson, & Simonsohn, 2011). Small sample sizes make neuroimaging and neurophysiological studies extremely vulnerable to sampling error (Button et al., 2013). Simple regression models built on such limited data might not achieve a significant level of generalization, and their performance on unseen data is often very poor (Rosenberg, Casey, & Holmes, 2018); thus, their utility is limited, both clinically and scientifically. Assessing the statistical significance of results via p value or confidence intervals does not seem sufficient to prevent overfitting (Clayson, Brush, & Hajcak, 2021; Lakens et al., 2018). 
Greater stringency in evaluating the reliability of results would help reduce the number of false positives within similar paradigms; this would represent a step toward robust findings that reflect stable individual differences and clinically useful biomarkers.
To address the limitations of the statistical procedures that are commonly used in empirical research, neuroscience and experimental psychology have started to pay great attention to machine learning (ML) approaches (Orrù, Monaro, Conversano, Gemignani, & Sartori, 2020; Glaser, Benjamin, Farhoodi, & Kording, 2019). When carefully applied, ML techniques improve the generalizability and robustness of results (Scheinost et al., 2019; Woo, Chang, Lindquist, & Wager, 2017). Following this approach, the present study aims to use an ML framework to predict anxiety-related symptoms from brain activity in humans. In our approach, we focus on different dimensions of anxiety and error-related brain responses.
Error-related Negativity and Anxiety
A fundamental characteristic of human cognition is its imperfection. According to the theory of bounded rationality (Simon, 1990), human rationality is limited when individuals have to make decisions. As a result, people rarely perform tasks perfectly and must bear the costs of their errors. People differ widely in their ability to initiate and maintain cognitive control. One of the main control functions is error monitoring, which is responsible for the processing and evaluation of errors, thus preventing maladaptive behavior (Ullsperger, Danielmeier, & Jocham, 2014). A well-established biomarker of error monitoring is error-related negativity (ERN; Gehring, Goss, Coles, Meyer, & Donchin, 1993), an ERP component that is a neural response to error commission. ERN (also called error negativity; Falkenstein, Hohnsbein, Hoormann, & Blanke, 1991) manifests as a negative deflection in the ERP waveform, which peaks between 50 and 100 msec after error commission. The most likely neural generator of ERN is the ACC; reliable evidence for this comes from fMRI and EEG studies (Debener et al., 2005; Dehaene, Posner, & Tucker, 1994).
Numerous studies have found that situational context and motivation can modulate the ERN component, for example, directing a participant's attention to improving the accuracy of responses yields significantly larger ERN amplitudes (Gehring et al., 1993). Furthermore, when people are not explicitly aware of making an error, ERN is reduced (Wessel, 2012). In contrast, ERN amplitude increases when incorrect responses are evaluated (Hajcak, Moser, Yeung, & Simons, 2005) or punished (Riesel, Weinberg, Endrass, Kathmann, & Hajcak, 2012), and when they are personally meaningful (Legault & Inzlicht, 2013; Amodio, Devine, & Harmon-Jones, 2008) or costly (Hajcak et al., 2005); this suggests that error monitoring is sensitive to motivational and affective factors.
Because ERN is involved in several processes that are essential to adaptive adjustment, changes in error processing are observable in a range of psychopathologies (Michael et al., 2021; Meyer & Hajcak, 2019; Pasion & Barbosa, 2019). Various studies have shown a strong association between ERN amplitude and anxiety disorders and anxiety-related behaviors (Klawohn, Santopetro, Meyer, & Hajcak, 2020; Olvet & Hajcak, 2008). However, increased ERN amplitude is not unique to anxiety disorders. A similar relationship between anxiety and ERN amplitude has been observed in nonclinical populations, suggesting that an individual's susceptibility to anxiety and worrying, regardless of clinical severity, significantly affects error monitoring (Banica, Sandre, Shields, Slavich, & Weinberg, 2020; Moser, Moran, Schroder, Donnellan, & Yeung, 2013; Weinberg, Riesel, & Hajcak, 2012). Because covariation of ERN amplitude and anxiety is observed in both clinical and nonclinical populations, ERN has started to be considered a biomarker of anxiety levels. Meanwhile, numerous clinical (Weinberg et al., 2012) and nonclinical (e.g., Macedo, Pasion, Barbosa, & Ferreira-Santos, 2021; Härpfer, Carsten, Spychalski, Kathmann, & Riesel, 2020; Seow et al., 2020) studies either have not confirmed the association between anxiety and ERN or have found an association opposite to the expected one. Anxiety, however, is a highly heterogeneous phenomenon. Therefore, instead of focusing on anxiety itself to explain the relationship between ERN and anxiety, research on error monitoring has focused on various dimensions related to anxiety.
For example, defensive behavior and error avoidance, both of which are systematically linked with anxiety and worry, have been found to co-occur with increased ERN amplitude in patients with obsessive–compulsive disorder (OCD; Olvet & Hajcak, 2008); this is explained by the avoidance of negative situations and thus abnormal error control, with evidence coming from studies on nonclinical populations (Zambrano-Vazquez, Szabo, Santerre, & Allen, 2019; Hajcak & Foti, 2008). On the basis of large-scale studies on nonclinical adolescents, Weinberg and colleagues (2016) suggested that checking behaviors represent a specific anxiety dimension that may partially explain increased ERN amplitude. Following this dimensional path, Seow and colleagues (2020) performed an analysis of the relationship between nine anxiety-related phenomena and ERN amplitude in a nonclinical population, but they failed to replicate prior ERN–anxiety findings. Altogether, although the anxiety–ERN findings are relatively replicable in the clinical anxiety population, where the severity of most anxiety symptoms is high, the results in nonclinical or comorbid-anxiety populations are highly inconsistent. This lack of consistency, especially in nonclinical studies, leaves it unclear which specific dimensions of anxiety can explain increased ERN amplitude.
Thus, the first goal of this article is to use ML predictive modeling to investigate whether changed ERN in a nonclinical population can be explained by any specific dimension of anxiety. Specifically, we aim to clarify the relationship between ERN and 15 selected dimensions of anxiety that have previously been linked to ERN. In addition, we decided to model the relationship between self-esteem and ERN. Although the item contents of self-esteem and anxiety measures usually do not overlap, low self-esteem may be a potential vulnerability factor for certain forms of anxiety (Sowislo & Orth, 2013).
Error-positivity and Anxiety
Another error-specific component is error-positivity (Pe; Falkenstein et al., 1991). Pe manifests as a large, positive centro-parietal wave that occurs approximately 200–500 msec after an incorrect response. Functionally, Pe is associated with a later aspect of conscious error processing that is related to error awareness or error salience (Ridderinkhof, Ramautar, & Wijnen, 2009; Falkenstein, Hoormann, Christ, & Hohnsbein, 2000). Pe amplitude covaries with error salience, that is, larger Pe amplitudes are observed for errors that are perceived as significant. ERN and Pe seem to reflect different aspects of error monitoring, as they differ markedly in their antecedent conditions: Whereas both direct and indirect manipulations of dopaminergic activity alter ERN, Pe magnitude remains unchanged (Ridderinkhof et al., 2009). Nevertheless, it remains unclear whether Pe is an expression of error awareness or of the processes that lead to error awareness.
Similarly to ERN, Pe has been explored in relation to phenomena characterized by anxiety. Nevertheless, results regarding Pe and anxiety-related phenomena are inconsistent. Tanovic, Hajcak, and Sanislow (2017) found no association between Pe and depression, anxiety, and rumination symptoms in a nonclinical population. Schrijvers and colleagues (2008) and Schrijvers, De Bruijn, Destoop, Hulstijn, and Sabbe (2010) reported attenuation of the Pe component in major depressive disorder (MDD), perfectionism, and negative affect. Wu and colleagues (2014) and Hajcak, McDonald, and Simons (2004) suggest that Pe is associated more with stress than with the cognitive aspects of anxiety. Thus, the nature of Pe has yet to be clarified.
To broaden our knowledge of the phenomena that influence error awareness, the second goal of this article is to use ML predictive modeling to explore the relationship between Pe and 15 selected dimensions of anxiety. As in the ERN analysis, we decided to model the relationship between self-esteem and Pe.
The Proposed ML Framework
We propose a novel ML framework that may improve the reliability and generalizability of ERN–anxiety studies. ML may be particularly useful for capturing ERN–anxiety associations in the general population as these associations may be weaker than in the clinical population. The proposed ML framework employs principal component analysis (PCA) for spatial feature extraction. Because single-electrode EEG signals are characterized by a low signal-to-noise ratio (Blankertz, Tomioka, Lemm, Kawanabe, & Müller, 2008), it is difficult to use them directly in predictive analysis. The application of spatial filters enhances the signal-to-noise ratio and preserves a significant amount of information on brain spatial dynamics that is usually discarded—and is thus rarely analyzed on its own—when selecting a single electrode (for an overview, see Cohen, 2017). Multiple analyses have confirmed that this preserved information can be successfully used for, for example, signal-to-noise ratio optimization (Dmochowski, Greaves, & Norcia, 2015; Kayser & Tenke, 2003), reconstruction of the brain signal (Cohen, 2017; Dmochowski et al., 2015), or decoding brain states (King & Dehaene, 2014; Vigario & Oja, 2008). Thus, it stands to reason that the spatial dimension may contain information relevant to individual differences. Classification and brain–computer interface tasks have also shown that spatial filters enhance models' performance and the generalizability of results by improving the signal-to-noise ratio and/or the extraction of spatial brain features (Rahman & Joadder, 2020; Jiang, He, Li, Jin, & Shen, 2019; Boye, Kristiansen, Billinger, do Nascimento, & Farina, 2008). For a broader discussion on the utility of spatial filters in ERP research, see Kayser and Tenke (2003).
One of the fundamental properties of ERP components is that their amplitudes are expected to vary as a function of the experimental manipulation or between-subjects differences. Moreover, ERPs are characterized by a special variance distribution: The variance of the amplitude should be large near ERP components' peaks (Kayser & Tenke, 2003). Following this assumption, PCA is widely used for spatial filtering of ERPs because its goal is to maximize the variance explained by subsequent components (Dmochowski et al., 2015; Kayser & Tenke, 2003; Rösler & Manzey, 1981). In this study, the use of PCA helps to separate the main statistical source of ERN/Pe variance (which is presumably related to the variance of ERP components' amplitude, and thus their magnitude) from the remaining variance, which we consider to be secondarily associated with error-related brain activity. Thus, PCA may give new insights into which individual differences are related to the main source of variance, that is, ERN/Pe amplitude, and which are related to the remaining sources of variance that are associated with facets of error monitoring other than intensity.
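To make the PCA-as-spatial-filter idea concrete, the following is a minimal sketch (not the study's code) in which a shared ERP-like source, mixed into several channels with channel-specific weights, is recovered as the first spatial principal component. All variable names and signal parameters here are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic ERP-like data: a 13-channel ROI, 180 samples (~700 msec at 256 Hz).
# A shared negative deflection (the "component") is mixed into all channels
# with channel-specific weights, plus independent channel noise.
n_channels, n_samples = 13, 180
t = np.arange(n_samples)
source = -5.0 * np.exp(-0.5 * ((t - 40) / 8.0) ** 2)   # negativity peaking at sample 40
weights = rng.uniform(0.5, 1.5, size=n_channels)       # per-channel mixing weights
data = np.outer(weights, source) + rng.normal(0.0, 1.0, (n_channels, n_samples))

# Treat each time sample as an observation and each channel as a feature,
# so the principal components act as spatial filters ordered by variance.
pca = PCA(n_components=4)
components = pca.fit_transform(data.T).T   # shape: (4, n_samples)
explained = pca.explained_variance_ratio_  # first component dominates
```

Because the shared source contributes correlated variance across all channels while the noise does not, the first component concentrates the ERP-like activity, which is the property the framework exploits.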
Simultaneously, we performed a classic ERP analysis to expand the knowledge on the relationship between anxiety and the mean ERN/Pe amplitude measured within specific time windows (0–100 and 150–350 msec, respectively) and to allow researchers to directly compare our results against the existing literature. To ensure a high level of generalization to the population, the models were trained using a cross-validation strategy. The resulting models were then evaluated on a hold-out data set to assess their performance on an unseen sample. We considered a relationship confirmed only if validation on the hold-out data set yielded an above-zero result.
The third goal of this study is to investigate how results depend on the studied population, that is, to check the generalizability of results. We aim to show that the existing body of research underestimates the sensitivity of models to overfitting, which contributes to the number of contradictory and nonreplicating findings in individual differences research. To achieve this objective, we contrast estimates obtained on the entire data set with cross-validated scores for both the classic ERP analysis and the proposed ML framework.
To preview the study findings, we show that the adopted ML approach yields more generalizable results than analysis conducted on ERP waves; we also show that the proposed PCA-based framework can reveal associations that remain hidden during classic analyses.
METHODS
Participants
The sample size was determined based on a review of past literature on anxiety-related individual differences in error processing and was five times larger than the mean sample size of studies included in two recent meta-analyses (Cavanagh & Shackman, 2015; Moser et al., 2013). We did not use a power calculation. We report all data exclusions, all manipulations, all measures in the study, and each step of the data analysis.
One hundred seventy-one volunteers (120 women and 51 men) aged 18–40 years (M = 22.75, SD = 3.60) took part in the study. Participants were recruited from the general population via Internet advertisements. All participants were healthy, free of medications, declared no history of neurological or psychiatric diseases, and had normal or corrected-to-normal vision. The average number of years of education among the participants was 15.16 (SD = 2.56). Before the analysis, eight participants were excluded because of low overall EEG data quality. ML analysis conducted on small data sets requires high-quality data. To ensure the quality of the EEG data, each segment containing artifacts on any of the set of 64 electrodes was discarded. Thus, 33 participants who performed fewer than five artifact-free trials containing an erroneous response were excluded. The final sample consisted of EEG data from 130 participants (93 women and 37 men) aged 18–40 years (M = 22.88, SD = 3.76); the average number of years of education was 15.08 (SD = 2.49).
Procedure and Task
Participants received verbal and written information about the purpose and procedure of the study. Then, while the EEG signal was recorded, they performed a speeded color plus orientation go/no-go discrimination task that has previously been validated in several studies (Pourtois, 2011; Vocat, Pourtois, & Vuilleumier, 2008). Figure 1 shows the scheme of the go/no-go task. A detailed description of the task and its conditions can be found in Appendix A. After finishing the task, the participants filled out a series of self-report questionnaires. All participants provided written informed consent and were monetarily compensated for their time. The study was performed following the Declaration of Helsinki (World Medical Association, 2009), and the protocol was approved by the Research Ethics Committee at the Philosophical Faculty of Jagiellonian University in Kraków, Poland.
Scheme of the go/no-go task used and its conditions: Go trials (A); successful no-go trials (B); unsuccessful no-go trials, that is, erroneous response (C). Participants had to press the response key if the black geometric figure turned green and kept the same spatial orientation. By contrast, they had to withhold their response if the black figure turned green but changed orientation, or if it turned orange irrespective of orientation. A detailed description of the task and its conditions can be found in Appendix A.
Measures/Questionnaires
From the set of questionnaires (Polish adaptations or Polish translations made using a forward–backward translation protocol) completed by participants, the following were selected for analyses of anxiety-related phenomena: Rumination-Reflection Questionnaire, rumination scale (RRQ; Trapnell & Campbell, 1999); State–Trait Anxiety Inventory, trait scale (STAI; Spielberger, Strelau, Wrześniewski, & Tysarczyk, 2006; Spielberger, Gorsuch, & Lushene, 1970); Depression Anxiety Stress Scales–21, anxiety scale (DASS-21; Antony, Bieling, Cox, Enns, & Swinson, 1998; Lovibond & Lovibond, 1995); Behavioral Inhibition System (BIS) Scale (Müller & Wytykowska, 2005; Carver & White, 1994); Obsessive–Compulsive Inventory–Revised, checking, hoarding, obsessing, ordering, neutralizing, and washing scales (OCI-R; Huppert et al., 2007; Foa et al., 2002); Obsessive Beliefs Questionnaire–20, overestimation of threat scale (Moulding et al., 2011); White Bear Suppression Inventory (WBSI; Cichoń, Szczepanowski, & Niemiec, 2020; Wegner & Zanakos, 1994); Intolerance of Uncertainty Scale–12, inhibitory and prospective scales (Carleton, Norton, & Asmundson, 2007); Rosenberg Self-Esteem Scale (SES; Łaguna, Lachowicz-Tabaczek, & Dzwonkowska, 2007; Rosenberg, 1989). All scorings on subscales were performed in line with instructions provided by the authors of the questionnaires. The detailed description of the score extraction can be found in Appendix B.
Electrophysiological Recording and Data Preprocessing
The experiment was conducted by trained researchers in a sound-attenuated room. The EEG signal was continuously recorded at 256 Hz from 64 silver/silver-chloride (Ag/AgCl) active electrodes (with preamplifiers) using the BioSemi Active-Two system and referenced online to the Common Mode Sense (CMS)–Driven Right Leg (DRL) loop, which drives the average potential across all electrodes as closely as possible to amplifier zero. The horizontal and vertical EOGs were monitored using four additional electrodes placed above and below the left eye and at the external canthi of both eyes after adequate skin preparation. The EEG signal was preprocessed with BrainVision Software (Brain Products GmbH). The signal was rereferenced offline to the average of the left and right mastoid electrodes; it was initially filtered with a Butterworth fourth-order filter with a high-pass cutoff of 0.05 Hz and a Butterworth second-order filter with a low-pass cutoff of 128 Hz. Power-line noise was removed with a notch filter at 50 Hz. Data were further segmented into response-locked epochs ranging from 100 msec before the reaction to 600 msec after. Ocular artifact correction was performed with the algorithm of Gratton, Coles, and Donchin (1983). Noisy epochs were rejected via an automatic procedure, with rejection criteria of ±65 μV in the ±200-msec time window. Epochs were baselined to the 100-msec interval preceding the response and divided into correct-response and error-response trials. Note that error-response trials consist of only commission errors because we created response-locked epochs. Later analyses were performed on error-response trials.
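The preprocessing chain above can be sketched roughly as follows with SciPy, assuming the continuous signal is already rereferenced and given as a channels × samples array in microvolts. This is a simplified illustration, not the BrainVision pipeline: the 128-Hz low-pass equals the Nyquist frequency at 256 Hz and is omitted, and the amplitude criterion is applied to the whole epoch rather than the ±200-msec window.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, iirnotch, filtfilt

FS = 256  # Hz, sampling rate used in the study

def preprocess(raw, response_samples):
    """raw: (n_channels, n_samples) continuous EEG in microvolts;
    response_samples: sample indices of (error) responses."""
    # 4th-order Butterworth high-pass at 0.05 Hz; second-order sections are
    # used for numerical stability at such a low cutoff.
    sos = butter(4, 0.05, btype="highpass", fs=FS, output="sos")
    x = sosfiltfilt(sos, raw, axis=1)

    # 50-Hz notch for power-line noise.
    b, a = iirnotch(50.0, Q=30.0, fs=FS)
    x = filtfilt(b, a, x, axis=1)

    # Response-locked epochs from -100 to +600 msec, baseline-corrected to
    # the 100 msec preceding the response.
    pre, post = int(0.1 * FS), int(0.6 * FS)
    epochs = []
    for s in response_samples:
        if s - pre < 0 or s + post > x.shape[1]:
            continue  # epoch would run past the recording edge
        epoch = x[:, s - pre:s + post]
        epoch = epoch - epoch[:, :pre].mean(axis=1, keepdims=True)
        # Simplified rejection: drop epochs exceeding +/-65 uV anywhere.
        if np.abs(epoch).max() <= 65.0:
            epochs.append(epoch)
    return np.asarray(epochs)
```

The Gratton–Coles ocular correction is deliberately left out here, as it requires the EOG channels and regression weights that this sketch does not model.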
Feature Extraction
Before any further analyses, the data set was divided into testing (hold-out) and training sets with a proportion of 3:7. In each set, participants who had fewer than five artifact-free epochs with erroneous responses were rejected, resulting in a hold-out (external) data set with 34 participants and a training (internal) data set with 96 participants. The average number of artifact-free epochs with erroneous responses included in the analysis per participant in the training set was 17.03 (SD = 10.80); in the testing set, it was 17.85 (SD = 8.81).
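The split-then-filter step can be illustrated with scikit-learn; the participant IDs and epoch counts below are invented toy values, not the study's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 20 participants with artifact-free error-epoch counts.
participant_ids = np.arange(20)
n_error_epochs = np.array([3, 12, 7, 4, 19, 25, 6, 2, 14, 9,
                           11, 5, 30, 4, 8, 22, 16, 3, 10, 13])

# 70/30 split into training (internal) and hold-out (external) sets.
train_ids, test_ids = train_test_split(participant_ids, test_size=0.3,
                                       random_state=42)

# Within each set, drop participants with fewer than five error epochs.
train_ids = [p for p in train_ids if n_error_epochs[p] >= 5]
test_ids = [p for p in test_ids if n_error_epochs[p] >= 5]
```

Filtering after the split keeps the hold-out set strictly untouched by any decision informed by the training data.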
To limit the spatial variability of the data to the presumed signal of interest, we selected different subsets of the channels, that is, ROIs. Because this is a data-driven and exploratory ML framework, we decided to test two ROIs to examine the impact of the included spatial variability on the results. ROIs were selected based on the preliminary internal consistency analysis: We defined seven ROIs consisting of frontal, central, and parietal channels that might contain signals relevant to error processing. We then selected two ROIs for Pe and two ROIs for ERN that yielded maximal mean internal consistency scores of the signal or had the best internal consistency distribution. The results of the preliminary internal consistency tests are shown in Appendix C. The following ROIs were selected for ERN: (ROI 1) Fpz, AFz, Fz, FCz, Cz, CPz, P1, Pz, P2; (ROI 2) Fpz, AFz, F1, Fz, F2, FCz, C1, Cz, C2, CPz, P1, Pz, P2. The following ROIs were selected for Pe: (ROI 3) Fpz, AFz, Fz, FCz, C1, Cz, C2, CPz, P1, Pz, P2; (ROI 4) Fpz, AFz, F1, Fz, F2, FC1, FCz, FC2, C1, Cz, C2, CP1, CPz, CP2, P1, Pz, P2.
The signal was additionally filtered with a Butterworth sixth-order filter at 40 Hz and averaged across trials for each participant. To extract spatial features of the brain signals, error-response trials were processed with PCA. The number of PCA components was treated as a hyperparameter of the models with an allowed range of one to four and was selected automatically during a model's hyperparameter tuning step. We decided to limit the maximal number of PCA components to four as a trade-off between overfitting, interpretability, and the amount of information preserved. Excessively complex models with many hyperparameters are strongly susceptible to overfitting and thus to poor performance on test data; at the same time, the first four components explained most of the variance contained in the signal, and their topographical distributions support interpretation attempts, which was not the case with the subsequent components. A detailed explanation of PCA and its interpretation can be found in Appendix D. The ERN scores and Pe scores were quantified as peak-to-peak amplitude, separately for each spatial component. The data were averaged over 47-msec bins before applying the peak-to-peak measure to compensate for the distortion of peak-to-peak amplitude by high-frequency background EEG noise (Clayson, Baldwin, & Larson, 2013). Figure 2 shows the detailed ERN/Pe score-extraction pipeline.
Feature extraction pipeline. Raw EEG error-response trials on all 64 electrodes (1), ROI extraction (2), PCA (3), averaging error-response trials per participant (4), binning the signal inside 47-msec bins (5), detection of the bin with maximal negative amplitude in the first PCA component, that is, ERN peak bin. Note that the first PCA component was used to detect the ERN peak bin because it accounts for the largest variance and is the average of the EEG signal at the electrodes included in the ROI. The ERN peak bin will be used to determine the time windows for ERN and Pe (6). ERN scores are labeled with the variable x: peak-to-peak difference in the time window of the two bins preceding the ERN peak to the bin after the ERN peak bin. The time window for the ERN peak-to-peak difference is marked with a blue shaded area. Pe scores are labeled with the variable z: peak-to-peak difference in the time window from the first to the fifth bin after the ERN peak bin. The time window for the Pe peak-to-peak difference is marked with a green shaded area (7).
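A minimal sketch of this score-extraction pipeline for one participant is given below, assuming Python with NumPy and scikit-learn. All function names are ours, and the sign-alignment step is an added convenience: PCA component polarity is arbitrary, so each component is flipped to correlate positively with the ROI-average waveform, which the first component then approximates.

```python
import numpy as np
from sklearn.decomposition import PCA

FS = 256
BIN = int(0.047 * FS)   # 47-msec bins -> 12 samples at 256 Hz

def ern_pe_scores(avg_roi, n_components=2):
    """avg_roi: (n_channels, n_samples) participant-average error-response
    signal within an ROI, response-locked from -100 to +600 msec."""
    pca = PCA(n_components=n_components)
    comps = pca.fit_transform(avg_roi.T).T            # (n_components, time)

    # Align component signs with the ROI-average waveform.
    roi_mean = avg_roi.mean(axis=0)
    signs = np.sign(comps @ roi_mean)
    comps = comps * np.where(signs == 0, 1, signs)[:, None]

    # Average within consecutive 47-msec bins.
    n_bins = comps.shape[1] // BIN
    binned = comps[:, :n_bins * BIN].reshape(n_components, n_bins, BIN).mean(axis=2)

    # ERN peak bin: most negative post-response bin of the first component.
    first_post = int(0.1 * FS) // BIN                 # skip the baseline bins
    peak = first_post + int(np.argmin(binned[0, first_post:]))

    # ERN score: peak-to-peak within the two bins before the peak through the
    # bin after it; Pe score: peak-to-peak within the five bins after the peak.
    ern_win = binned[:, max(peak - 2, 0):peak + 2]
    pe_win = binned[:, peak + 1:peak + 6]
    return (ern_win.max(axis=1) - ern_win.min(axis=1),
            pe_win.max(axis=1) - pe_win.min(axis=1))
```

The returned arrays hold one ERN score and one Pe score per spatial component, mirroring the per-component peak-to-peak quantification described in the text.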
We also conducted classic ERP analyses for ERN and Pe. For the ERP analyses, ERN was quantified as the mean amplitude in the 0- to 100-msec time window at electrode Fz. Pe was quantified as the mean amplitude in the 150- to 350-msec time window at electrode Cz. Time windows for the targeted ERP components were selected based on our previous study using a similar experimental paradigm requiring response inhibition (Senderecka, 2016). The ERP analyses were conducted using linear regression.
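For illustration, the classic mean-amplitude quantification can be sketched as follows; this is a minimal example assuming a 256-Hz, response-locked waveform starting at -100 msec, and the function name is ours.

```python
import numpy as np

FS = 256
T0 = int(0.1 * FS)   # sample index of the response (epoch starts at -100 msec)

def mean_amplitude(waveform, start_ms, end_ms):
    """Mean amplitude of a single-electrode, participant-average waveform in a
    post-response window given in milliseconds (e.g., ERN: 0-100 msec at Fz;
    Pe: 150-350 msec at Cz)."""
    lo = T0 + int(start_ms / 1000 * FS)
    hi = T0 + int(end_ms / 1000 * FS)
    return float(waveform[lo:hi].mean())
```

These per-participant scores would then serve as predictors in the linear regressions described above.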
Subject-level Internal Consistency
Regression Models
Separate regressions were performed for each anxiety-related phenomenon because of correlation across the questionnaires. The ERN or Pe PCA-based scores, extracted as described in the Feature Extraction section, served as independent variables; the questionnaire score served as the dependent variable. For the analyses, the following regression models were chosen: (1) ElasticNet and (2) Kernel Ridge with a radial basis function kernel (KR-rbf). According to its authors, Zou and Hastie (2005), ElasticNet outperforms other linear models for data sets with a limited number of samples and many independent variables. The ranges of the two hyperparameters of the ElasticNet model (λ and α) consisted of 20 logarithmically spaced values from 1e-7 to 1e3 and from 1e-8 to 1, respectively. KR-rbf combines linear regression with L2-norm regularization and a kernel trick to capture nonlinear relationships using a linear combination of coefficients (Slavakis, Theodoridis, & Yamada, 2008). The strength of the L2-norm regularization (α) was the model's hyperparameter; its range consisted of 20 logarithmically spaced values from 1e-5 to 1e3.
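In scikit-learn terms, the two estimators and their search grids could look roughly like this. Note that the mapping of the paper's λ/α onto scikit-learn's `alpha` (overall penalty strength) and `l1_ratio` (L1/L2 mix) is our assumption, not something the text specifies.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV

# 20 log-spaced values per hyperparameter, mirroring the reported ranges.
enet_grid = {
    "alpha": np.logspace(-7, 3, 20),      # assumed to correspond to lambda
    "l1_ratio": np.logspace(-8, 0, 20),   # assumed to correspond to alpha
}
krr_grid = {"alpha": np.logspace(-5, 3, 20)}

enet_search = GridSearchCV(ElasticNet(max_iter=10000), enet_grid,
                           cv=3, scoring="r2")
krr_search = GridSearchCV(KernelRidge(kernel="rbf"), krr_grid,
                          cv=3, scoring="r2")
```

Each `GridSearchCV` object refits the estimator across the grid with threefold cross-validation, matching the validation strategy described below.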
Validation Strategy
We employed threefold cross-validation to determine the models' hyperparameters; the number of folds is a trade-off between the size of the training set and the duration of training. Models were evaluated with the coefficient of determination (R2) score. Once the best-performing estimator was determined along with its optimal hyperparameters for the task, additional validation was carried out. This additional validation consisted of (1) evaluating the model's performance on the hold-out data set with the R2 score and (2) evaluating the statistical significance of its result. The hold-out data set consisted of questionnaire and EEG data from randomly selected participants; note that the hold-out set was not used during the model training process. During validation, each model was fed with participants' preprocessed error-response epochs from the hold-out data set; these were then processed in the same way as the error-response epochs from the training data set because the feature-extraction steps were part of the model (see Feature Extraction section). The statistical significance of the results was assessed with permutation tests, with n = 10,000 to obtain a p value with an accuracy of up to four decimal places. We did not correct the obtained results for multiple comparisons as we treat each questionnaire–ERN/Pe model as a separate hypothesis. Figure 3 shows the models' training and validation pipeline.
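One plausible reading of the hold-out evaluation plus permutation test is sketched below (this is our reconstruction, not the authors' code; the exact permutation scheme used in the study may differ).

```python
import numpy as np
from sklearn.metrics import r2_score

def holdout_permutation_test(model, X_tr, y_tr, X_te, y_te,
                             n_perm=10000, seed=0):
    """Fit on the training set, score R^2 on the hold-out set, and estimate a
    permutation p value by refitting on label-shuffled training data."""
    rng = np.random.default_rng(seed)
    model.fit(X_tr, y_tr)
    observed = r2_score(y_te, model.predict(X_te))
    null = np.empty(n_perm)
    for i in range(n_perm):
        model.fit(X_tr, rng.permutation(y_tr))
        null[i] = r2_score(y_te, model.predict(X_te))
    # Add-one correction keeps the p value strictly positive.
    p = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p
```

With n_perm = 10,000 permutations, as in the study, the resulting p value is resolved to roughly four decimal places.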
RESULTS
Behavioral
The average number of erroneous and correct responses per participant in the training set was 29.75 (SD = 16.19) and 221.38 (SD = 7.14), respectively; in the testing set, it was 31.88 (SD = 17.08) and 222.18 (SD = 2.97), respectively. In the training data set, the average RT for errors was 282 msec (SD = 43 msec); for correct responses, it was 329 msec (SD = 39 msec). As expected, participants responded significantly faster on error trials relative to correct trials, t(95) = −17.31, p < .001. In the testing data set, the average RT for errors was 295 msec (SD = 62 msec); for correct responses, it was 334 msec (SD = 43 msec). Again, participants responded significantly faster on error trials relative to correct trials, t(33) = −6.85, p < .001. Table 1 displays descriptive statistics and correlations for the questionnaire data within the training data set. Distributions of scores for each of the selected scales for both training and testing data sets are shown in Appendix E.
Descriptive Statistics and Correlations for Questionnaires in the Training Data Set
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1. DASS-21 | – | .29*** | .25** | .19* | .30*** | .20** | −.04 | .12 | .20* | .15 | .12 | .28*** | .15 | .10 | −.07 | −.21** |
2. STAI-T | – | .73*** | .65*** | .49*** | .54*** | .26*** | .60*** | .27*** | .06 | .34*** | .55*** | −.02 | .20* | −.10 | −.83*** | |
3. BIS | – | .60*** | .34*** | .40*** | .39*** | .58*** | .22** | .03 | .27*** | .37*** | .08 | .11 | −.03 | −.61*** | ||
4. RRQ | – | .56*** | .51*** | .34*** | .51*** | .30*** | .05 | .36*** | .55*** | .02 | .12 | .06 | −.51*** | |||
5. WBSI | – | .45*** | .32*** | .39*** | .47*** | .30*** | .34*** | .55*** | .28*** | .18* | .10 | −.37*** | ||||
6. OT | – | .33*** | .48*** | .46*** | .28*** | .34*** | .52*** | .23** | .28*** | .11 | −.53*** | |||||
7. IUS-P | – | .56*** | .35*** | .28*** | .26** | .17* | .31*** | .14 | .19* | −.20* | ||||||
8. IUS-I | – | .25** | .14 | .29*** | .36*** | −.00 | .22** | .01 | −.52*** | |||||||
9. OCI-R | – | .71*** | .64*** | .59*** | .78*** | .60*** | .59*** | −.20** | ||||||||
10. Checking | – | .34*** | .16 | .56*** | .44*** | .25** | −.02 | |||||||||
11. Hoarding | – | .40*** | .31*** | .30*** | .16 | −.27*** | ||||||||||
12. Obsessing | – | .24** | .21** | .22** | −.44*** | |||||||||||
13. Ordering | – | .38*** | .46*** | .02 | ||||||||||||
14. Neutralizing | – | .30*** | −.19* | |||||||||||||
15. Washing | – | .12 | ||||||||||||||
16. SES | – | |||||||||||||||
M | 1.72 | 2.24 | 3.03 | 3.55 | 3.42 | 2.90 | 3.06 | 2.43 | 2.22 | 2.60 | 2.29 | 2.58 | 2.67 | 1.45 | 1.75 | 2.92 |
SD | 0.53 | 0.48 | 0.58 | 0.90 | 0.80 | 1.19 | 0.81 | 1.00 | 0.63 | 1.06 | 0.97 | 1.11 | 1.18 | 0.58 | 0.86 | 0.72 |
Min | 1.00 | 1.30 | 1.86 | 1.33 | 1.53 | 1.00 | 1.00 | 1.00 | 1.11 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Max | 3.29 | 3.40 | 4.00 | 5.00 | 5.00 | 5.40 | 4.86 | 4.80 | 3.72 | 5.00 | 5.00 | 5.00 | 5.00 | 3.00 | 4.33 | 4.00 |
Range | 1–4 | 1–4 | 1–4 | 1–5 | 1–5 | 1–7 | 1–5 | 1–5 | 1–5 | 1–5 | 1–5 | 1–5 | 1–5 | 1–5 | 1–5 | 1–4 |
Cronbach's α | 0.73 | 0.88 | 0.78 | 0.93 | 0.90 | 0.76 | 0.80 | 0.87 | 0.84 | 0.74 | 0.64 | 0.84 | 0.81 | 0.43 | 0.67 | 0.92 |
p value | .364 | .723 | .874 | .791 | .552 | .720 | .220 | .709 | .877 | .578 | .130 | .758 | .667 | .291 | .903 | .960 |
DASS-21 = Depression Anxiety Stress Scale-21, Anxiety subscale; STAI-T = State–Trait Anxiety Inventory, Trait subscale; RRQ = Rumination-Reflection Questionnaire, Rumination subscale; OT = Obsessive Beliefs Questionnaire-20, Overestimation of Threat subscale; IUS-P = Intolerance of Uncertainty Scale-12, Prospective subscale; IUS-I = Intolerance of Uncertainty Scale-12, Inhibitory subscale; OCI-R = Obsessive–Compulsive Inventory–Revised full score; Checking = OCI-R, Checking subscale; Hoarding = OCI-R, Hoarding subscale; Obsessing = OCI-R, Obsessing subscale; Ordering = OCI-R, Ordering subscale; Neutralizing = OCI-R, Neutralizing subscale; Washing = OCI-R, Washing subscale; M = mean; SD = standard deviation; Min = the lowest score in the data set; Max = the highest score in the data set; Range = range of possible scores; Cronbach's α was calculated for the combined training and testing data sets; p Value = result of the statistical test for the difference between the training and the testing data set, performed with permutation tests.
*p < .05.
**p < .01.
***p < .001.
Table 1—Source Data 1. https://bit.ly/3ZLwQb6.
Table 1—Source Code 1. https://bit.ly/3GSsUfU.
EEG Features
To improve the signal-to-noise ratio and distinguish ERP features other than just average amplitude, we applied PCA to the epoched signal. Component weights and scores for ERN and Pe are shown in Figure 4.
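One way to realize such a spatial PCA with scikit-learn is sketched below on synthetic data. The array shapes, ROI size, and the peak-to-peak scoring step are illustrative assumptions, not the released pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative shapes (assumptions): 171 participants, a 6-channel ROI,
# 51 time points in the component window.
n_subjects, n_channels, n_times = 171, 6, 51
rng = np.random.default_rng(0)
erp_roi = rng.standard_normal((n_subjects, n_channels, n_times))

# Treat every (participant, time point) sample as one observation over
# the ROI channels, so each spatial component keeps a full time course.
X = erp_roi.transpose(0, 2, 1).reshape(-1, n_channels)
pca = PCA(n_components=4).fit(X)
time_courses = pca.transform(X).reshape(n_subjects, n_times, 4)

# Per-subject scores quantified as peak-to-peak amplitude of each
# component's time course (cf. Appendix F of the article).
scores = time_courses.max(axis=1) - time_courses.min(axis=1)
```

On real data, the first component's weights would be roughly uniform across the ROI, so its score tracks the ROI-average amplitude, consistent with the interpretation given below.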
Grand averages of extracted PCA components with corresponding spatial patterns at the marked time (right upper corner) and component weights (left upper corner). PCA components extracted for ERN-ROI 1 (A). PCA components extracted for ERN-ROI 2 (B). PCA components extracted for Pe-ROI 3 (C). PCA components extracted for Pe-ROI 4 (D). For improved visibility, data are scaled to μV and presented without the additional linear scaling that scikit-learn's PCA performs. The patterns and relationships within the data remain the same. Figure 4—Source Data 1. Pickled data processing pipelines for ERN: https://bit.ly/3Xl7q2a. Figure 4—Source Data 2. Pickled data processing pipelines for Pe: https://bit.ly/3CVHJgq. Figure 4—Source Code 1. https://bit.ly/3w9TwUC.
In our analysis, the first PCA component, which accounts for the largest variance, corresponded to the average of the EEG signal at the electrodes included in the ROI. Thus, the first PCA component is most likely a marker of the intensity of the error-monitoring process. We assume that the nonzero coefficient for this component indicates individual differences related to the intensity of error monitoring. The second, third, and fourth components assess the spatial distribution of the error-related brain signal: For each of these components, we assume that a nonzero coefficient corresponds to the expected spatial location and distribution of ERN/Pe, that is, a brain pattern with high similarity to the pattern shown in Figure 4. Because of lateralization to the right hemisphere, the fourth component can be considered a marker of lateralization of error-related brain activity. We assume that the nonzero coefficient for this component indicates individual differences related to the spatial dynamics associated with hemispheric dominance and cognitive processes other than the pure intensity of error monitoring. Descriptive statistics of the ERN and Pe scores quantified as the peak-to-peak amplitude on each of the PCA components for both training and testing data sets can be found in Appendix F.
We also conducted analyses on classic ERN waveforms. For ERN, quantified as the mean amplitude in the 0- to 100-msec time window at electrode Fz, the average score in the training set was −4.98 μV (SD = 4.65 μV); in the testing set, it was −4.48 μV (SD = 3.93 μV). For Pe, quantified as the mean amplitude in the 150- to 350-msec time window at electrode Cz, the average score in the training set was 1.51 μV (SD = 6.78 μV); in the testing set, it was 1.37 μV (SD = 8.16 μV). There were no differences in ERN and Pe scores between training and testing data sets (t = −.55, p = .595; t = .98, p = .337 for ERN and Pe, respectively).
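The classic quantification reduces to averaging voltage within a channel-specific time window; a minimal sketch (function and argument names are ours):

```python
import numpy as np

def mean_amplitude(epoch, times, ch_idx, tmin, tmax):
    """Classic ERP score: mean voltage at one electrode within a window.

    epoch: (n_channels, n_times) average error-response waveform in uV;
    times: time stamps in seconds, aligned to the response."""
    mask = (times >= tmin) & (times <= tmax)
    return epoch[ch_idx, mask].mean()

# ERN at Fz over 0-100 msec and Pe at Cz over 150-350 msec would be
# mean_amplitude(epoch, times, fz_idx, 0.0, 0.1) and
# mean_amplitude(epoch, times, cz_idx, 0.15, 0.35), respectively,
# where fz_idx and cz_idx are the (montage-dependent) channel indices.
```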
Subject-level Internal Consistency
To allow comparison of study results and assess the degree of the results' stability, we report the intra- and interindividual variability of ERN and Pe as well as their internal consistency. Internal consistency across conditions (types of data sets and ROIs) ranged from .53 to .98. A summary of the ERN and Pe internal consistency statistics is shown in Figure 5. Detailed ERN and Pe score variability for each participant can be found in Figure 5 source data.
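In generalizability theory, a subject-level dependability coefficient of the kind summarized in Figure 5 is commonly computed from variance components; the sketch below makes that assumption explicit (the article's jackknife-based estimate may differ in detail):

```python
def dependability(subject_var, error_var, n_trials):
    """Generalizability-theory dependability for trial-averaged ERP scores:
    the share of variance in an n_trials-average score that reflects stable
    between-subject differences rather than trial-to-trial noise."""
    return subject_var / (subject_var + error_var / n_trials)
```

Dependability rises toward 1 as more trials are averaged, which is why the number of retained error trials per participant constrains how stable an individual-difference measure the ERN/Pe can be.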
Estimates of the variability (σ) and subject-level dependability coefficient (jk) of ERN scores (A) and Pe scores (B) across participants within the training (internal) and testing (external) sets. Figure 5—Source Data 1. ERN score variability for each participant: https://bit.ly/3XAO2hu. Figure 5—Source Data 2. Pe score variability for each participant: https://bit.ly/3ku0JME. Figure 5—Source Code 1. https://bit.ly/3GRvWRt.
ERN and Questionnaire Scores
We tested how the ERN scores were associated with the self-report questionnaire scores. Most of the questionnaires showed a statistically significant relationship with ERN during the internal validation (p < .05), with mean cross-validated R2 ranging from .02 to .15 (Table 2), but the relationships differed in terms of the spatial features responsible for the results. Both linear and nonlinear models exhibited similar efficiency.
Detailed Results of ERN-based Models
| | KR-rbf | | ElasticNet | | | | | | Classic ERP Analysis | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | CV R2 | Test R2 | CV R2 | Test R2 | β1 | β2 | β3 | β4 | Train R2 | CV R2 | Test R2 |
DASS-21 | −.015 | −.078 | −.022 | −.067 | −.01 | .04 | .03 | – | .000 | −.046 | −.032 |
STAI-T | .143** | −.087 | .132** | −.043 | 1.06 | 1.26 | .00 | 1.91 | .120** | .082** | −.317 |
BIS | .094** | −.015 | .090** | .013 | .07 | .05 | −.01 | .13 | .083** | .031* | −.189 |
RRQ | .069** | .071* | .055** | .023 | .08 | .10 | .00 | .10 | .041* | .016* | −.062 |
WBSI | −.021 | .000† | −.021 | −.004† | .01 | .01 | .00 | .02 | .039* | .004* | −.059 |
OT | .055** | .059† | .006† | .058* | .00 | .00 | −.13 | .17 | .000 | −.040 | −.018 |
IUS-P | .105** | −.251 | .070** | −.145 | .00 | −.02 | .00 | .20 | .000 | −.074 | −.068 |
IUS-I | .030* | .048† | .036* | .056† | −.01 | −.04 | .01 | .00 | .019† | −.005† | −.062 |
OCI-R | .007 | −.045 | −.007 | −.036 | .02 | −.04 | .02 | .03 | .001 | −.021 | −.002 |
Checking | .020* | −.042 | .021* | −.026 | .02 | −.18 | .03 | – | .044* | .028* | −.015 |
Hoarding | .004* | −.232 | .000* | −.125 | .00 | .00 | .00 | .01 | .000 | −.008 | −.114 |
Obsessing | .030* | .007 | .014* | −.007 | .07 | .03 | .05 | – | .024† | .018* | −.054 |
Ordering | .004† | −.104 | .004† | −.084 | .14 | −.24 | −.12 | .11 | .002 | −.056 | −.009 |
Neutralizing | −.001† | −.162 | .000† | −.165 | .01 | −.06 | .09 | .10 | .000 | −.074 | −.034 |
Washing | −.049 | −.001 | −.048 | −.001 | .00 | .00 | .00 | – | .002 | −.049 | −.007 |
SES | .072** | .020† | .069** | .02 | −.09 | −.07 | .05 | −.14 | .118** | −.011 | −.255 |
DASS-21 = Depression Anxiety Stress Scale-21, Anxiety subscale; STAI-T = State–Trait Anxiety Inventory, Trait subscale; RRQ = Rumination-Reflection Questionnaire, Rumination subscale; OT = Obsessive Beliefs Questionnaire-20, Overestimation of Threat subscale; IUS-P = Intolerance of Uncertainty Scale-12, Prospective subscale; IUS-I = Intolerance of Uncertainty Scale-12, Inhibitory subscale; OCI-R = Obsessive–Compulsive Inventory–Revised full score; Checking = OCI-R, Checking subscale; Hoarding = OCI-R, Hoarding subscale; Obsessing = OCI-R, Obsessing subscale; Ordering = OCI-R, Ordering subscale; Neutralizing = OCI-R, Neutralizing subscale; Washing = OCI-R, Washing subscale. Coefficients of linear model: β1, β2, β3, β4. Significance levels: †p < .1; *p < .05; **p < .01.
Findings with p value below .05 are shown in bold. Each row reflects the (uncorrected for multiple comparisons) results from an independent analysis where ERN was regressed against each anxiety-related questionnaire score.
Table 2—Source Data 1. Pickled ERN-based models: https://bit.ly/4bCTPuE.
Table 2—Source Data 2. Detailed results of ERN ML-based models: https://bit.ly/3GSsExB.
Table 2—Source Data 3. Detailed results of ERN ERP-based models: https://bit.ly/3kvPeED.
Table 2—Source Code 1. Implementation of ERN-based models: https://bit.ly/3Xlb2Bw.
Within the external data set, ERN was significantly associated only with rumination (R2 = .071; p = .018), overestimation of threat (R2 = .058; p = .038), and marginally with the inhibitory dimension of IU (R2 = .056; p = .070). Of these, only rumination and inhibitory IU were related to ERN magnitude (the first PCA component). The associations were consistently present across the ElasticNet and KR-rbf estimators, but they showed a different level of overfitting to the data. Detailed results, including the number of components and ROIs selected by the model during hyperparameter tuning, along with mean absolute errors and mean squared errors of estimators, can be found in Table 2 source data. The results of the ERN-based ERP analysis for each of the selected questionnaires are shown in Table 2. None of the models yielded significant results on the testing set.
To test the possibility that one or more latent factors were driving the study's findings, we conducted an additional exploratory canonical correlation analysis (CCA). We included seven variables (four ERN-related PCA components and three anxiety-related symptoms) and three dimensions in the CCA. The results suggest the existence of two underlying latent variables; the third CCA dimension hardly differentiates the variables. There are four visible clusters formed by the first two CCA dimensions, which separate rumination, inhibitory IU, and threat overestimation and differentiate their relationship with error-related brain activity. Figure 6 shows the projection of variables into the CCA components' space.
CCA, 2-D ordination plots of four ERN-related PCA components (blue dots), and three anxiety-related dimensions (yellow dots). The direction and distance from the center of the graph with respect to each axis indicate the correlation coefficient between the variable and the ordination, that is, the axis. The radius of the inner circle represents a correlation of .5. The anxiety-related dimensions are RRQ = rumination, IUS-I = inhibitory intolerance of uncertainty, OT = threat overestimation. PCA components represent spatial features extracted from EEG signal. The proximity in the graph represents the strength of the relationships between the variables.
Pe and Questionnaire Scores
As a secondary aim of our study, we tested how the Pe scores were associated with the self-report questionnaire scores to broaden the knowledge on error awareness. The mean cross-validated R2 of the Pe-questionnaire score associations ranged from −.3 to .2 (Table 3). Both the linear and the nonlinear models exhibited similar efficiency.
Detailed Results of the Pe-based Models
| | KR-rbf | | ElasticNet | | | | | | Classic ERP Analysis | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | CV R2 | Test R2 | CV R2 | Test R2 | β1 | β2 | β3 | β4 | Train R2 | CV R2 | Test R2 |
DASS-21 | −.003† | −.083 | −.024 | −.048 | −.03 | −.02 | −.01 | – | .012 | −.054 | −.107 |
STAI-T | .008* | .018 | .008* | .018 | .45 | −.16 | .34 | .50 | .001 | −.009 | .005 |
BIS | .010* | .053† | .011* | .048† | .04 | −.05 | .03 | – | .000 | −.014 | −.008 |
RRQ | .046* | .103* | .048* | .099** | .02 | .00 | .20 | – | .006 | −.015 | .012 |
WBSI | −.015 | .022* | −.014 | .017* | .03 | −.02 | .03 | .02 | .000 | −.036 | −.023 |
OT | −.006 | −.048 | −.006 | −.023 | .05 | −.04 | −.04 | – | .001 | −.019 | −.011 |
IUS-P | .030* | .092* | .030* | .082* | .02 | −.13 | .07 | .05 | .005 | −.034 | −.053 |
IUS-I | −.033 | −.024 | −.033 | −.015 | .00 | .01 | .00 | – | .001 | −.008 | −.019 |
OCI-R | −.016 | −.001 | −.015 | −.001 | .00 | .00 | .00 | – | .011 | −.024 | −.006 |
Checking | −.002* | −.012 | −.002† | −.011 | .00 | .00 | .00 | – | .002 | −.061 | −.020 |
Hoarding | .003† | −.118 | .004* | −.115 | .00 | .04 | .00 | – | .006 | −.078 | −.145 |
Obsessing | .002† | −.047 | −.006 | −.004 | .04 | .00 | −.01 | – | .013 | −.041 | .058* |
Ordering | −.051 | −.006 | −.050 | −.005 | .00 | .00 | .00 | – | .000 | −.106 | −.003 |
Neutralizing | −.067 | −.042 | −.070 | −.033 | .00 | .00 | .00 | – | .017† | −.070 | −.057 |
Washing | −.035† | −.285 | −.048 | −.001 | .00 | .00 | .00 | – | .004 | −.069 | −.059 |
SES | −.002† | −.076 | −.003† | −.047 | −.10 | .04 | .04 | – | .015 | −.028 | −.021 |
DASS-21 = Depression Anxiety Stress Scale-21, Anxiety subscale; STAI-T = State–Trait Anxiety Inventory, Trait subscale; RRQ = Rumination-Reflection Questionnaire, Rumination subscale; OT = Obsessive Beliefs Questionnaire-20, Overestimation of Threat subscale; IUS-P = Intolerance of Uncertainty Scale-12, Prospective subscale; IUS-I = Intolerance of Uncertainty Scale-12, Inhibitory subscale; OCI-R = Obsessive–Compulsive Inventory-Revised full score; Checking = OCI-R, Checking subscale; Hoarding = OCI-R, Hoarding subscale; Obsessing = OCI-R, Obsessing subscale; Ordering = OCI-R, Ordering subscale; Neutralizing = OCI-R, Neutralizing subscale; Washing = OCI-R, Washing subscale. Coefficients of linear model: β1, β2, β3, β4. Significance levels: †p < .1; *p < .05; ** p < .01.
Findings with a p value below .05 are shown in bold. Each row reflects the (uncorrected for multiple comparisons) results from an independent analysis where Pe was regressed against each anxiety-related questionnaire score.
Table 3—Source Data 1. Pickled Pe-based models with their detailed results: https://bit.ly/49wA7yX.
Table 3—Source Data 2. Detailed results of Pe ML-based models: https://bit.ly/3w9efYX.
Table 3—Source Data 3. Detailed results of Pe ERP-based models: https://bit.ly/3GVRGvS.
Table 3—Source Code 1. Implementation of Pe-based models: https://bit.ly/3Xlb2Bw.
Within the external data set, Pe was significantly associated with rumination (R2 = .103; p = .016) and the prospective dimension of IU (R2 = .092; p = .021), and marginally with BIS (R2 = .053; p = .053). These associations showed only a slight dependence on the magnitude of Pe (the first PCA component). Detailed results are described in Table 3. Full results, including the number of components and ROIs selected by the model during hyperparameter tuning, along with mean absolute error and mean squared error of estimators, can be found in Table 3 source data.
None of the Pe-based ERP models yielded significant results. The results of the Pe-based ERP analysis for each of the selected questionnaires are shown in Table 3.
To test the possibility that one or more latent factors were driving the study's findings, we conducted an additional exploratory CCA. We included six variables (three Pe-related PCA components and three anxiety-related symptoms) and three dimensions in the CCA. The results suggest one underlying latent variable that may drive the relationships between the PCA components and anxiety symptoms; the second and third dimensions loaded only on PCA components. Figure 7 shows the projection of variables into the CCA components' space.
CCA, 2-D ordination plots of three Pe-related PCA components (blue dots), and three anxiety-related dimensions (yellow dots). The direction and distance from the center of the graph with respect to each axis indicate the correlation coefficient between the variable and the ordination, that is, the axis. The radius of the inner circle represents a correlation of .5. The anxiety-related dimensions are RRQ = rumination, IUS-P = prospective intolerance of uncertainty, BIS = behavioral inhibition. PCA components represent spatial features extracted from the EEG signal. The proximity in the graph represents the strength of the relationships between the variables.
DISCUSSION
The aim of the present article was to use ML predictive modeling to investigate whether variance in brain spatial patterns after error commission in a nonclinical sample can be explained by any specific dimension of anxiety. We were most interested in which dimensions of anxiety would be directly related to the main source of the signal variance and thus to the ERN/Pe amplitude magnitude, and which would be related to the remaining variance of the error-related brain signal.
The ability of ERN/Pe to predict anxiety symptoms was, as expected, specific and limited to just a few facets of anxiety, which helps directly define the phenomenon to which ERN/Pe variability is related. ERN was associated with rumination, overestimation of threat, and inhibitory IU, but it was not associated with OCD symptoms or thought suppression. This pattern of results is in line with findings suggesting that ERN is linked with anxious apprehension and anxiety that is specifically related to performance, behavior, and mistakes (Weinberg et al., 2016; Moser et al., 2013). Furthermore, the specific associations between ERN and overestimation of threat, inhibitory IU, and rumination indicate that ERN variability may reflect hypersensitivity to errors. The additional analysis revealed that lower self-esteem is associated with larger ERN, which seems to support such an interpretation. Our results fit within the model of the evaluative and compensatory components of the error-monitoring process, which links ERN magnitude with the degree to which errors are perceived as threatening (Weinberg et al., 2016; Weinberg et al., 2012). So far, there has been little work examining and confirming the links between Pe and anxiety dimensions. Our article is the first to determine the relationship between various dimensions of anxiety and Pe on a large scale. In the present study, more pronounced Pe turned out to be associated with higher levels of rumination, prospective IU, and behavioral inhibition; however, it again turned out to be only very slightly associated with trait anxiety. Such results confirm that Pe reflects different aspects of error monitoring than ERN, with particular emphasis on conscious awareness and error salience. Furthermore, we have shown that many anxiety-related individual differences are related not only to the magnitude of the ERN but also to the remaining variance that supposedly reflects some other (cognitive) mechanisms.
Because we were able to analyze the spatial features of the signal through the use of spatial filters, in the following sections, we will discuss in more detail the revealed relationships between error processing and different dimensions of anxiety: rumination, overestimation of threat, intolerance of uncertainty, behavioral inhibition, anxiety, OCD symptoms, and thought suppression. A summary of the relationships between anxiety-related individual differences and spatial features is shown in Appendix G.
ERN
The rumination model was more stable when using the nonlinear estimator than the linear one. An important consequence of this is that rumination-ERN models might yield different results when the rumination statistics of populations differ. This fact is worth considering in future studies. In the linear ERN-rumination model, the first, second, and fourth components were positively associated with rumination scores (see Table 2). These results indicate that increased rumination is associated with increased ERN magnitude (the first component) but also with the specific spatial pattern of the brain signal: frontal and fronto-central distribution (pattern of the second component; Figure 4) and right lateralization (the fourth component). The stable relationship found between rumination and ERN is not surprising. According to Smith and Alloy (2009), rumination is strongly connected with avoidance-coping strategies, such as experiential avoidance, which usually co-occur with increased ERN amplitudes; this is explained by the avoidance of negative situations and, hence, abnormal error control (Zambrano-Vazquez et al., 2019; Hajcak & Foti, 2008). On the other hand, our findings contrast with the attenuated ERN amplitudes associated with high rumination revealed by Tanovic and colleagues (2017) and Whitmer and Gotlib's (2012) attentional scope model of rumination. Whitmer and Gotlib link rumination with distraction and constant diminished performance monitoring. Nonetheless, rumination has many facets related to both distraction and error hypersensitivity. Contradictory results can be explained by differences in samples' characteristics or differences in the rumination construct measured by two different questionnaires. In this work, the lack of an association between ERN and thought suppression led us to the interpretation that ERN might be a marker not of compensation for distraction but of internal threat. 
The link between ERN and rumination requires further research that takes into account links to anxiety, worry, and distraction.
Overestimation of threat was better predicted with the nonlinear estimator within the internal set, although the linear estimator also yielded satisfactory results. The zeroed coefficient of the first component indicates that overestimation of threat is not associated with ERN magnitude, that is, error processing intensity. Instead, it is associated with a specific spatial pattern of brain signal: Individuals with a low level of overestimation of threat should be characterized by a brain signal after error commission that is lateralized and distributed over a very small area (pattern of the third component; Figure 4). In light of our results, threat overestimation does not directly increase ERN amplitude, which explains its lack of association with ERN in classic analyses of mean amplitude over a temporal window. However, threat overestimation changes the spatial characteristics of the brain signal after error commission, making it more spatially distributed; thus, the current findings are an important and novel extension of existing works (Chong & Meyer, 2019; Barke et al., 2017; Jackson, Nelson, & Hajcak, 2016; Weinberg et al., 2016).
Intolerance of uncertainty is considered to be an anxiety dimension that may be a transdiagnostic phenotype linked with ERN (Cavanagh & Shackman, 2015). Our results confirm the IU–ERN association, but only for the inhibitory dimension of IU, as elevated inhibitory IU was slightly associated with reduced ERN amplitude. The coefficients of the linear model indicate that higher inhibitory IU is characterized by a brain signal after error commission that is less spatially spread and of smaller magnitude, that is, lower error-monitoring intensity (see Table 2). Inhibitory IU is frequently linked to social anxiety and panic disorders, which are commonly characterized by maladaptive self-monitoring, avoidance, and inhibited behavior (McEvoy & Mahoney, 2012; Clark & Wells, 1995). Moreover, individuals with a high level of inhibitory IU show a reduced physiological response to both reward (Nelson, Shankman, & Proudfit, 2014) and threat (Nelson & Shankman, 2011). Thus, the negative association between inhibitory IU and ERN might be related to an individual's attempt to escape from experiencing and evaluating errors, that is, avoidance of uncertain threats. A similar pattern of results was reported by Jackson and colleagues (2016).
We did not confirm a robust association between ERN and commonly used questionnaires that assess anxiety, such as the DASS-21 (Anxiety subscale) and the STAI-T. Although ERN predicted a significant increase in STAI-T scores on the internal data set, this association failed external validation. Such results might imply that STAI-T scores aggregate various anxiety-related features. Furthermore, ERN did not predict any of the OCD symptoms or the overall OCI-R score. This pattern may be surprising, but it was also shown in a recent analysis by Seow and colleagues (2020), who did not find a significant association between ERN amplitude and OCD symptom severity. One possible explanation mentioned by Seow and colleagues is that this association might depend on a greater level of symptom severity and is thus difficult to capture in a nonclinical population. Our results also did not confirm the findings of Weinberg and colleagues (2016), in which checking behavior was the specific dimension associated with increased ERN amplitude. Our results shed new light on a possible reason for the high variability of OCD-ERN results: At least in the nonclinical population considered in this work, the pattern of association between OCD symptoms and ERN strongly depended on the subset of data, and contradictory and nonsignificant ERN-OCD results may reflect this sample dependence. Interestingly, ERN consistently failed to predict thought suppression on both the internal and the external data sets, suggesting the lack of a relationship between error monitoring and thought suppression.
To test the possibility that our results were driven by one or more latent factors common to the identified symptoms, we conducted a CCA, the results of which suggest the existence of two underlying latent variables that drive the relationships between the PCA components and the dimensions of anxiety associated with the ERN. This suggests two likely distinct characteristics of error-related brain patterns associated with the anxiety-related dimensions. The CCA revealed a strong relationship between rumination and the first and second PCA components, indicating that both PCA components may represent the ERN magnitude. It also revealed a close relationship between inhibitory IU and the fourth PCA component, which is a marker of lateralization. The CCA results suggest that rumination, inhibitory IU, and threat overestimation are differently associated with error-related brain activity and that these associations can be explained by two underlying latent factors.
Pe
Both the linear and the nonlinear estimators yielded similar results. In the linear Pe-rumination model, the first and third components were positively associated with rumination scores (see Table 3). These results indicate that increased rumination is associated with increased Pe magnitude (the first component) and thus error-monitoring intensity; it is also associated with a markedly fronto-central spatial brain signal pattern (the third component; Figure 4). The relationship between the fronto-central distribution of Pe and rumination suggests that early Pe, but not late Pe, is sensitive to rumination. Functionally, Pe is usually linked to conscious perception and to both affective and motivational aspects of error monitoring (Hughes & Yeung, 2011; Ridderinkhof et al., 2009; Nieuwenhuis, Ridderinkhof, Blom, Band, & Kok, 2001; Falkenstein et al., 2000). Because rumination is characterized by repetitive thinking about negative emotional content, it is plausible that larger Pe amplitudes could be associated with higher rumination levels, although the literature offers no direct evidence for this association. Ruminative individuals, being more concerned about their mistakes, may react to errors more strongly. To our knowledge, this is the first study to confirm a strong link between Pe and rumination.
In the BIS-Pe models, both the linear and the nonlinear estimators yielded similar results on the internal and external data sets. The coefficients of the linear models indicate that increased BIS is associated with both increased Pe magnitude (and thus error-monitoring intensity: the first component) and a specific spatial pattern: Individuals high in BIS are expected to have very central Pe (pattern of the third component combined with the antipattern of the second component; Figure 4). Our results indicate that emotional sensitivity, which is expressed by behavioral inhibition, strongly influences one's conscious processing of errors, and error awareness measured via Pe might be a good biomarker of behavioral inhibition.
Pe was significantly associated with prospective IU. Specifically, increased prospective IU was associated with increased Pe magnitude and with a more centrally focused spatial distribution of Pe. Again, such a pattern of results indicates that the central location of Pe is a very important feature of the brain's response to error commission; indeed, it is even more important than the intensity of the error-monitoring process. Prospective IU refers to the desire for predictability and is directly related to anxiety and motivational–emotional processes. It is possible that the positive relationship between prospective IU and Pe reflects increased motivational sensitivity and the desire to act, that is, a conscious correction of behavior after making an error. It should be noted that Pe was not associated with self-esteem in either the classic ERP analysis or the presented PCA-based analysis. Such results suggest that the specific emotional sensitivity of which Pe might be a biomarker is a phenomenon distinct from self-esteem and that self-esteem does not contribute to the subjective value of feedback and the desire to correct behavior. Furthermore, self-esteem was moderately related to the anxiety dimensions (correlations of .2 to .6), so it is probable that the variance not shared with self-esteem drove the obtained effects.
We also tested the possibility that our results on Pe were driven by one or more latent factors common to the identified symptoms. Indeed, the results of the CCA suggest the existence of a single underlying latent factor driving the relationships between rumination, behavioral inhibition, and prospective IU and the PCA components; thus, both the anxiety-related symptoms and the Pe-based PCA components can be reduced to one dimension. The spatial features of Pe do not significantly differentiate the Pe-anxiety relationships; however, rumination was most strongly associated with the third PCA component, which represents a very central brain signal distribution.
Overall, such a pattern of results confirms that Pe may reflect, or at least be strongly connected with, emotional sensitivity. In line with a recent study suggesting the utility of Pe as a biomarker of psychotherapy outcomes in social anxiety disorder and MDD (Kinney, 2021), our findings call for in-depth research into the functional significance of Pe in trajectories of psychopathology development.
Replicability Problem
The third aim of the study was to investigate how the obtained results depend on the studied population. By employing a cross-validation strategy, we were able to check the consistency and generalizability of the results, that is, the consistency of the associations across subsets of the studied population. Most ERN/Pe-phenomenon relationships are characterized by broad standard deviations of the cross-validated results, that is, extremely diverse results that depend on the subset of data fitted to the model, for both the ML-based and the ERP-based models. The importance of overfitting to the population is further emphasized by the results from the external validation data set: Nearly half of the studied associations had opposite directions when the model was evaluated on the internal and external data sets. Large standard deviations imply high variability of the ERN/Pe-phenomenon associations in the population. Such a pattern of results strongly suggests that most of the studied associations cannot be described by a single function that assumes a consistent direction of the effect in the population, with differences only in its intensity; this contrasts with the usual assumption of researchers conducting regression analyses or between-groups studies. To show the level of overfitting and the probability of false positives in existing studies, we conducted an additional analysis based on classic ERP waveforms. When the ERN-phenomenon associations were estimated on the training data set, most dropped in strength by at least a factor of two after cross-validation; when tested on the testing data set, none of the findings remained significant for either ERN or Pe. This indicates that when estimating effects in different populations, there is a high chance of finding results that contradict those of previous studies.
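The cross-validation and external-validation logic can be illustrated with a minimal sketch on synthetic data. The Ridge estimator, the 80/20 split, and the weak simulated association are illustrative assumptions, not the study's exact models; the point is that the fold-to-fold spread of scores and the drop on the hold-out set quantify subset dependence.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(1)

# Hypothetical data: spatial EEG features predicting a questionnaire score.
X = rng.normal(size=(171, 4))
y = X[:, 0] * 0.3 + rng.normal(size=171)   # weak, noisy association

# Internal (training) vs. external (hold-out) split, mirroring the design.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

model = Ridge(alpha=1.0)

# Cross-validated R^2 on the internal set: a large standard deviation across
# folds indicates that the estimate depends strongly on the data subset.
cv_scores = cross_val_score(model, X_train, y_train, cv=10, scoring="r2")
print(cv_scores.mean(), cv_scores.std())

# External validation: fit on the full internal set, score on the hold-out.
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```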
Our results might serve as an example of the difficulties faced by research on individual differences in error monitoring, where contradictory results can be obtained even on subsamples from a population performing the same experiment. Our results suggest that even the most powerful statistical approaches may not be able to improve the replicability of the results alone; therefore, the replicability problem must be carefully addressed at the statistical, experimental design, and theoretical levels.
Limitations
This work has certain limitations. The most important caveat of the presented study is that participants with any mental disorder were excluded; therefore, the variability of the analyzed anxiety symptoms is limited, especially for highly clinical ones such as OCD symptoms. Our data set certainly did not cover the entire spectrum of severity of anxiety symptoms; this inherently limits the clinical conclusions in two ways: (1) It is possible that the pattern of associations between error monitoring and anxiety symptoms changes in a population of people with clinical anxiety and that this pattern cannot be described in clinical and nonclinical groups using the same function; (2) our analysis may not have captured some important clinical relationships because of the low overall levels of some anxiety symptoms in the healthy volunteer population. To be clinically relevant, the study needs to be replicated in a clinical population. Moreover, our data set contained three times more female than male participants; this may have influenced the results, as ERN is known to be gender-sensitive. Some other issues also limit the conclusions of the presented study. There may be a single latent variable that drives the discovered effects for ERN and Pe. This possibility was preliminarily addressed by the CCA, but it should be subject to critical investigation in a new analysis. The internal consistency of measurements was poor for many of the participants. Nonetheless, because ML models require as much data as possible, we decided not to limit the amount of data for the sake of internal consistency; instead, we have reported it and leave it to other researchers to judge the reliability of the developed models. Finally, our hold-out data set has a very small sample size, which contributes to the low power of external testing; when the size of an external data set is small, the results of external testing should always be interpreted carefully.
Although significant results on an external data set likely confirm a relationship that generalizes to a group with subtly different characteristics, the absence of significant results does not mean that no such relationship exists. It would surely be worthwhile to replicate the present results using bigger samples. Finally, the conducted analysis was purely exploratory and data-driven; therefore, it has inherent disadvantages. This kind of ML approach is not only prone to biases for various reasons (the origin of the data, the choice of models, numerical inaccuracy, or the complexity of the features), but it also usually suffers from the consequences of black-box modeling assumptions. Therefore, some researchers argue (Bennett, Silverstein, & Niv, 2019; Huys, Maia, & Frank, 2016) that ML approaches might not be significantly useful in advancing knowledge in various fields of application. In most cases, exploratory ML provides only a black-box model that does not challenge existing theories, as it does not assume the existence of a true model to be found and validated. Therefore, to advance the theory base of ERN-related studies, the results we present should be transformed into new hypotheses to be verified in classic confirmatory studies.
Conclusions
In summary, the current study examined the associations between error-related brain activity and 15 different phenomena related to anxiety: rumination; overestimation of threat; prospective and inhibitory intolerance of uncertainty; behavioral inhibition; mixed anxiety measured with the STAI-T; mixed anxiety measured with the DASS-21; OCD checking, ordering, obsessing, washing, and neutralizing symptoms; general OCD symptoms; thought suppression; and, additionally, self-esteem. Specifically, increased ERN was associated with rumination and overestimation of threat, whereas attenuated ERN was associated with inhibitory IU. Pe was associated with behavioral inhibition, rumination, and prospective IU. These findings demonstrate the heterogeneity of the anxiety phenomenon and support the view that ERN is an index of sensitivity to uncertain threats. The revealed diverse patterns of ERN–anxiety associations call for an in-depth dimensional study. Furthermore, the obtained results show the benefits of using spatial filtering to extract spatial brain signal features. Our study showed that not only error-monitoring intensity but also other spatial dynamics contribute to individual differences in error monitoring. Specifically, our findings suggest that some of the variability in error-related brain activity might be explained by individual differences in lateralization, especially for behavioral inhibition, thought suppression, overestimation of threat, rumination, and self-esteem. This finding emphasizes that ERP ERN amplitude reflects various cognitive mechanisms and includes more than just the variability related to error-processing intensity. Finally, the conducted analyses demonstrate the vulnerability of statistical models to overfitting and support the use of the ML paradigm to increase the reliability and generalizability of brain–behavior models.
The data set of EEG recordings and questionnaire data, along with additional online resources, is available at https://osf.io/2bkzj/. The code for reproducing the analyses from the presented study, as well as the detailed results, is available at https://github.com/abelowska/erroneurous under the MIT License.
APPENDIX A. GO/NO-GO TASK
The whole experiment included (1) a practice block of 15 trials, (2) four experimental blocks of 84 trials each, and (3) two calibration blocks of 14 trials, during which a strict RT cutoff for fast (positive feedback) and slow (negative feedback) go responses was calculated. Calibration blocks preceded the first and third experimental blocks. Each trial in the experimental block began with the presentation of a central fixation cross for 500 msec, followed by a cue consisting of a black square or diamond, displayed for a variable time interval (between 1000 and 2000 msec) in the center of a gray screen. Then, the color of the cue changed to either green or orange, whereas its in-plane orientation either remained identical (diamond–diamond or square–square sequence) or changed (diamond–square or square–diamond sequence). This new green or orange geometric figure acted as a target (go/no-go) and was presented on the screen for 1000 msec or until a button press. If the figure kept its original shape and turned green (two thirds of the trials, all corresponding to go trials), participants had to press a predefined button on the keyboard as quickly as possible with their right thumb. In contrast, if the figure changed shape (one sixth of the trials) or turned orange (one sixth of the trials, all corresponding to no-go trials), participants were instructed to withhold their response. Note that the association between the target type (go/no-go) and the color (green/orange) was counterbalanced between participants. On each go trial, the RT was compared against an arbitrary cutoff that was calculated either during the first calibration block (and used in the first and second experimental blocks) or during the second calibration block (and used in the third and fourth experimental blocks).
During the first two experimental blocks, participants had to be 10% faster than the mean RT calculated during the first calibration block; during the third and fourth experimental blocks, participants had to be 10% or 20% faster, respectively, than the mean RT calculated during the second calibration block. If the RT was above these limits (slow go trials), then negative feedback (a sad face) was provided 1000 msec after target onset. In turn, if the RT was below these limits (fast go trials), positive feedback (a smiling face) was presented. The RT cutoff was determined for each participant separately, without their knowledge, and was adjusted during the experimental session to overcome the interindividual variability in the speed of motor responses, as well as to deal with the effects of time and learning. No feedback was provided after inhibitory errors in no-go trials to enhance internal monitoring in these cases. Similarly, no feedback was given during calibration blocks. In the intertrial interval, a blank screen was presented for 1000 msec. Trial presentation was randomized within blocks. Both speed and accuracy were equally emphasized. The instructions and the entire experiment were in the participants' native language. The code for the task is available at https://github.com/ociepkam/Go_No-Go.
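The adaptive cutoff rule described above can be sketched as follows. The helpers `rt_cutoff` and `feedback` are hypothetical illustrations, not the published task code (which is in the linked repository).

```python
def rt_cutoff(calibration_rts, speedup=0.10):
    """Cutoff = mean calibration RT reduced by the required speedup (10% or 20%)."""
    mean_rt = sum(calibration_rts) / len(calibration_rts)
    return mean_rt * (1.0 - speedup)

def feedback(rt, cutoff):
    """Positive feedback for fast go responses, negative for slow ones."""
    return "smiling face" if rt < cutoff else "sad face"

# Example: calibration RTs averaging 500 msec yield a 10% cutoff of 450 msec.
cutoff = rt_cutoff([480, 500, 520], speedup=0.10)
print(cutoff)                  # 450.0
print(feedback(430, cutoff))   # smiling face
print(feedback(470, cutoff))   # sad face
```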
APPENDIX B. QUESTIONNAIRE SCORES EXTRACTION
For all scales and subscales selected for the analysis, we computed total scores in line with the instructions provided by the authors of the questionnaires, including reverse scoring when appropriate. The total scores were then divided by the number of items in the given scale/subscale, forming mean scores. All means, standard deviations, and Cronbach's α reported in Table 1 were calculated on the mean scores. For the regression and CCA analyses, the mean scores were centered and scaled by dividing nonconstant features by their standard deviation.
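A minimal sketch of this scoring pipeline, assuming a hypothetical 1–5 Likert scale and illustrative responses (the helper and data are not from the actual questionnaires):

```python
import numpy as np

def mean_score(item_responses, reverse_items=(), scale_max=5):
    """Total score (with reverse scoring applied) divided by the number of items."""
    items = np.asarray(item_responses, dtype=float)
    for i in reverse_items:
        items[i] = scale_max + 1 - items[i]   # reverse-keyed item
    return items.sum() / len(items)

# Mean scores for three hypothetical participants, then centering and scaling
# across participants, as done before the regression and CCA analyses.
scores = np.array([mean_score(r) for r in [[1, 4, 2], [3, 5, 2], [2, 2, 4]]])
standardized = (scores - scores.mean()) / scores.std()
print(standardized)
```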
APPENDIX C. INTERNAL CONSISTENCY OF PRELIMINARY TESTED ROIS
Estimates of the variability (σ) and subject-level dependability coefficient (φθjk) of PCA-based ERN scores (A) and Pe scores (B) for different ROIs across participants within the training set. ROIs selected for the main analyses are marked with frames. ROI 1 = Fpz, AFz, F1, Fz, F2, FC1, FCz, FC2, C1, Cz, C2, CP1, CPz, CP2, P1, Pz, P2; ROI 2 = Fpz, AFz, Fz, FCz, Cz, CPz, P1, Pz, P2; ROI 3 = Fpz, AFz, Fz, FCz, C1, Cz, C2, CPz, P1, Pz, P2; ROI 4 = Fpz, AFz, F1, Fz, F2, FCz, C1, Cz, C2, CPz, P1, Pz, P2; ROI 5 = F1, Fz, F2, FC1, FCz, FC2, C1, Cz, C2; ROI 6 = Fpz, AFz, F1, Fz, F2, FCz, Cz, CP1, CPz, CP2, Pz; ROI 7 = FC1, FCz, FC2, C1, Cz, C2, CP1, CPz, CP2.
APPENDIX D. PCA
Blind decomposition methods such as PCA extract spatial signal features by reducing the spatial dimensionality of the signal without any significant loss of information, thereby reducing the likelihood of overfitting (Schurger, Marti, & Dehaene, 2013). The aim of PCA is to reduce the dimensionality of n-dimensional data to a k-dimensional subspace (where k < n), which preserves most of the information. Thus, we can describe PCA as a projection of a data set into a different feature subspace. PCA is usually described in terms of eigenvectors and eigenvalues. Eigenvectors determine the direction of the new feature subspace, while eigenvalues determine the magnitude of this subspace. Eigenvectors are vectors of the weights assigned to features of the data from the original space. Eigenvalues might be understood as variance explained in the direction of the eigenvector, that is, the variance explained by a given component.
In this work, the aim of PCA in the spatial dimension was to extract a set of scalp distributions (spatial components) that account for the spatial variance in the data. We can imagine each element of this set as a single “virtual channel”, and we can refer to the temporal signal waveforms on these “virtual channels” as “virtual ERPs” (similarly to Spencer, Dien, & Donchin, 2001). Note that in our work these “virtual ERPs” do not correspond to real or latent ERP components because the aim of PCA is to extract components that maximize variance in data. Application of spatial filters makes it possible to analyze spatial features of the brain signal that are usually discarded—and are thus rarely analyzed on their own—when selecting a single electrode or averaging the signal within the ROI. It stands to reason that the spatial distribution of the brain signal may encode information that is relevant to individual differences. Thus, PCA may give new insights into the spatial patterns of brain activity that are associated with specific events and individual differences.
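A minimal sketch of spatial PCA in this sense, on synthetic data (the observations × channels layout, channel count, and four-component choice are assumptions for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)

# Hypothetical error-locked EEG: time points stacked across participants as
# observations, one column per channel in the ROI.
n_obs, n_channels = 1000, 9
eeg = rng.normal(size=(n_obs, n_channels))

# Spatial PCA: each component is a weighting over channels, so the projected
# signal can be read as the waveform on a "virtual channel".
pca = PCA(n_components=4)
virtual_channels = pca.fit_transform(eeg)   # signal on 4 "virtual channels"

print(pca.components_.shape)          # (4, 9): one weight vector per component
print(pca.explained_variance_ratio_)  # variance captured by each component
```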
Interpreting the signal after spatial decomposition might be challenging. Thus, to unify and ensure clarity in the vocabulary used, we decided to use the following terms:
- (1)
component weights, which denotes the weights vector (eigenvector), that is, the coefficients of a particular PCA component. In the ML community, these are sometimes called filters (e.g., Haufe et al., 2014). Component weights should not be confused with component loadings, which are eigenvectors multiplied by the square root of the eigenvalues, that is, weights scaled by the amount of explained variance. Loadings are typically used instead of weights when PCA is used as a factor analysis technique rather than as a spatial filter.
- (2)
component scores: the element-wise product of the EEG signal and the eigenvector, that is, the EEG signal on each channel, weighted by the eigenvector. Component scores can be viewed as the spatial pattern of a given component. Component scores, unlike component weights, are neurophysiologically interpretable and are expressed in μV.
- (3)
spatial feature: the dot product of the EEG signal and the eigenvector. The spatial feature denotes a new feature created by projecting original features (in this context, EEG channels) into lower-dimensional space. The spatial feature is simply a PCA component. This new feature corresponds to the aforementioned “virtual channel”. Thus, when referring to the spatial feature of a signal or component, we are referring to the signal on a “virtual channel”. For further information on the interpretation of spatial filters in neuroimaging, see (Haufe et al., 2014).
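The three terms can be made concrete in a short sketch on synthetic data. Note that scikit-learn's PCA centers the data internally, so this illustrates only the algebra of the definitions above, not the library's exact `transform` output.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
eeg = rng.normal(size=(500, 9))      # hypothetical: observations x channels

pca = PCA(n_components=4).fit(eeg)
one_epoch = eeg[0]                   # signal on 9 channels at one time point

w = pca.components_[0]               # (1) component weights: the eigenvector
component_scores = one_epoch * w     # (2) element-wise product: channel-wise pattern (in uV)
spatial_feature = one_epoch @ w      # (3) dot product: the "virtual channel" value

# The spatial feature is simply the sum of the component scores.
assert np.isclose(spatial_feature, component_scores.sum())
print(spatial_feature)
```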
Regarding the choice of the optimal number of components, it should be noted that there is no general guideline as to what number of PCA components is optimal; it mostly depends on the data set. Usually, the choice of the maximum number of PCA components is subject to a trade-off between interpretability, overfitting, and the amount of preserved variance. The basic assumption of PCA in ERP research is that the first PCA component contains the most variance directly related to the ERP amplitude. However, subsequent statistical sources of variance in the EEG signal (subsequent PCA components) could also carry important information; therefore, it is usually profitable to include more than one component in the analysis. However, the less variance a component explains, the more difficult it is to interpret, as the lower-order components mainly explain variance whose most likely source is noise of various origins. Furthermore, increasing the number of PCA components increases model complexity; complex models are strongly susceptible to overfitting and thus to poor performance on test data.
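One common way to operationalize this trade-off is to pick the smallest number of components that preserves a given fraction of the variance; a brief sketch, where the 90% threshold and synthetic data are arbitrary illustrative choices:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
eeg = rng.normal(size=(1000, 17))    # hypothetical: observations x channels

pca = PCA().fit(eeg)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest k preserving 90% of the spatial variance; in practice the threshold
# trades interpretability and overfitting against preserved variance.
k = int(np.searchsorted(cumulative, 0.90)) + 1
print(k, cumulative[k - 1])
```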
APPENDIX E. DISTRIBUTIONS OF ANXIETY-RELATED QUESTIONNAIRE SCORES IN TESTING AND TRAINING DATA SETS
Distributions of questionnaire scores in the training (blue) and testing (yellow) data sets. The questionnaire scores are normalized in such a way that 0 corresponds to the smallest and 1 to the largest possible value that can be obtained on a given scale. DASS-21 = Depression Anxiety Stress Scale-21, Anxiety subscale; STAI-T = State–Trait Anxiety Inventory, Trait subscale; RRQ = Rumination-Reflection Questionnaire, Rumination subscale; OT = Obsessive Beliefs Questionnaire-20, Overestimation of Threat subscale; IUS-P = Intolerance of Uncertainty Scale-12, Prospective subscale; IUS-I = Intolerance of Uncertainty Scale-12, Inhibitory subscale; OCI-R = Obsessive–Compulsive Inventory–Revised full score; Checking = OCI-R, Checking subscale; Hoarding = OCI-R, Hoarding subscale; Obsessing = OCI-R, Obsessing subscale; Ordering = OCI-R, Ordering subscale; Neutralizing = OCI-R, Neutralizing subscale; Washing = OCI-R, Washing subscale.
APPENDIX F. DESCRIPTIVE STATISTICS OF THE ERN AND PE SCORES IN THE TRAINING AND TESTING DATA SETS
Descriptive Statistics (in μV) of the ERN and Error-positivity (Pe) Scores for Each of the PCA Components in the Training and Testing Data Sets
| Component | Set | ERN ROI 1: M (SD) | p | ERN ROI 2: M (SD) | p | Pe ROI 3: M (SD) | p | Pe ROI 4: M (SD) | p |
|---|---|---|---|---|---|---|---|---|---|
| PCA1 | Train | 19.0 (10.6) | .770 | 24.3 (13.6) | .663 | 44.3 (19.1) | .386 | 56.3 (23.6) | .366 |
|  | Test | 18.4 (9.6) |  | 23.3 (12.7) |  | 41.1 (13.8) |  | 52.3 (17.2) |  |
| PCA2 | Train | 11.8 (6.9) | .020 | 13.0 (7.7) | .031 | 11.3 (5.5) | .152 | 14.0 (6.8) | .100 |
|  | Test | 8.9 (4.3) |  | 9.9 (5.0) |  | 9.7 (5.0) |  | 11.9 (5.7) |  |
| PCA3 | Train | 6.1 (3.1) | .881 | 6.1 (3.3) | .946 | 6.6 (3.3) | .715 | 6.8 (3.5) | .718 |
|  | Test | 6.0 (2.8) |  | 6.1 (2.8) |  | 6.3 (2.9) |  | 6.6 (3.0) |  |
| PCA4 | Train | 2.1 (1.2) | .283 | 2.5 (1.4) | .119 | 2.7 (1.6) | .172 | 3.1 (1.5) | .253 |
|  | Test | 1.9 (1.0) |  | 2.1 (1.2) |  | 2.3 (1.1) |  | 2.7 (1.6) |  |
M = mean; SD = standard deviation; p value = result of the statistical test for the difference in scores between the training and testing data sets, performed with permutation tests.
Table F1—Source Data 1. https://bit.ly/3IYD29E.
Table F1—Source Code 1. https://bit.ly/3XGGLNo.
APPENDIX G. THE RELATIONSHIPS BETWEEN ANXIETY-RELATED INDIVIDUAL DIFFERENCES AND PCA SPATIAL FEATURES
The Relationships between Anxiety-related Individual Differences and Particular Spatial Features of the Brain Signal
| Scale | ERN: Mag | ERN: F + FC | ERN: FC | ERN: Lat | Pe: Mag | Pe: F + FC | Pe: FC | Pe: Lat |
|---|---|---|---|---|---|---|---|---|
| DASS-21 Anx | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| STAI-T | + | + | 0 | + | + | − | + | + |
| BIS | + | + | − | + | + | − | + | 0 |
| RRQ | + | + | 0 | + | + | 0 | + | 0 |
| WBSI | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| OT | 0 | 0 | − | + | 0 | 0 | 0 | 0 |
| IUS-P | 0 | − | 0 | + | + | − | + | + |
| IUS-I | − | − | + | 0 | 0 | 0 | 0 | 0 |
| OCI-R | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Checking | + | − | + | 0 | 0 | 0 | 0 | 0 |
| Hoarding | 0 | 0 | 0 | 0 | 0 | + | 0 | 0 |
| Obsessing | + | + | + | 0 | 0 | 0 | 0 | 0 |
| Ordering | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Neutralizing | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Washing | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| SES | − | − | + | − | 0 | 0 | 0 | 0 |
The relationships shown are those found to be significant at least at the cross-validation level. Mag = first PCA component (magnitude of signal); F + FC = second PCA component (frontal and fronto-central brain signal distribution); FC = third PCA component (fronto-central brain signal distribution only); Lat = fourth PCA component (lateralized brain signal). Association types: + = positive association; − = negative association; 0 = no association.
Acknowledgments
The authors gratefully acknowledge the reviewers for their constructive comments. They also thank Michał Ociepka for his help with task programming, the students and collaborators from Jagiellonian University who assisted with data recording, and the participants who volunteered to take part in the study.
Corresponding authors: Anna Grabowska, Doctoral School in the Social Sciences, Jagiellonian University, Main Square 34, 30–010 Krakow, Poland, or via e-mail: [email protected]; or Magdalena Senderecka, Institute of Philosophy, Jagiellonian University, Grodzka 52, 31–044 Krakow, Poland, or via e-mail: [email protected].
Author Contributions
The study concept and methodology were developed by M. Senderecka and A. Grabowska. Data analysis was completed by A. Grabowska and F. Sondej. A. Grabowska was also responsible for data curation and visualization. A. Grabowska wrote the initial draft of the manuscript, and M. Senderecka provided critical revisions. F. Sondej provided feedback and editing. Supervision and funding acquisition were provided by M. Senderecka. All authors approved the final manuscript for submission.
Funding Information
This work was supported by Sonata Bis, grant number: 2020/38/E/HS6/00490 from the National Science Centre of Poland, granted to M. S. Proofreading of this publication was supported by a grant from the Faculty Research Support Module under the Strategic Programme Excellence Initiative at Jagiellonian University, granted to A. G. The article processing charges were funded by a grant from the Strategic Program Excellence Initiative at the Jagiellonian University awarded to M. S. A CC-BY public copyright license has been applied by the authors to the present document in accordance with the grant's open access conditions. The funding source had no impact on any part of the present study.
Diversity in Citation Practices
Retrospective analysis of the citations in every article published in this journal from 2010 to 2021 reveals a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .407, W(oman)/M = .32, M/W = .115, and W/W = .159, the comparable proportions for the articles that these authorship teams cited were M/M = .549, W/M = .257, M/W = .109, and W/W = .085 (Postle and Fulvio, JoCN, 34:1, pp. 1–3). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance. The authors of this article report its proportions of citations by gender category to be: M/M = .595; W/M = .214; M/W = .095; W/W = .095.