Sustained attention is a cognitive ability to maintain task focus over extended periods of time (Mackworth, 1948; Chun, Golomb, & Turk-Browne, 2011). In this study, scalp electroencephalography (EEG) signals were processed in real time using a 32 dry-electrode system during a sustained visual attention task. An attention training paradigm was implemented, as designed in DeBettencourt, Cohen, Lee, Norman, and Turk-Browne (2015) in which the composition of a sequence of blended images is updated based on the participant's decoded attentional level to a primed image category. It was hypothesized that a single neurofeedback training session would improve sustained attention abilities. Twenty-two participants were trained on a single neurofeedback session with behavioral pretraining and posttraining sessions within three consecutive days. Half of the participants functioned as controls in a double-blinded design and received sham neurofeedback.
During the neurofeedback session, attentional states to primed categories were decoded in real time and used to provide a continuous feedback signal customized to each participant in a closed-loop approach. We report a mean classifier decoding error rate of 34.3% (chance level: 50%). Within the neurofeedback group, there was a greater level of task-relevant attentional information decoded in the participant's brain before making a correct behavioral response than before an incorrect response. This effect was not visible in the control group (interaction effect), which strongly indicates that we were able to achieve a meaningful measure of subjective attentional state in real time and control participants' behavior during the neurofeedback session. We do not provide conclusive evidence as to whether the single neurofeedback session per se provided lasting effects in sustained attention abilities.
We developed a portable EEG neurofeedback system capable of decoding attentional states and predicting behavioral choices in the attention task at hand. The neurofeedback code framework is Python based and open source, and it allows users to actively engage in the development of neurofeedback tools for scientific and translational use.
Neurofeedback uses real-time modulation of brain activity to regulate or enhance brain function and behavioral performance (for reviews, see Gruzelier, 2014b; Ordikhani-Seyedlar, Lebedev, & Sorensen, 2016; Jiang, Abiri, Zhao, & Ros, 2017; Sitaram et al., 2017). A closed-loop feedback approach refers to a continuous monitoring of brain activity that forms the basis for a signal that is sent back to the user of the system in real time (for reviews, see Zrenner, Belardinelli, Müller-Dahlhaus, & Ziemann, 2016; Sitaram et al., 2017). Hence, closed-loop neurofeedback allows for interference with brain activity in real time and thereby enables investigations of how brain network dynamics contribute to task performance in a causal manner, making it a powerful neuroscientific tool (Kangassalo, Spapé, & Ruotsalo, 2020; for reviews, see Bagdasaryan & Le Van Quyen, 2013; Stoeckel et al., 2014). Moreover, based on recent methodological and technical advances in brain-computer interfaces (BCI), there is increasing interest in using neurofeedback for cognitive training, especially if it is employed in accessible and user-friendly modalities (Narayana, Prasad, & Warmerdam, 2019; Jafri et al., 2019; for recent reviews, see Baek, Chang, Heo, & Park, 2019; Kosmyna & Lécuyer, 2019; Lupu, Ungureanu, & Cimpanu, 2019).
Neurofeedback for cognitive training relies on self-regulation to modulate specific neurophysiological patterns. The concept was first established in the late 1950s with experiments showing that humans were able to self-control electroencephalography (EEG) signals in real time (Kamiya, 2011). Learning effects, quantified as changes in neural dynamics in response to neurofeedback, have since been demonstrated in rodent models (Clancy, Koralek, Costa, Feldman, & Carmena, 2014; Ishikawa, Matsumoto, Sakaguchi, Matsuki, & Ikegaya, 2014; Sakurai & Takahashi, 2013). The specificity of neural training has been demonstrated using single-neuron activity in both nonhuman primates (Fetz, 1969; Musallam, Corneil, Greger, Scherberger, & Andersen, 2004; Velliste, Perel, Spalding, Whitford, & Schwartz, 2008; Schafer & Moore, 2011; Hwang, Bailey, & Andersen, 2013) and humans (Guenther et al., 2009; Cerf et al., 2010; for review, see Fetz, 2007), thus providing evidence of the applicability of neurofeedback for cognitive training.
The most frequently used brain imaging modalities for neurofeedback are EEG and functional magnetic resonance imaging (fMRI), which in combination with a feedback signal are used to facilitate self-regulation of the putative brain regions that cause a specific behavior or pathology. Neurofeedback based on fMRI enables participants to regulate their brain activity with high spatial resolution and provides the experimenter with information on the brain regions implicated in the task at hand. EEG represents a low-cost, robust, and potentially mobile alternative measurement modality. The high temporal resolution of EEG can be advantageous for real-time neurofeedback applications.
Most published studies demonstrating the efficacy of neurofeedback in clinical settings are EEG based. Particularly, neurofeedback for treatment of attention deficit hyperactivity disorder in children has been successful (for reviews, see Arns, De Ridder, Strehl, Breteler, & Coenen, 2009; Lofthouse, Arnold, Hersch, Hurt, & DeBeus, 2012; Holtmann, Sonuga-Barke, Cortese, & Brandeis, 2014; Micoulaud-Franchi et al., 2014; Van Doren et al., 2019). EEG-based neurofeedback has been demonstrated to have an effect on cognitive abilities after traumatic brain injury (for reviews, see May, Benson, Balon, & Boutros, 2013; Gray, 2017). Additionally, neurofeedback and BCIs have been shown to improve physical performance and neurological rehabilitation of movement disorders (for reviews, see Daly & Wolpaw, 2008; Machado, Almada, & Annavarapu, 2013; Broccard et al., 2014; Chaudhary, Birbaumer, & Ramos-Murguialday, 2016). However, it is important to note that meta-studies and reviews highlight the need for standardization of neurofeedback protocols, homogeneity of evaluation metrics, and double-blinded study designs to yield further clinical applications in the near future.
Regarding the use of neurofeedback in healthy individuals, also denoted as the “optimal” or “peak performance” field (Gruzelier, 2014b), there is evidence in support of cognitive performance enhancement (for a three-part review, see Gruzelier, 2014a, 2014b, 2014c). Another line of research that is still relatively unexplored and holds great potential is the use of neurofeedback for mitigating cognitive impairment in the aging population (for review, see Jiang, Abiri, Zhao, & Ros, 2017).
While neurofeedback using fMRI has been validated in scientific settings to regulate activity in target brain regions with high spatial resolution (DeCharms et al., 2004; Bray, Shimojo, & O'Doherty, 2007; Caria et al., 2007; Sitaram et al., 2011; Shibata, Watanabe, Sasaki, & Kawato, 2011; Andersson, Ramsey, Raemaekers, Viergever, & Pluim, 2012; Ekanayake et al., 2018, for reviews, see Watanabe, Sasaki, Shibata, & Kawato, 2017; Thibault, MacPherson, Lifshitz, Roth, & Raz, 2018), neurofeedback in fMRI has not reached the same potential in clinical settings as EEG-based systems, potentially due to its extensive and expensive acquisition procedures. Nonetheless, multiple studies investigate the efficacy of real-time fMRI for translational use. For example, a rapidly growing field is the use of neurofeedback for treatment of psychiatric disease. Three recent reviews (Linhartová et al., 2019; Barreiros, Almeida, Baía, & Castelo-Branco, 2019; Chiba et al., 2019) find evidence for regulation of brain regions related to emotional control using real-time fMRI—for instance, to alleviate symptoms of depression or posttraumatic stress disorder. These recent advances in neurofeedback underline its potential for cognitive training in both healthy individuals and patients, especially if implemented in more accessible and adaptable modalities such as EEG.
Thus, developing easy-to-use software tools for EEG-based systems can accelerate investigations and implementation of neurofeedback paradigms for both cognitive training and investigation of brain-behavior relationships.
In this study (see section 5), we implement a closed-loop neurofeedback attention training paradigm and present the code framework for real-time EEG neuroimaging, cognitive state classification, and feedback, which is fully available to the community.
The currently available software for real-time EEG neurofeedback (for research) includes BCI2000 (Schalk et al., 2004; closed-source C++ software with possible integration with external programs), OpenViBE (Renard et al., 2010; open-source C++ software with possible integration with external programs), Brainstorm (Tadel, Baillet, Mosher, Pantazis, & Leahy, 2011; open-source Java/Matlab software), BCILAB (Delorme et al., 2011; open-source Matlab toolbox extension for the EEGLAB software), FieldTrip (Oostenveld, Fries, Maris, & Schoffelen, 2011; open-source Matlab software), and NFBLab (Smetanin, Volkova, Zabodaev, Lebedev, & Ossadtchi, 2018; open-source XML/Python-based software).
In addition to these research-oriented software frameworks, several commercial clinical software platforms exist, such as BrainMaster, Myndlift, BioGraph Infiniti, Cygnet, NeuroPype, and several others. However, these solutions are closed source and offer limited flexibility for adapting paradigms.
Many of these software packages are written in noninterpreted programming languages that require more advanced programming skills than interpreted languages such as Matlab and Python. Among interpreted languages, Python is rising in popularity as an alternative to Matlab due to its quickly evolving open-source libraries (Nagpal & Gabrani, 2019). NFBLab provides integration with Python, but to the best of our knowledge, the availability of Python-based EEG neurofeedback software packages remains limited.
Therefore, a separate aim of this study is to implement an attention training paradigm using solely Python-based code and easy compatibility with consumer-grade EEG equipment with dry electrodes. Moreover, we prioritized an automated, real-time implementation, reducing the need for manual control or EEG recordings prior to the neurofeedback session.
The following section first provides a brief overview of major components of neurofeedback systems for cognitive state classification in general. Next, we explain the experimental protocol of the attention training paradigm, followed by the implementation of the closed-loop neurofeedback framework.
2.1 Neurofeedback System Components
In this section, we provide a brief overview of the core components of neurofeedback systems. Principally, a neurofeedback system consists of the following five components (Enriquez-Geppert, Huster, & Herrmann, 2017):
Acquisition of brain signal data. Measurements of brain activity can be obtained through several modalities that fall into two major categories: invasive recording technologies such as electrocorticography (ECoG) and noninvasive technologies such as fMRI, EEG, magnetoencephalography (MEG), and functional near-infrared spectroscopy (fNIRS) (for reviews, see Ordikhani-Seyedlar et al., 2016; Zafar & Ullah, 2019). While noninvasive imaging modalities lack temporal or spatial resolution and exhibit low signal-to-noise ratio (SNR), the main advantage of this imaging (besides the noninvasiveness per se) is acquisition of data across the entire brain. Hence it is possible to image entire functional networks of regions involved in specific functions using approaches such as multivariate pattern analysis (MVPA) (Norman, Polyn, Detre, & Haxby, 2006) (e.g., for visual and semantic processing, see Kay, Naselaris, Prenger, & Gallant, 2008; Naselaris, Stansbury, & Gallant, 2012; Huth et al., 2016). MVPA methods have been used with success in fMRI neurofeedback studies (DeBettencourt, Cohen, Lee, Norman, & Turk-Browne, 2015; DeBettencourt, Turk-Browne, & Norman, 2019; Cortese, Amano, Koizumi, Kawato, & Lau, 2016; Oblak, Sulzer, & Lewis-Peacock, 2019).
Real-time data preprocessing. Depending on the brain imaging modality, different data preprocessing steps are necessary. However, the major signal processing challenge across modalities is the detection and rejection of artifacts of technical and/or physiological origin in order to acquire measurements of brain, and not artifactual, signals. Most common artifacts in EEG recordings arise from other electrical equipment, changing electrode impedance, and eye and muscle movements (for reviews, see Rowan & Tolunsky, 2003; Enriquez-Geppert et al., 2017). Approaches for noise rejection include filtering, linear regression, and source decomposition methods (for review, see Islam, Rastegarnia, & Yang, 2016; Minguillon, Lopez-Gordo, & Pelayo, 2017). Importantly, the acquisition and preprocessing of data have to occur online and with limited latency.
Feature extraction and classification. The selection, extraction, and classification of brain signals of interest form the core of a neurofeedback system. The characteristics of interest (features) extracted from the brain recordings have to represent the brain patterns that one wants to modulate and provide feedback based on. In EEG, these features are often spectral bands of EEG signals (Egner & Gruzelier, 2004; Grosse-Wentrup & Schölkopf, 2014; Liu, Hou, & Sourina, 2015; Zhigalov, Kaplan, & Palva, 2016; Wei et al., 2017) or event-related potential (ERP) components of the signal (Farwell & Donchin, 1988; Treder & Blankertz, 2010; Zhang, Zhao, Jin, Wang, & Cichocki, 2012). Over the past 20 years, the number of journal articles about machine learning in neuroscience has grown continuously (Glaser, Benjamin, Farhoodi, & Kording, 2019). For neurofeedback systems, machine learning methods allow for extraction, decoding, and thus training of increasingly complex features from noisy brain recordings (for reviews, see Lemm, Blankertz, Dickhaus, & Müller, 2011; Enriquez-Geppert et al., 2017; Glaser et al., 2019).
Feedback signal generation. The extracted and decoded features of the brain signal have to be converted into a stimulus that can be continuously presented to the user of the system. This dynamic generation of a feedback signal based on an individual's brain state is referred to as a closed-loop approach (for reviews, see Zrenner et al., 2016; Sitaram et al., 2017). The feedback component can consist of sensory signals (e.g., visual, auditory or haptic) or brain stimulation such as transcranial current brain stimulation (for reviews, see Ruffini et al., 2012; Summers, Kang, & Cauraugh, 2016; Otal et al., 2016).
Adaptive user/learner. To close the loop, the feedback is transmitted back to the user or, in case of brain stimulation, to a stimulator device. In this manner, output from the brain influences the feedback, and the input to the brain arises from the interaction with the feedback signal within the system, thus creating a “behavior in the loop” paradigm (Zrenner et al., 2016).
2.2 Experimental Attention Training Protocol
Images consisted of grayscale photographs of female and male faces and indoor and outdoor scenes that were combined into composite stimuli by interpolating two images using a variable interpolation factor (α). All image trials were generated from an interpolation of two randomly drawn images, choosing from 1000 images for each subcategory, for a total of 4000 unique images. The face images were from the FERET database (NIST, 2016), and scene images were from the SUN database (Xiao, Hays, Ehinger, Oliva, & Torralba, 2010). All images were resized and cropped to 175 × 175 pixels. To reduce the influence of low-level visual features, we ensured that all images had a similar brightness level. The pixel values of the images were linearly transformed such that the following two groups of pixel values each accounted for 10% of the pixel values: 0 to 46 and 210 to 255. For the face images, the quantiles were estimated based on the center of the images, such that the dead space around the face was not included. A white fixation cross was superimposed on the images and presented during breaks, except when text instructions were displayed. The images were presented against a mid-gray background. Participants were seated 57 cm from the computer monitor, at which distance the images spanned approximately 5 degrees of visual angle.
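As a minimal sketch of the stimulus generation described above (assuming NumPy arrays for the grayscale images; `make_composite` is a hypothetical helper, not a function from the released framework):

```python
import numpy as np

def make_composite(face_img, scene_img, alpha):
    """Blend two grayscale images into a composite stimulus: alpha
    weights the face image and (1 - alpha) the scene image."""
    assert face_img.shape == scene_img.shape
    return alpha * face_img + (1.0 - alpha) * scene_img

# A 50/50 acquisition-style trial from two stand-in 175 x 175 images
rng = np.random.default_rng(0)
face = rng.uniform(0, 255, size=(175, 175))
scene = rng.uniform(0, 255, size=(175, 175))
composite = make_composite(face, scene, alpha=0.5)
```

During feedback blocks, α would be set per trial from the decoded attentional level rather than fixed at 0.5.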
Twenty-two healthy adults (10 female, 22 right-handed, mean age 22.9, age range 20–26) with normal or corrected-to-normal vision participated in the study for monetary compensation. This included an EEG neurofeedback group and a control group receiving sham neurofeedback. Each participant in the control group was matched randomly to a participant in the neurofeedback group based on gender. The matched pairs were presented with identical stimuli on all three days, and control participants thus received yoked feedback based on the decoded neural response profile of their matched neurofeedback participant. The yoking ensured that control participants were exposed to the same variations in image mixture and task difficulty. No participants were excluded from the study. The study was performed with automated randomization and group assignment to achieve a double-blinded experimental design. The first few participants were necessarily assigned to the neurofeedback group to provide matched, yoked feedback to the control group, and similarly, the last few participants were assigned to the control group. Thus, the experimenter was blind to group assignment in 16 of 22 participants (72.7%). All participants received the same scripted instructions.
All participants provided written informed consent to a protocol approved by the Institutional Ethical Review Board, University of Copenhagen, Department of Psychology (approval number: IP-IRB / 26112018) following the 1975 Helsinki Declaration as revised in 2008.
An attention training paradigm was implemented, as designed in DeBettencourt et al. (2015). Participants were asked to pay attention to a subcategory of faces or scenes in a sequence of composite images (e.g., a mixture of 50% of an image of a female face and 50% of an image of an indoor scene). The neurofeedback element consisted of updating the image proportion (interpolation factor, α) of the composite images (e.g., a mixture of 80% of an image of a female face and 20% of an image of an indoor scene) based on the participant's decoded attentional level to the primed image category.
Participants underwent six task runs, where each task run contained eight blocks with 50 trials in each block (total of 2400 trials). Each block began with a text cue for 6 s that instructed participants which subcategory was the target to which they should focus their attention and respond to by a keypress. The text cues were indoor, outdoor, female, and male. For each block, the text cue was followed by 2 s of fixation (white cross centered on a mid-gray background) before the sequence of 50 image trials started. The image trials consisted of composite face and scene images that were displayed for 1 s each with no interstimulus interval.
The first task run only contained blocks with acquisition trials, that is, an equal mixture of faces and scenes for each image (see Figure 1b). Starting with the second run, the first four blocks were acquisition, and the following four blocks were feedback blocks with a variable mixture proportion depending on the attentional level decoded from EEG recordings of each participant (in the neurofeedback group).
Participants were informed by text on the screen each time a new type of block was initialized. For each feedback block, the first three image trials had equal mixture proportions to allow time for classifier decoding of the presented trials. The mixture proportions for the remaining 47 trials within each block were based on the real-time EEG classifier output of decoded attentional level. If participants attended well (high levels of task-relevant information decoded in their signals), the task-relevant image became easier to see, and vice versa. Thus, the feedback worked as an amplifier of participants' attentional state, with the goal of making participants aware of attention fluctuations and hence improving sustained attention abilities.
2.2.4 Experimental Procedure
All participants completed two behavioral sessions and one neurofeedback session on three consecutive days. The first day was a behavioral pretraining session with two runs of the sustained attention task without recording of EEG. The second day was an EEG session consisting of a single run with stable, acquisition stimuli as in the pretraining session and five runs with real-time neurofeedback. The third day was a behavioral posttraining session with two runs of the attention task without recording of EEG, similar to the first day.
Participants were asked to pay attention to the primed target category and respond to a response inhibition task (Logan, 1982; Rieger & Gauggel, 1999). Ninety percent of the images (45 images in each block) contained the target category and required a keypress response, while 10% (5 images in each block) contained the nontarget category to which responses had to be withheld (lure trials).
For each task run, four of eight blocks involved attending to faces, and the remaining four blocks involved attending to scenes. To avoid an unbalanced category distribution, the target categories were randomly assigned to each block with the constraint that two blocks of each target category had to be present within the first and last four blocks.
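The constrained randomization above can be sketched as follows (a hedged illustration; `assign_block_targets` is a hypothetical helper, and the half-by-half shuffle is one simple way to satisfy the stated constraint):

```python
import random

def assign_block_targets(rng=random):
    """Randomly order target categories for one 8-block run, with the
    constraint that each half (blocks 1-4 and 5-8) contains exactly
    two face blocks and two scene blocks."""
    first_half = ["face", "face", "scene", "scene"]
    second_half = ["face", "face", "scene", "scene"]
    rng.shuffle(first_half)
    rng.shuffle(second_half)
    return first_half + second_half

targets = assign_block_targets()
```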
The target subcategories presented for each participant throughout the three-day experiment were held constant, so if the target categories were indoor and female for a particular participant during the pretraining session on day 1, the participant would be primed to the same categories on the following days. The category assignment for each participant was random but counterbalanced across all participants. Since the task runs contained randomly generated target categories and composite image trials, each run was unique to avoid a habituation and recognition effect.
For behavioral sessions during the pretraining (day 1) and posttraining (day 3) sessions, all participants completed two task runs of eight blocks (total of 800 trials). For these behavioral sessions, all composite image trials had an equal mixture proportion of face (50%) and scene (50%).
During both the behavioral and EEG sessions, participants were given 30 s breaks after four blocks. The total experiment time was 17 min for the behavioral session and 52.6 min for the EEG session, including breaks, text cues, and fixation time.
For each of the three sessions, participants were instructed to sit relaxed in a chair with their right hand resting on the table with a finger on a keyboard for providing behavioral responses. Participants were instructed to keep their eyes focused on the fixation cross on the screen. Prior to the EEG session, participants were asked to avoid excessive movements during stimuli presentation. They were informed that the feedback trials would be updated based on their attention toward the target category and not based on their keypress response. Specifically, the task-relevant image would become easier to see if they were paying attention and harder to see if their attentional level was decreasing. For the pretraining and EEG sessions, participants were shown short examples of, respectively, the behavioral experimental paradigm and neurofeedback paradigm. All instructions were scripted and identical across participants.
2.3 Real-Time Neurofeedback Implementation
2.3.1 EEG Acquisition Equipment
A consumer-grade, portable EEG equipment, Enobio (Neuroelectrics14) with 32 dry-electrode channels, was used for data acquisition. The EEG was electrically referenced using a CMS/DRL ear clip. The system recorded EEG data with a sampling rate of 500 Hz with EEG electrodes positioned according to the 10”20 system. The data were transmitted to a Lenovo Legion Y520 laptop via a USB cable. Visual stimuli were presented on a dual monitor. Lab Streaming Layer15 (LSL) was used for streaming and synchronizing EEG data with feedback signal generation. LSL supports short latency data exchange and buffering for most EEG/MEG devices, for example, OpenBCI,16 Cognionics,17 and mBrainTrain18 (for a full list of supported hardware, see https://labstreaminglayer.readthedocs.io/info/supported_devices.html). The LSL interface makes the code framework compatible with different hardware devices without additional configuration.
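As a sketch of how streamed samples map onto the 900 ms epochs used later, assuming a channels-by-samples buffer at 500 Hz (`extract_epoch` is a hypothetical helper; the commented lines show the standard pylsl acquisition calls):

```python
import numpy as np

FS = 500                       # Enobio sampling rate (Hz)
EPOCH_SAMPLES = int(0.9 * FS)  # 900 ms epoch: 100 ms pre + 800 ms post stimulus

def extract_epoch(buffer, onset_idx):
    """Slice a 900 ms epoch (100 ms before to 800 ms after stimulus
    onset) from a channels-x-samples buffer of streamed EEG."""
    start = onset_idx - int(0.1 * FS)
    return buffer[:, start:start + EPOCH_SAMPLES]

# With real hardware, samples would arrive through LSL, roughly:
#   from pylsl import StreamInlet, resolve_byprop
#   inlet = StreamInlet(resolve_byprop("type", "EEG", timeout=5.0)[0])
#   chunk, timestamps = inlet.pull_chunk(timeout=1.0)
buf = np.zeros((32, 1000))     # stand-in for a filled sample buffer
epoch = extract_epoch(buf, onset_idx=500)
```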
2.3.2 Real-Time EEG Preprocessing
EEG epochs of 900 ms (100 ms prestimulus onset, 800 ms poststimulus onset) were extracted for each image trial and preprocessed epoch-wise in real time. The EEG signal was linearly detrended and low-pass-filtered with a cutoff frequency of 40 Hz using a finite impulse response (FIR) filter with zero-double phase (i.e., applied forward and backward for zero phase shift). The EEG signal was downsampled from 500 Hz to 100 Hz and referenced to the average of the 23 preselected channels (see below). Baseline correction was performed based on 100 ms of the prestimulus signal. EEG epochs were z-scored such that each trial had a mean of 0 and a standard deviation of 1.
Feedback estimates were based on preselected spatial areas (EEG channels). Based on the recommendations of Wang, Xiong, Hu, and Yao (2012), nine frontal and temporal channels (Fp1, Fp2, Fz, AF3, AF4, T7, T8, F7, F8) were rejected in all real-time analyses. Beyond these nine channels, no channels or epochs were rejected in real time or during preprocessing. The signal was artifact-corrected using signal-space projection (SSP) (see section 2.3.3).
The preprocessing schemes were largely based on implementations provided by MNE software (Gramfort et al., 2014).
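Under the parameters stated above, the per-epoch pipeline can be sketched with SciPy (MNE provides equivalent routines; the filter length of 101 taps is an assumption, not taken from the paper):

```python
import numpy as np
from scipy import signal

def preprocess_epoch(epoch, fs=500, fs_new=100, lp_cutoff=40.0, baseline_ms=100):
    """Sketch of the per-epoch real-time pipeline: linear detrend,
    zero-phase (forward-backward) 40 Hz low-pass FIR, downsampling to
    100 Hz, average reference, baseline correction, and z-scoring.
    `epoch` is channels x samples (900 ms at 500 Hz)."""
    x = signal.detrend(epoch, axis=1)                    # linear detrend
    fir = signal.firwin(numtaps=101, cutoff=lp_cutoff, fs=fs)
    x = signal.filtfilt(fir, [1.0], x, axis=1)           # zero-phase low-pass
    x = x[:, :: fs // fs_new]                            # 500 Hz -> 100 Hz
    x = x - x.mean(axis=0, keepdims=True)                # average reference
    n_base = int(baseline_ms / 1000 * fs_new)            # 100 ms prestimulus
    x = x - x[:, :n_base].mean(axis=1, keepdims=True)    # baseline correction
    return (x - x.mean()) / x.std()                      # z-score the trial

epoch = np.random.default_rng(1).standard_normal((23, 450))
out = preprocess_epoch(epoch)
```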
2.3.3 Real-Time Artifact Rejection: Signal-Space Projection
We employed a data decomposition approach using signal-space projection (SSP) for detecting and rejecting artifacts in real time. SSP relies on the fact that the electric field distributions generated by the sources in the brain have spatial distributions sufficiently different from those generated by external noise sources. SSP does not require additional reference sensors to record the disturbance fields (Uusitalo & Ilmoniemi, 1997; Ramírez, Kopell, Butson, Hiner, & Baillet, 2011), which makes it suitable for real-time implementation.
For the mathematical modeling of the acquired signals, we use two representations. In the matrix representation, the multichannel signal is kept in a channel-by-time matrix X; for a multiepoch signal, the epochs are concatenated along the time axis. When convenient, we also use a vector representation x, typically for a single epoch, in which the time series of the channels are concatenated into a single vector.
In essence, SSP decomposes the noisy data into components by a singular value decomposition (SVD) of the signal. It is important to note that these components might not be statistically independent, and therefore there is a risk that brain signals of interest and artifacts may be reduced in the denoising process (Uusitalo & Ilmoniemi, 1997).
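The SVD-based projection can be sketched as follows (a minimal illustration: the number of removed components and the synthetic single-pattern artifact are assumptions for demonstration, not settings from the paper):

```python
import numpy as np

def compute_ssp_projector(noise_data, n_components=1):
    """SSP sketch: estimate dominant spatial patterns of `noise_data`
    (channels x samples) via SVD and build a projector that removes
    the subspace they span from any epoch with the same channels."""
    U, _, _ = np.linalg.svd(noise_data, full_matrices=False)
    Un = U[:, :n_components]                 # dominant spatial components
    n_ch = noise_data.shape[0]
    return np.eye(n_ch) - Un @ Un.T          # orthogonal projector

rng = np.random.default_rng(2)
pattern = rng.standard_normal((23, 1))            # artifact topography
artifact = pattern @ rng.standard_normal((1, 400))
clean = rng.standard_normal((23, 400)) * 0.1      # weak brain signal
P = compute_ssp_projector(artifact + clean, n_components=1)
corrected = P @ (artifact + clean)                # artifact suppressed
```

Applying `P` to new epochs removes most of the artifact's spatial pattern, but, as noted above, any brain activity aligned with that pattern is attenuated as well.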
2.3.4 Real-Time EEG Classification
The classification task consisted of categorizing the top-down attentional states toward faces and scenes, a binary classification task.
The implemented model was a logistic regression classifier with L1-norm regularization from the scikit-learn toolbox (Pedregosa et al., 2011), using the saga solver (Defazio, Bach, & Lacoste-Julien, 2014) with a regularization parameter of C = 1 (the default setting). Due to the accessibility of the scikit-learn toolbox, the code framework provides easy adaptability to various classification models to suit the experimenter's needs.
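The classifier configuration stated above maps directly onto scikit-learn (the toy training data below stand in for flattened, averaged EEG epochs and are not from the study):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# L1-regularized logistic regression with the saga solver; C=1.0 is
# scikit-learn's default, matching the paper's stated setting.
clf = LogisticRegression(penalty="l1", solver="saga", C=1.0, max_iter=1000)

# Toy stand-in for training data: one feature vector per averaged epoch
# (channels x time points flattened), labeled face (1) or scene (0).
rng = np.random.default_rng(3)
X = rng.standard_normal((200, 50))
y = rng.integers(0, 2, size=200)
clf.fit(X, y)
probs = clf.predict_proba(X[:5])   # class probabilities for new epochs
```

Swapping in another scikit-learn estimator only requires changing the `clf` definition, which is what makes the framework easy to adapt.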
As described in section 2.2.3, participants completed six task runs consisting of eight blocks each (see Figure 1b). For each task run besides the very first one (which only contained acquisition blocks), a classifier was trained in the 30 s break between the acquisition and feedback blocks. For the first feedback run (i.e., the second task run), the classifier was trained on the first 600 acquisition EEG trials (12 blocks). For the following task runs, the classifier was trained on the 600 most recent acquisition EEG trials and the 200 most recent feedback trials (illustrated as a dashed line in Figure 1a), 16 blocks in total. In this manner, the training data for the classifier were continuously updated throughout the experimental session to include the most recent 16 blocks (both acquisition and feedback blocks), corresponding to 800 trials.
For artifact rejection, SSP projectors were computed based on the 12 acquisition blocks in the training data (see section 2.3.3). Following artifact rejection, two consecutive epochs were averaged over a moving window across all training epochs. Finally, a classifier was trained on the artifact-corrected and averaged data.
The trained classifier was tested in real time on EEG data epochs recorded during the feedback blocks. A test epoch was preprocessed and corrected using the SSP projectors computed based on the training data as already described.
Hence, the classifier was tested on a weighted moving average of the most recent trials with 50% weight on the current trial and exponentially decreasing weight on earlier trials.
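A hedged reconstruction of this weighting (the exact form of equation 2.4 is not reproduced in this excerpt; halving weights renormalized to sum to 1 is one scheme consistent with "50% weight on the current trial and exponentially decreasing weight on earlier trials"):

```python
import numpy as np

def weighted_moving_average(outputs):
    """Average classifier outputs (given oldest-to-newest) with weight
    halving for each step back in time, renormalized to sum to 1, so
    the current trial carries approximately 50% of the total weight."""
    outputs = np.asarray(outputs, dtype=float)
    w = 0.5 ** np.arange(len(outputs))       # newest first: 1, 1/2, 1/4, ...
    w /= w.sum()
    return float(np.dot(w, outputs[::-1]))   # reverse so newest aligns with w[0]
```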
The classifier output ranged from −1 to 1, with values above 0 indicating task-relevant category information decoded in the participant's brain and values below 0 indicating task-irrelevant decoded information. The classifier output was fed into a transfer function and used to generate a feedback signal to the participant, as described in section 2.3.5.
Based on pilot data, it was observed that the trained classifier was frequently biased toward one of the categories (often the face category). This could possibly be explained by selective processing of face-like objects in the human brain compared to nonface objects (Kanwisher, Mcdermott, & Chun, 1997) and the fact that images of faces have more uniformity in their image statistics than scenes do. Thus, the prediction probabilities of the feedback epochs were adjusted using an offset calculated based on the training data. The offset for the first feedback run was computed on the training data in a three-fold cross-validation procedure. For the remaining feedback runs, the offset was based on the bias of the four most recent feedback blocks (200 trials). The bias offset was limited to an absolute value of 0.125.
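One hypothetical sketch of such a bias correction (the exact computation is not given in this excerpt; here the offset is the mean deviation of recent face-class probabilities from 0.5, clipped at the stated limit of 0.125 and subtracted from new predictions):

```python
import numpy as np

CLIP = 0.125  # maximum absolute bias offset stated in the text

def bias_offset(pred_probs, clip=CLIP):
    """Estimate classifier bias as the mean deviation of recent
    face-class prediction probabilities from 0.5, clipped to +/- 0.125."""
    offset = np.mean(np.asarray(pred_probs, dtype=float) - 0.5)
    return float(np.clip(offset, -clip, clip))

# e.g., a classifier that systematically favors the face category
offset = bias_offset([0.9, 0.8, 0.85, 0.7])
corrected = 0.75 - offset        # adjusted probability for a new trial
```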
2.3.5 Real-Time Feedback Generation
The transfer function was designed to map the classifier output to a sensitive range of values for updating stimuli while avoiding saturation, and it can be constructed based on the desired sensitivity range for various types of feedback. A classifier output of 1, when the classifier assigned maximum prediction probability to the task-relevant category, would yield a task-relevant image that was 98% visible. Conversely, a classifier output of −1, when the classifier assigned maximum prediction probability to the task-irrelevant category, would yield a task-relevant image that was 17% visible. Thus, the output of the function ranged from 0.17 to 0.98. The minimum value of 0.17 ensured that the task-relevant category was always visible to some degree in the composite image, giving the participant a chance to recover from an attention lapse.
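One possible transfer function meeting the stated endpoints is a rescaled logistic sigmoid (a sketch only; the functional form and the steepness parameter k are assumptions, not taken from the paper):

```python
import numpy as np

Y_MIN, Y_MAX = 0.17, 0.98   # visibility bounds stated in the text

def transfer(c, k=3.0, y_min=Y_MIN, y_max=Y_MAX):
    """Map classifier output c in [-1, 1] to task-relevant image
    visibility, rescaled so that transfer(-1) = 0.17 and
    transfer(1) = 0.98 exactly."""
    s = lambda z: 1.0 / (1.0 + np.exp(-k * z))
    return y_min + (y_max - y_min) * (s(c) - s(-1.0)) / (s(1.0) - s(-1.0))
```

A sigmoid keeps the mapping sensitive near c = 0, where attentional fluctuations are decoded, while compressing extreme classifier outputs.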
Since sustained attention has been shown to fluctuate more slowly than on a trial-to-trial basis (Esterman et al., 2012; Rosenberg, Noonan, Degutis, & Esterman, 2013; Rosenberg, Finn, Constable, & Chun, 2015), and the classifier output contains misclassifications that can be described as high-frequency noise, the three most recent α-values from the transfer function were averaged when updating image trials.
2.4 Offline Data Analysis
2.4.1 Offline EEG Classification
The ability to decode attentional states within participants was assessed by classification of acquisition blocks (28 blocks) in a leave-one-run-out (LORO) cross-validation. Feedback blocks were not included in the cross-validation, since participants received different types of feedback. Similar to the real-time pipeline, epochs in the training blocks were preprocessed epoch-wise, and SSP projectors were computed and applied to the full training set. Following SSP correction, two consecutive trials were averaged over a moving window across the entire training set. For classifier testing, the SSP projectors computed based on the training set were applied to the test set, followed by a weighted moving averaging (see equation 2.4) across test epochs. To provide an estimate of classifier bias, the offset was computed based on a three-fold cross-validation of the training data and applied to the prediction probabilities of test epochs.
For the LORO approach, all acquisition blocks within a single run were in turn used as a test set once (except for the very first run, which consisted only of acquisition blocks; this run was always included in the training set to avoid variability in the size of the test set).
The classification performance was evaluated as decoding error rate (i.e., how often the classifier predicted the wrong category).
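The LORO evaluation can be sketched as follows. For brevity, this sketch omits the SSP correction, the weighted moving averaging, and the bias offset described above; the function name and classifier configuration are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def loro_error_rate(X, y, runs, first_run):
    """Leave-one-run-out decoding error rate.

    X: (n_epochs, n_features), y: category labels, runs: run index per epoch.
    The first run is always kept in the training set, as in the analysis above.
    """
    errors = []
    for run in np.unique(runs):
        if run == first_run:
            continue  # never held out; always part of the training set
        train, test = runs != run, runs == run
        clf = LogisticRegression(penalty="l1", solver="liblinear")
        clf.fit(X[train], y[train])
        # Fraction of test epochs assigned to the wrong category.
        errors.append(np.mean(clf.predict(X[test]) != y[test]))
    return float(np.mean(errors))
```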
2.4.2 Sensitivity Mapping of Spatial and Temporal Features Important to Classification
2.4.3 Behavioral Performance Metrics
True positives (TP) were responses during non-lure trials, false negatives (FN) were rejections during non-lure trials, true negatives (TN) were rejections during lure trials, and false positives (FP) were responses during lure trials.
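From these counts, the hit rate is TP/(TP + FN) and the false alarm rate is FP/(FP + TN), from which the sensitivity measure A' (reported in section 3.4) follows. The sketch below uses the standard nonparametric A' formula (Pollack & Norman, 1964); whether this exact variant matches the released implementation is our assumption.

```python
def a_prime(hit_rate, fa_rate):
    """Nonparametric sensitivity A' (Pollack & Norman, 1964):
    0.5 is chance-level discrimination, 1.0 is perfect discrimination."""
    h, f = hit_rate, fa_rate
    if h == f:
        return 0.5  # no discrimination beyond chance
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance performance mirrors the formula around 0.5.
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))
```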
3 Results and Discussion
We implemented an attention training paradigm and recorded scalp EEG using a consumer-grade, 32-dry-electrode EEG system from 22 participants in a double-blinded sham design. In this section, we evaluate the outcome of the attention training experiment and the performance of the neurofeedback framework presented in section 2.3. First, we report the unbiased decoding error rates of the system obtained from offline cross-validation analyses. Next, we report the decoding error rates obtained in real time. Moreover, we visualize which parts of the EEG signature were exploited by the classifiers trained in real time to predict attentional states. Finally, we report and discuss the training effects on participants' sustained attention abilities during and after the neurofeedback session.
3.1 Classifier Decoding Error Rate
We achieved a mean classifier error rate of 0.343 using consumer-grade, dry-electrode EEG equipment. In comparison, for the original paradigm implemented in fMRI, DeBettencourt and colleagues (2015) reported a mean classifier error rate of 0.22 using MVPA methods. These classifier error rates represent the ability to decode top-down attention toward faces and scenes. Participants were exposed to the exact same type of composite images for both categories (equal image mixture proportion), and thus EEG signals associated with subjective attentional states, and not solely visual responses, were decoded.
3.2 EEG Decoding Error Rates Obtained in Real Time
We evaluate the system's ability to decode fluctuating attentional states in real time. Note that the decoding error rates obtained in real time were influenced by participants' receiving either true or sham neurofeedback and hence experiencing different levels of task difficulty.
3.3 Sensitivity Map of Real-Time Classification
We investigated the temporal and spatial EEG signatures exploited by the real-time classification models. Throughout the neurofeedback session, five logistic classifiers were trained on the most recently recorded EEG signals in order to appropriately decode visual top-down attentional states and provide relevant feedback. A sensitivity map was computed based on the averaged L1 logistic model weights across all real-time classifiers for all participants. For sensitivity map effect size evaluation, we implemented the NPAIRS resampling scheme (Strother et al., 2002). The splitting procedure was repeated 10,000 times to obtain the standard error of the sensitivity maps for computing effect sizes (see section 2.4.2 and Figure 5).
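The split-half resampling can be sketched as below. This is a schematic of the NPAIRS idea only: the actual partitioning units, weight handling, and the 10,000 splits follow Strother et al. (2002) and the released code, whereas the function name here is illustrative and the split count is reduced for speed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def split_half_effect_size(X, y, n_splits=100, seed=0):
    """NPAIRS-style split-half resampling: train one model per data half,
    use the spread of weight differences across splits as a noise estimate,
    and return a per-feature signal-to-noise effect size."""
    rng = np.random.default_rng(seed)
    n = len(y)
    diffs, means = [], []
    for _ in range(n_splits):
        idx = rng.permutation(n)
        a, b = idx[: n // 2], idx[n // 2 :]
        w = []
        for half in (a, b):
            clf = LogisticRegression(penalty="l1", solver="liblinear")
            clf.fit(X[half], y[half])
            w.append(clf.coef_.ravel())
        means.append((w[0] + w[1]) / 2)   # split-average sensitivity map
        diffs.append(w[0] - w[1])          # split disagreement
    noise = np.std(diffs, axis=0) / np.sqrt(2)  # standard error of a single map
    return np.mean(means, axis=0) / (noise + 1e-12)
```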
As expected in the decoding of visual attentional states, occipital EEG channels (O1, O2, Oz) during early visual processing (at time points around 110–160 ms) were evident on the sensitivity map (see Figure 5), consistent with the visual ERP literature (Gonzalez, Clark, Fan, Luck, & Hillyard, 1994; Hillyard & Anllo-Vento, 1998; for a review, see Luck, Woodman, & Vogel, 2000). Moreover, temporal components around 160 to 190 ms (P7, P8) were important in the classification task, potentially corresponding to the signature of face processing in the human brain (the N170 component; Bentin, Allison, Puce, Perez, & McCarthy, 1996).
Interestingly, the classifiers exploited later temporal components besides the early visual response, which might correspond to processing of subjective visual states as proposed in previous work (Treder & Blankertz, 2010; Kasper, Das, Eckstein, & Giesbrecht, 2010; Thiery, Lajnef, Jerbi, Arguin, & Aubin, 2016; List et al., 2017). Treder and Blankertz (2010) reported optimal decoding of overt attention based on early evoked potentials and optimal decoding of covert attention based on later components around parietal channels. Similarly, Kasper and colleagues (2010) showed that attentional failures could be distinguished from successes during an attention task on a single-trial basis using a 400 to 500 ms temporal window.
Thiery et al. (2016) predicted covert visuospatial attention from ERPs (albeit using manually predefined temporal windows and occipital regions) and demonstrated that both an early (0–100 ms) and a late (410–530 ms) time window were optimal for attention decoding. Moreover, List and colleagues (2017) showed that the time interval 500 to 625 ms across occipital and parietal brain regions contained discriminative information for distinguishing locally from globally focused attentional states.
In terms of spatial features, we observe activity across not only occipital visual areas but also distributed activity across frontocentral brain regions (see Figure 5), which is in agreement with several other M/EEG studies demonstrating patterns of neural activity moving from the primary visual cortex to centroparietal and frontal cortices after stimulus onset in visual attention tasks (Salti et al., 2015; King, Pescetelli, & Dehaene, 2016). In fMRI, activity over frontoparietal regions has been linked to cognitive control and goal-directed behavior (Woolgar, Hampshire, Thompson, & Duncan, 2011) and to brain regions supporting the attention training process (DeBettencourt et al., 2015).
Hence, the signatures of the sensitivity map (see Figure 5) provide evidence that we were able to successfully decode subjective attentional states in real time throughout the neurofeedback session.
3.4 Training Effects of Neurofeedback
The training effect of neurofeedback was investigated both during the neurofeedback session itself and the day following the neurofeedback session.
During the neurofeedback session, we investigated whether the classifier output (i.e., the amount of task-relevant information decoded in a participant's brain) before a lure trial in the response inhibition task predicted whether participants correctly withheld their response or incorrectly responded to the lure trial. The classifier output was averaged over the three trials preceding the lure trial. A low classifier output indicated that the participant had a low task-relevant neural representation and hence was inattentive. For neurofeedback participants, this low degree of task-relevant information could indeed be used to predict when a participant would make an erroneous response (false alarm). Conversely, a larger degree of task-relevant information in a neurofeedback participant's brain was linked to a correct response (correct rejection of a lure trial). The difference in classifier output between false alarms and correct rejections was significant for the neurofeedback group and insignificant for the control group, with a significant interaction between groups (see Figure 6). This provides evidence that we were able to control participants' behavior using neurofeedback.
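The quantity compared between correct rejections and false alarms — the mean classifier output over the three trials preceding each lure — can be computed as in this illustrative snippet (the function name and array layout are our assumptions):

```python
import numpy as np

def pre_lure_attention(outputs, lure_indices, window=3):
    """Mean classifier output over the `window` trials preceding each lure
    trial; lures at the very start of a block have no preceding trials."""
    return np.array([
        np.mean(outputs[max(0, i - window): i]) if i > 0 else np.nan
        for i in lure_indices
    ])
```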
To investigate whether the single neurofeedback session provided lasting effects, we compared the behavioral performance of participants during the pretraining session (the day before the neurofeedback session) and the posttraining session (the day after the neurofeedback session). First, there was no baseline difference (pretraining session) between the neurofeedback and control groups in terms of A', RERR, or response time.
Second, the neurofeedback group significantly improved their relative error rate reduction (RERR) from pre- to posttraining, as opposed to controls (see Figure A3, panel a, in the appendix). Both groups improved in terms of behavioral sensitivity, A' (see Figure A3, panel b, in the appendix), but not significantly, and there was no significant difference between groups. Hence, there is limited evidence that neurofeedback per se provided lasting effects on sustained attention abilities. This could be due to the neurofeedback design itself (rewarding good attentional states by reducing task difficulty and punishing bad attentional states by increasing difficulty), the use of a single neurofeedback session, or the lack of reliable evaluation criteria (discussed in section 4.1).
4 Conclusion and Perspectives
In this study, we implemented an attention training paradigm and demonstrated real-time EEG neuroimaging, cognitive state classification, and feedback in a wearable setting. We share our Python-based code framework as open source for reproducibility. We reported a mean classifier decoding error rate of 34.3% (chance level error rate 50%). We showed that the EEG event-related brain responses differ based on attentional state, with evidence for late components in the temporal dynamics corresponding to brain processing of subjective attentional state. We demonstrated an ability to control participants' behavior during the neurofeedback session, but provide limited evidence that a single neurofeedback session provided lasting effects in sustained attention abilities.
4.1 Challenges and Prospects for Neurofeedback Systems
This section consolidates our contribution to the EEG neurofeedback field and highlights areas for optimization, structured as four issues that were identified by Ordikhani-Seyedlar et al. (2016) as the main future challenges for visual attention-based neurofeedback systems.
Filtering out noise. Physiological, mechanical, and electrical noise artifacts are a major issue in all EEG-based systems. Artifacts, of which eye artifacts are the most common, can be partly addressed by providing interstimulus intervals (ISIs) that allow time for blinking. Besides implementation of ISIs, artifacts are usually handled by rejection or correction of EEG epochs with, for example, suspiciously high signal amplitude values. However, both ISIs and rejection of epochs reduce the ecological validity of neurofeedback systems, as delay is imposed (for reviews, see Rowan & Tolunsky, 2003; Lotte, Congedo, Lécuyer, Lamarche, & Arnaldi, 2007; Huster, Mokom, Enriquez-Geppert, & Herrmann, 2014).
Another approach to minimizing the effect of artifacts is proper feature selection to represent the signal of interest. However, the challenge is to perform this feature selection in a fully automated and computationally inexpensive manner. The goal of neurofeedback training is often uptraining of specific spectral bands (Egner & Gruzelier, 2004; Grosse-Wentrup & Schölkopf, 2014; Liu et al., 2015; Zhigalov et al., 2016; Wei et al., 2017), diminishing the need for feature extraction beyond identification of the spectral component of interest. This approach is not optimal for neurofeedback paradigms that aim to decode more complex cognitive states, as it discards potentially relevant neural processing taking place in spectral bands other than the preselected band.
We do not limit our analyses to specific frequency bands but exploit the signal between 0 and 40 Hz. Therefore, we propose a decomposition-based artifact rejection method, SSP, in which noise components are computed throughout the experiment and automatically rejected for all EEG epochs. Specifically, through decomposition of the EEG signal, components explaining more than 10% of the variance were identified and discarded from the signal. The remaining components were projected back, thereby reconstructing an artifact-free signal. The components were reestimated continuously throughout the session to ensure an updated estimate of the signal composition. In this manner, the computed noise components were participant specific and were allowed to change throughout the time course of the neurofeedback session. The technique is, however, not perfect, and as previously mentioned, relevant brain signal might have been filtered out and some noise might still be present. Moreover, the choice of threshold for rejection of components is a challenge for real-time systems. Typically, in offline studies, manual intervention is used to reject components with visually detected artifacts (for reviews, see Islam et al., 2016; Jiang et al., 2017). Methods have been proposed to quantify the number of components to reject; however, some of these are computationally time-consuming (Viola et al., 2009; Radüntz, Scouten, Hochmuth, & Meffert, 2017), and even semiautomatic approaches require prior information about the constrained topographies to initialize algorithms (Akhtar, Mitsuhashi, & James, 2012) or previously defined template matching (Li, Ma, Lu, & Li, 2006).
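The variance-threshold rejection can be illustrated with a plain SVD on a single epoch. This numpy sketch only conveys the 10% rule; the actual pipeline computes SSP projectors across epochs (e.g., via MNE-Python) rather than decomposing each epoch independently.

```python
import numpy as np

def reject_high_variance_components(epoch, threshold=0.10):
    """Decompose a single EEG epoch (channels x samples) with an SVD, drop
    components explaining more than `threshold` of the variance (assumed to
    be artifacts), and reconstruct the signal from the remaining components.

    Illustrative stand-in for the SSP correction used in the pipeline.
    """
    U, s, Vt = np.linalg.svd(epoch, full_matrices=False)
    var = s**2 / np.sum(s**2)   # fraction of total variance per component
    keep = var <= threshold     # components treated as artifact-free
    return (U[:, keep] * s[keep]) @ Vt[keep]
```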
As our main goal was to implement a fully unsupervised denoising method that can operate on a trial-by-trial basis in real time, the choice of 10% variance was based on prior work identifying the largest variance components as artifacts (Dammers et al., 2008; Mantini, Franciotti, Romani, & Pizzella, 2008; Kong et al., 2013; Barthélemy et al., 2017) and visual inspection of the components during pilot testing. The 10% threshold might not have been the optimal choice across all participants. In the future, this challenge may be solved by automatic approaches that exploit deep neural networks (Croce et al., 2018).
Reliable criteria to quantify neurofeedback training effects. There is no consensus on how to define success in a neurofeedback training study. Cognitive improvement of, for example, attention is challenging to quantify using behavioral tests due to factors unrelated to training: interindividual abilities in learning capacity, general ability to concentrate, motivation, mood, and personality (for review, see Kadosh & Staunton, 2019). Moreover, certain individuals fail to regulate brain activity even after repeated training sessions; they are denoted as “BCI illiterate.” This has been suggested to apply to 20% of individuals (Allison, Neuper, Tan, & Nijholt, 2010) or even up to 37% depending on the success threshold (Hammer et al., 2012).
We conducted pre- and posttraining sessions for quantification of the neurofeedback training outcome. Since these sessions were conducted on different days, performance will naturally differ due to factors unrelated to attentional improvement, possibly outweighing the subtle effects of training. Optimally, these factors would be controlled for, or new, more robust evaluation metrics would be implemented.
Finally, besides a standardization of evaluation metrics, recent reviews (Sitaram et al., 2017; Van Doren et al., 2019; Barreiros et al., 2019; Linhartová et al., 2019) call for homogeneity of double-blinded experimental procedures and of the number of training sessions. While DeBettencourt, Cohen, Lee, Norman, and Turk-Browne (2015) reported a significant training effect after a single fMRI neurofeedback session using the implemented paradigm, the number of neurofeedback sessions generally ranges from 5 to 12 to produce lasting training effects (Egner & Gruzelier, 2004; Shibata et al., 2011; Wang & Hsieh, 2013; Stopczynski et al., 2014; Wei et al., 2017). We report no significant difference between neurofeedback participants and control participants after a single EEG neurofeedback training session.
Accounting for intra- and interindividual variability. The use of neurofeedback systems outside laboratories depends on robustness to changing responses to identical conditions within individuals and diverse mental representations across individuals. Effort is devoted to the understanding of intraindividual and intrastudy variability in BCI applications (for reviews, see Lotte & Jeunet, 2015; Kadosh & Staunton, 2019). In the current study, the real-time decoding error rates were obtained across a wide range of EEG responses (see Figure A1 in the appendix). Our system demonstrated high interindividual robustness and the possibility of adapting to a wide range of mental representations of visual attention across participants in an automated manner. However, the system was less robust to intraindividual variability: participants who displayed highly variable responses to the same stimulus conditions (evidenced by large confidence intervals in Figure A1 in the appendix) showed higher decoding error rates. The issue of high intraindividual variability was partly addressed by two aspects of the proposed framework: implementation of a weighted moving average of EEG epochs and inclusion of the most recent feedback blocks in the classifier training set (as opposed to solely using acquisition blocks). Since the EEG signals during feedback blocks were influenced by the nonstationarity of feedback and task difficulty, we speculate that the inclusion of these blocks increased system robustness by training the classifier on a more dynamic set of brain responses.
Individual use. Ordikhani-Seyedlar et al. (2016) highlight the restriction of BCI usage by dependence on experts for EEG setup, running of training protocols, and system maintenance. We highly prioritized engineering an automated, adaptable system for individual use in three ways. First, we used a consumer-grade, user-friendly EEG system with dry electrodes. The portability and simple system setup provide an opportunity for studying the brain in natural settings and for applications in real-life scenarios. Second, we engineered a system that needs no manual control before or during the neurofeedback session. The system can easily be set up by a user with no prior technical EEG experience, thus increasing the ecological validity and usage potential of our system. Third, our system does not require any EEG recordings prior to the neurofeedback session. The system uses the first approximately 10 minutes of recorded EEG data to train a participant-specific classifier, which is used and updated throughout the session, diminishing the need for prior EEG recordings, calibration, or customization.
Code and Data Availability
The code and sample data for the proposed neurofeedback framework are available at https://github.com/gretatuckute/ClosedLoop/. In accordance with our participant consent form, data are available for research purposes upon request.
A supplementary video of the neurofeedback system is available at https://www.youtube.com/watch?v=Ns8AHIg_Wtc&feature=youtu.be.
Conflicts of Interest
We declare that we have no conflicts of interest regarding the publication of this letter.
Acknowledgments
This work was supported by the Novo Nordisk Foundation Interdisciplinary Synergy Program 2014 (Biophysically Adjusted State-Informed Cortex Stimulation (NNF14OC0011413)). We thank Ivana Konvalinka, Per Bækgaard, and Tobias Andersen at DTU Cognitive Systems for feedback on the experimental and technical aspects of the neurofeedback system. Finally, we thank the anonymous reviewers for their insightful and constructive comments.