Driver mental fatigue leads to thousands of traffic accidents. The increasing quality and availability of low-cost electroencephalogram (EEG) systems open possibilities for practical fatigue monitoring. However, non-data-driven methods, designed for practical, complex situations, usually rely on handcrafted statistics of EEG signals. To reduce human involvement, we introduce a data-driven methodology for online mental fatigue detection: self-weighted ordinal regression (SWORE). Reaction time (RT), the length of time people take to react to an emergency, is widely considered an objective behavioral measure of mental fatigue. Since regression methods are sensitive to extreme RTs, we propose an indirect RT estimation based on preferences to explore the relationship between EEG and RT, which generalizes to any scenario in which an objective fatigue indicator is available. In particular, SWORE evaluates the noisy EEG signals from multiple channels in terms of two states: the shaking state and the steady state. Modeling the shaking state discriminates the reliable channels from the uninformative ones, while modeling the steady state suppresses task-nonrelevant fluctuation within each channel. In addition, an online generalized Bayesian moment matching (online GBMM) algorithm is proposed to efficiently calibrate SWORE online for each participant. Experimental results with 40 participants show that SWORE achieves high consistency with RT, demonstrating the feasibility and adaptability of the proposed framework for practical mental fatigue estimation.

## 1  Introduction

Mental fatigue, a common physiological phenomenon (Borghini, Astolfi, Vecchiato, Mattia, & Babiloni, 2014), induces suboptimal functioning and may even lead to accidents with severe consequences (Van Cutsem et al., 2017). The National Highway Traffic Safety Administration estimates that about 100,000 official reports of crashes are the direct result of driver mental fatigue each year, resulting in an estimated 1,550 deaths, 71,000 injuries, and \$12.5 billion in monetary losses. In response to these critical issues, several algorithms have been developed to detect mental fatigue using the electrocardiogram (ECG) (Fallahi, Motamedzade, Heidarimoghadam, Soltanian, & Miyake, 2016), functional near-infrared spectroscopy (fNIRS), the electrooculogram (EOG) (Laurent et al., 2013), and electroencephalography (EEG) (Lin, Tsai, & Ko, 2013; Jagannath & Balasubramanian, 2014; Sauvet et al., 2014; Wang, Zhang, Wu, Darvas, & Chaovalitwongse, 2015), among others. Of these signals, EEG is assumed to be the most accurate and valid for providing information related to drivers' mental fatigue, owing to its high temporal resolution and the availability of a vast variety of preprocessing methods (Graimann, Allison, & Pfurtscheller, 2009; Sahayadhas, Sundaraj, & Murugappan, 2012; Palanivel Rajan & Dinesh, 2015).

Previous methods for developing automatic systems to detect driver drowsiness from EEG signals can be broadly classified into two categories: non-data-driven or data-driven. Non-data-driven approaches, such as power spectrum-based analysis (Jap, Lal, Fischer, & Bekiaris, 2009; Wang et al., 2018), entropy-based analysis (Kar, Bhagat, & Routray, 2010), and brain network-based analysis (Li, Li, Wang, Zhang, & Wang, 2017), usually resort to handcrafted estimators, such as changes in power or statistically related features, to evaluate mental fatigue using EEG signals from multiple channels (Gurudath & Riley, 2014; Gharagozlou et al., 2015). However, these evaluation metrics require expert interpretation and complex calculation processes. In addition, EEG signals are known to be highly subject-specific and to vary considerably among individuals. Thus, non-data-driven approaches relying on predefined criteria are not robust enough to account for individual variability, limiting their use in practical implementations.

For data-driven mental fatigue evaluation, the reaction time (RT) to an assigned task is widely adopted as supervision to indicate the fatigue level. Some linear (Lin et al., 2010; Resalat & Saba, 2015) and nonlinear (Liu, Lin, Wu, Chuang, & Lin, 2016; Cui & Wu, 2017; Pan, Tsang, Singh, Lin, & Sugiyama, 2020) methods show that it is possible to detect mental fatigue with high accuracy. These results are impressive but remain blind to the wealth of brain dynamics and behavioral variability (Müller et al., 2008; Ratcliff, Philiastides, & Sajda, 2009; Yarkoni, Barch, Gray, Conturo, & Braver, 2009; Xu, Min, & Hu, 2018). Although some recent work (Wei, Lin, Wang, Lin, & Jung, 2018; Cui, Xu, & Wu, 2019) suggested addressing inter- and intrasubject variability through transfer learning, those techniques apply only to offline analysis with sufficient training samples.

Figure 1:

Poor prediction performance of SVR using 20, 40, and 60 training trials, respectively. EEG signals from multiple channels are simply concatenated into a long feature vector, and SVR is trained using this feature vector. For the sake of fair comparison, we collect the mean absolute error (MAE) of three models on the remaining trials starting from the 71st trial. We present the result of only the first participant for this example.

Previous offline analysis methods often yield poor fatigue detection performance due to limited training data in practical implementations (see Figure 1). For example, deep learning methods (Goodfellow, Bengio, Courville, & Bengio, 2016), which require massive training data, and Riemannian methods (Barachant, Bonnet, Congedo, & Jutten, 2012; Congedo, Barachant, & Bhatia, 2017), which incur high computation costs, fail to meet these harsh requirements in actual situations. In addition, mental fatigue, drops in mental alertness, and poor driving performance reflect brain dynamics among different brain areas. Recent work demonstrates the efficacy of discriminating functional interactions among different brain regions based on heuristic metrics (Wang et al., 2018; Richer, Zhao, Amores, Eskofier, & Paradiso, 2018) or complex analysis (Li et al., 2017). However, such analysis cannot fully reveal functional interactions among multiple channels in terms of mental fatigue, since it is performed independently of the mental fatigue evaluation that takes place later.

To address these concerns, we introduce a data-driven methodology, self-weighted ordinal regression (SWORE), for online driver mental fatigue detection that models functional interactions among brain regions. Instead of formulating SWORE as a regression task with RT as the direct supervision, we consider a more general problem setting: learning to rank. SWORE learns from brain dynamics preferences and aims to achieve consistency with RT indirectly, in the sense of ranking. The brain dynamics preferences can be constructed via some objective fatigue indicator, such as RT if available, or some power spectral features (Wang et al., 2018; Bose et al., 2019). Preference-based, indirect mental fatigue evaluation has been shown to alleviate the overfitting issue of directly predicting RT in a regression task (Pan et al., 2020). In particular, SWORE models the brain dynamic preferences in terms of two states: the shaking state and the steady state. It automatically discriminates the reliable channels from the noninformative ones by modeling the shaking state and suppresses the mental fatigue nonrelevant fluctuation within each channel by modeling the steady state. Moreover, an online generalized Bayesian moment matching (online GBMM) algorithm is proposed for the Bayesian posterior update. Once a new sample (the reaction time corresponding to the newly recorded EEG signals) is available, online GBMM can efficiently calibrate the SWORE model with simple updating rules. In summary, the main contributions of this letter are as follows:

• We propose an online mental fatigue monitoring system that can evaluate mental fatigue quickly with a high prediction performance.

• We propose the SWORE model to reliably aggregate brain dynamics-related preferences from multiple noisy channels in terms of two states: shaking state and steady state.

• We propose an online generalized Bayesian moment matching (online GBMM) algorithm for calibrating the SWORE model online with analytic update rules.

• We conduct comprehensive experiments on 40 participants to verify the reliability of our system in online mental fatigue monitoring scenarios. Further, we explore the parameter sensitivity and model uncertainty of SWORE with regard to the online GBMM algorithm.

This letter is organized as follows. Section 2 introduces the background of mental fatigue monitoring and motivates the practice of online mental fatigue monitoring. In section 3, we introduce SWORE, an indirect mental fatigue monitoring model, to model the heterogeneous brain dynamic preferences in terms of two states. Section 4 describes an analytic update strategy for calibrating the SWORE model online. Section 5 discusses the details of mental fatigue evaluation in the online scenario. Section 6 demonstrates the reliability of the proposed SWORE model with EEG signals collected from 40 participants. Section 7 concludes the letter and envisions future work.

## 2  Background and Problem Statement

In this section, we introduce mental fatigue monitoring and discuss previous approaches in the online scenario. We list several subgoals that are necessary for achieving a robust online mental fatigue evaluation model.

### 2.1  Mental Fatigue Monitoring

The reaction time (RT) to an emergency is generally accepted as the most intuitive and resourceful metric to evaluate mental fatigue. EEG signals (Lin et al., 2013; Wang et al., 2015) are adopted as the feature vectors, since they are well known to be accurate and valid in supplying information related to the driver's mental fatigue (Graimann et al., 2009; Sahayadhas et al., 2012; Palanivel Rajan & Dinesh, 2015), compared to, for example, ECG or fNIRS (Nguyen, Ahn, Jang, Jun, & Kim, 2017).

Therefore, a common practice for mental fatigue monitoring is to build a learning model that can predict humans' reaction time to an emergency using the EEG signals recorded beforehand (Lal, Craig, Boord, Kirkup, & Nguyen, 2003; Dornhege, del R. Millán, Hinterberger, McFarland, & Müller, 2007; Soon, Brass, Heinze, & Haynes, 2008; Jap et al., 2009).

### 2.2  Impaired Performance on Nonstationary Brain Dynamics in Online Applications

Some previous work derived from linear (Resalat & Saba, 2015; Lin et al., 2010) and nonlinear (Liu et al., 2016; Cui & Wu, 2017; Pan et al., 2020) methods shows that it is possible to detect mental fatigue with high accuracy. These results are impressive, but they are blind to the wealth of brain dynamics and behavioral variability (Müller et al., 2008; Ratcliff et al., 2009; Yarkoni et al., 2009; Wei et al., 2018; Cui et al., 2019). Brain dynamics are nonstationary; they are characterized by significant trial-by-trial and subject-by-subject variability (Ratcliff et al., 2009; Yarkoni et al., 2009). However, the above methods, designed for offline analysis with sufficient training samples, result in poor generalization performance in practice without efficient online calibration.

For better illustration, we trained three support vector regression (SVR)$^1$ models using 20, 40, and 60 sequential trials and visualized their prediction performance on the remaining trials, respectively. From Figure 1, we find the following: (1) Apart from a few local mispredictions, SVR can exactly predict the RT on the training trials and is insensitive to extreme values. This shows that SVR has sufficient fitting capability for mental fatigue monitoring and superior robustness to extreme values compared to deep regression models (Pan et al., 2020). (2) All SVR models show poor prediction performance$^2$ on the remaining trials. This is consistent with our conjecture that previous offline analysis suffers from severe generalization issues in online scenarios. (3) Increasing the number of training trials only marginally improves prediction accuracy. In particular, when the number of training trials increases from 20 to 60, the mean absolute error decreases by only 0.34 s.

### 2.3  Online Mental Fatigue Evaluation

In pursuit of online mental fatigue evaluation, computational efficiency in terms of time and memory has been a major concern. Deep learning methods (Goodfellow et al., 2016) achieve superior performance (Pan et al., 2020) but require massive training data. Riemannian methods (Barachant et al., 2012; Congedo et al., 2017) achieve good performance with a small number of training trials but incur a high computational cost. Another important limitation of existing methods is the lack of efficient aggregation mechanisms to distill reliable predictions from multiple noisy channels. In particular, majority voting and concatenation suffer from overfitting and poor generalization performance (Pan et al., 2020).

Based on this analysis, we summarize three subproblems, which we address in this letter to develop a robust online mental fatigue evaluation model:

• How to reliably detect mental fatigue using the EEG signals as well as the corresponding RTs

• How to automatically eliminate noninformative channels during the learning process

• How to effectively calibrate the learning model with an EEG signal when its true RT becomes available

We propose our SWORE model in section 3 to address the first two problems and introduce efficient online calibration strategies in section 4 to address the third.

## 3  Self-Weighted Ordinal Regression for Brain Dynamics

In this section, we evaluate mental fatigue indirectly with brain dynamics-related preferences, instead of modeling it as a regular regression task (Resalat & Saba, 2015; Lin et al., 2010), to avoid overfitting to extreme RTs.

### 3.1  Brain Dynamic Preferences

As shown in Figure 1, it is usually difficult for a learning model to obtain an exact estimate of RT, since RT values do not change smoothly and the relationship between RT values and fatigue levels is not exact but relative, varying across time and subjects. The performance worsens further in the online setting, when only a few training trials are available. Meanwhile, a rough but reliable estimate is acceptable in real-world mental fatigue monitoring (Colosio, Shestakova, Nikulin, Blagovechtchenski, & Klucharev, 2017). Therefore, we model the brain dynamics-related preferences instead of the exact values of RT.

Remark 1

(From Regression to Ordinal Regression). Let us revisit the prediction of RT from the perspective of ordinal regression. RT is defined in the complete ordered field $\mathbb{R}$, which carries structural meaning. The relative structure information is entirely preserved among the pairwise comparisons of RTs. Therefore, if there exists a learning model that can maximally preserve all structure information, a new trial can find its own position (a rough estimate of RT) through its comparisons with previously recorded EEG signals. See section 5.3 for more details.

Instead of modeling the global pattern of brain dynamics within a regression model, we consider the local discrepancy between the current and next brain dynamic states. First, the difference between RTs is leveraged as the indicator for the local discrepancy,
$y=\begin{cases}\text{up}:\;+1, & RT_t<RT_{t+1},\\ \text{down}:\;-1, & RT_t>RT_{t+1},\end{cases}$
3.1
where $RT_t$ and $RT_{t+1}$ denote the current and next reaction times, respectively. The brain dynamic preference$^3$ $(x_t^n,x_{t+1}^n)$ (typically a pair of $d$-dimensional feature vectors) can be constructed from the corresponding pairwise EEG signals recorded from each channel ($\forall n=1,2,\ldots,N$). Every EEG sensor used for recording is assumed to record independently from the scalp without influencing the other sensors (Homan, Herman, & Purdy, 1987; Teplan, 2002), so brain dynamic preferences are constructed for each channel independently. Therefore, $N$ brain dynamic preferences are constructed for each comparison.
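To make the construction concrete, here is a minimal sketch (Python with NumPy; the channel count, feature dimension, and function name are illustrative assumptions, not values from the letter) of building the $N$ per-channel preferences for one pair of consecutive trials:

```python
import numpy as np

def make_preferences(eeg_t, eeg_t1, rt_t, rt_t1):
    """Build one brain dynamic preference per channel for a pair of trials.

    eeg_t, eeg_t1: (N, d) arrays, one d-dimensional feature vector per channel
    (hypothetical shapes); rt_t, rt_t1: the two reaction times."""
    y = 1 if rt_t < rt_t1 else -1   # eq. 3.1: up -> +1, down -> -1
    # channels are assumed to record independently, so one pair per channel
    return [(eeg_t[n], eeg_t1[n], y) for n in range(eeg_t.shape[0])]

# one comparison over N = 30 channels with d = 8 features each
prefs = make_preferences(np.zeros((30, 8)), np.ones((30, 8)), 0.6, 1.4)
```

Each comparison thus yields $N$ labeled pairs, all sharing the same ordinal label $y$ but carrying channel-specific feature vectors.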
Remark 2

(Indirect Mental Fatigue Monitoring). The word indirect is adopted in contrast to the use of RT as direct supervision in the regression task. Meanwhile, the objective fatigue indicator used for constructing brain dynamics preferences is not limited to RT. Other well-studied and easily accessible power spectral features (Borghini, Astolfi, Vecchiato, Mattia, & Babiloni, 2014; Chai et al., 2016), such as dynamic time warping, entropy, and functional connectivity, can also be adopted as fatigue supervision (Wang et al., 2018; Bose et al., 2019) for constructing the preferences.

Meanwhile, owing to individual variability, mental fatigue criteria defined by a specific RT value vary from person to person. Ranking-based criteria avoid this issue, since they capture the normal level by modeling the ordering of several EEG signals.

Figure 2:

Gradient flattening with regard to sigmoid function. The dashed line represents the original sigmoid function.

### 3.2  Heterogeneous Brain Dynamic Preferences

The prediction of brain dynamic preferences can be formulated as a learning-to-rank problem in which our goal is to estimate the optimal classifier (see Figure 2). Many binary classification models can be adopted to model this problem, such as logistic ordinal regression:
$P(y|w,x_t^n,x_{t+1}^n)=\sigma(y\,w^T\Delta x^n),\quad\text{where } \Delta x^n=x_{t+1}^n-x_t^n \text{ and } \sigma(z)=1/(1+e^{-z}).$
3.2
However, the vanilla logistic classification model lacks reliability when applied to brain dynamics, since a subtle discrepancy around the classification boundary $P(y|x)=0.5$ leads to the steepest gradient (see Figure 2). Note that a subtle difference between the RTs may be caused not by an intrinsic difference between two brain dynamics but by unknown noise.

To improve model stability, we introduce an insensitive zone, which flattens the steepest gradient around the boundary and therefore makes the classification model less sensitive to subtle differences between the reaction times.

We categorize the brain dynamic preferences into two states according to the discrepancy between the RTs: the shaking state ($Y1$), where the discrepancy between the brain dynamics is significant, and the steady state ($Y2$), where the brain dynamics remain stable,
$y\in\begin{cases}\text{Shaking State } \mathcal{Y}_1:\;\text{up }(RT_t<RT_{t+1})\text{ or down }(RT_t>RT_{t+1}), & |RT_t-RT_{t+1}|>\tau,\\ \text{Steady State } \mathcal{Y}_2, & |RT_t-RT_{t+1}|\le\tau,\end{cases}$
3.3
where $τ$ is the predefined parameter controlling the model sensitivity.
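As a minimal sketch of this state assignment (pure Python; the default $\tau$ value and function name are assumptions for illustration, not values from the letter):

```python
def label_state(rt_t, rt_t1, tau=0.1):
    """Assign a preference to a state via eq. 3.3 (tau in seconds, assumed)."""
    if abs(rt_t - rt_t1) > tau:
        # shaking state Y1: the discrepancy is significant; keep the direction
        return ('Y1', 1 if rt_t < rt_t1 else -1)
    # steady state Y2: the RTs are comparable, so no up/down label is used
    return ('Y2', 0)
```

Larger $\tau$ routes more preferences into the steady state and therefore makes the model less sensitive to small RT fluctuations.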

#### 3.2.1  Shaking State

The shaking state $y\in\mathcal{Y}_1$ has two cases: an up ($RT_{t+1}>RT_t$) and a down ($RT_{t+1}<RT_t$), which can be formulated as a learning-to-rank problem.

Considering the functions of different regions in the human brain, the relative contributions of different channels to human reaction time may vary considerably. If we simply aggregate the $N$ brain dynamic preferences recorded in different channels without making any distinctions about channel reliability, the performance of the learning model would inevitably degrade (Pan et al., 2020). Inspired by Raykar et al. (2010), which aggregates the noisy annotations from multiple crowd workers while considering worker ability, we propose to estimate the reliability of each channel explicitly during the aggregation process. In particular, we formulate a robust pairwise learning-to-rank model:
$P(y|w,\pi^{1:N},x_0^{1:N},x_1^{1:N})=\prod_{n=1}^{N}\left[\pi^n\,\sigma(y\,w^T\Delta x^n)+(1-\pi^n)\,\sigma(-y\,w^T\Delta x^n)\right],\quad y\in\mathcal{Y}_1.$
3.4
This equation models each brain dynamic preference as the weighted arithmetic mean of two cases. The weight $\pi^n\in[0,1]$, estimated during training, denotes the relative contribution of the $n$th channel to the learning task, $\forall n=1,2,\ldots,N$.
Remark 3

(Superiority over the Regular Weighted Average). From the perspective of EEG channel analysis, equation 3.4 provides a new aggregation mechanism to combine the information from different channels. Different from majority voting, which simply categorizes the channels into reliable and noisy ones, this equation performs a fine-grained analysis and further categorizes the noisy channels into nonrelevant ones and negative reliable ones. Therefore, three types of channels can be recognized via the channel reliability $\pi^n\in[0,1]$: positive reliable ones ($\pi^n\to 1^-$),$^4$ nonrelevant ones ($\pi^n\approx 0.5$), and negative reliable ones ($\pi^n\to 0^+$), $\forall n=1,2,\ldots,N$.

#### 3.2.2  Steady State

The steady state $y∈Y2$ denotes the brain dynamic preferences with comparable RTs.

To improve the robustness of the learning model with regard to the easily corrupted brain dynamics, gradient flattening is introduced to model the insensitive zone in Figure 2: it flattens the steepest gradient at the classification boundary $P(y|x)=0.5$, enabling the learning model to be less sensitive to subtle noise. In particular, we model the brain dynamic preferences at steady state as follows,
$P(y|w,x_0^n,x_1^n)=\sigma(w^T\Delta x^n)\,\sigma(-w^T\Delta x^n),\quad y\in\mathcal{Y}_2,$
3.5
which is the geometric mean of an up ($RT_t<RT_{t+1}$) and a down ($RT_t>RT_{t+1}$). (Refer to the learning-to-rank literature for other options, e.g., Zhou, Xue, Zha, & Yu, 2008.) Furthermore, equation 3.5 can be integrated into the robust aggregation model, equation 3.4, which considers the channel reliability $\pi^n$. Due to the symmetry of equation 3.5, we define
$P(y|w,\pi^n,x_1^n,x_2^n)\triangleq P(y|w,x_1^n,x_2^n),\quad\forall y\in\mathcal{Y}_2.$
Remark 4

(Gradient Flattening Enhances Model Robustness). The gradient flattening used in equation 3.5 can be understood as a regularization. It enables our model to be robust to the fluctuation between brain dynamics, which is not relevant to RTs.

### 3.3  Self-Weighted Ordinal Regression Model

In summary, our SWORE for heterogeneous brain dynamic preferences can be formulated as follows:
$P(y|w,\pi^{1:N},x_0^{1:N},x_1^{1:N})=\begin{cases}\prod_{n=1}^{N}\left[\pi^n\,\sigma(y\,w^T\Delta x^n)+(1-\pi^n)\,\sigma(-y\,w^T\Delta x^n)\right], & y\in\mathcal{Y}_1,\\ \prod_{n=1}^{N}\sigma(w^T\Delta x^n)\,\sigma(-w^T\Delta x^n), & y\in\mathcal{Y}_2.\end{cases}$
3.6
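As a minimal numerical sketch of equation 3.6 (Python with NumPy; the channel count, feature vectors, and function names are illustrative assumptions, not part of the letter):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def swore_likelihood(state, y, w, pis, dxs):
    """Likelihood of one labeled comparison under SWORE (eq. 3.6).

    state: 'Y1' (shaking) or 'Y2' (steady); y in {+1, -1} for 'Y1';
    pis: per-channel reliabilities pi_n; dxs: (N, d) array of Delta x^n."""
    like = 1.0
    for pi_n, dx in zip(pis, dxs):
        z = float(w @ dx)
        if state == 'Y1':
            # reliability-weighted mixture of the two complementary classifiers
            like *= pi_n * sigmoid(y * z) + (1.0 - pi_n) * sigmoid(-y * z)
        else:
            # gradient-flattened steady-state term, independent of pi_n
            like *= sigmoid(z) * sigmoid(-z)
    return like
```

Note that for a nonrelevant channel ($\pi^n\approx 0.5$), the shaking-state factor equals 0.5 regardless of $z$, so the channel contributes no directional information, which matches the behavior described in remark 5.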
Remark 5

(Reliability of the SWORE Model). (1) Interchannel reliability: SWORE trusts the brain dynamic preferences from only positive and negative reliable channels. Since SWORE trains a mixture of two complementary classifiers with shared parameter $w$, it categorizes the channels into positive channels ($\pi^n\to 1^-$), negative channels ($\pi^n\to 0^+$), and nonrelevant channels ($\pi^n\approx 0.5$). Based on the channel reliability $\pi$, SWORE can automatically choose the suitable classifier to extract the correct information from the positive and negative channels and update the shared parameter $w$ accordingly. Further, it ignores information from nonrelevant channels by assigning a constant likelihood (i.e., 0.5) to each brain dynamic preference from the nonrelevant channels.

(2) Intrachannel reliability: SWORE extracts only task-related information from each brain dynamic preference. Since the probability of the steady state $y∈Y2$ (see equation 3.5) does not depend on channel reliability $πn$, gradient flattening actually performs as a regularization on the regression weight $w$ and enables SWORE to be robust to the random fluctuations that exist in brain dynamics.

## 4  Efficient Online Updating Strategy

As we discussed in section 2.2, brain dynamics are nonstationary. If the SWORE model cannot be updated, it suffers from low generalization performance. Therefore, in this section, we introduce an efficient online updating strategy that updates SWORE accurately while introducing only marginal computation cost.

### 4.1  Bayesian Moment Matching

Bayesian moment matching (BMM) is used to estimate the model parameters. Specifically, it estimates the parameters of the approximated posterior by matching a set of sufficient moments of the exact complex posterior. Moreover, it can be extended to a sequential update paradigm for large-scale or streaming data sets, such as online BMM (Jaini et al., 2017). That is, the approximated posterior is updated with each sample rather than with the entire data set each time.

First, SWORE is extended to its Bayesian version. Specifically, a gaussian prior is introduced for the weight vector $w$, $w\sim\mathcal{N}(\mu,\Sigma)$, while a beta prior is introduced for each channel reliability $\pi^n$, namely, $p_0(\pi)=\prod_{n=1}^{N}\mathrm{Beta}(\pi^n|\alpha^n,\beta^n)$. Given a brain dynamic preference $(x_t^n,x_{t+1}^n)$ recorded in the $n$th channel with its ordinal supervision $y$, the posterior of the model parameters can be represented as
$P(w,\pi|y,\Delta x^n)=\frac{P(y|w,\pi,\Delta x^n)\,p_0(w)\,p_0(\pi)}{P(y|\Delta x^n)}.$
4.1
Note that we consider the posterior distribution with regard to brain dynamic preferences from only one channel. Since different channels are modeled independently, the following equations can easily be extended to the posterior distribution with regard to the brain dynamic preferences from all channels.

The main issue with equation 4.1 is that the joint posterior distribution $P(w,\pi|y,\Delta x^n)$ is complicated or even intractable. To keep the computation tractable, we adopt the mean-field assumption and project the posterior into the same form as the prior (the product of a normal with betas, i.e., $P(w,\pi|y,\Delta x^n)\approx q(w)q(\pi)=\mathcal{N}(w|\mu,\Sigma)\prod_{n=1}^{N}\mathrm{Beta}(\pi^n|\alpha^n,\beta^n)$). Then the posterior parameters are estimated by matching a set of sufficient moments of the approximate posterior with the exact posterior:

• Match the moments between $q(w)$ and $P(w|y,\Delta x^n)$: $\int w\,q(w)\,dw=\int w\,P(w|y,\Delta x^n)\,dw$ and $\int ww^T q(w)\,dw=\int ww^T P(w|y,\Delta x^n)\,dw$. Due to the nonconjugacy between the marginalized likelihood $P(y|w,\Delta x^n)$$^5$ and the normal prior $\mathcal{N}(w|\mu,\Sigma)$, the posterior $P(w|y,\Delta x^n)$ is complex. Therefore, the posterior parameters $(\mu_{new},\Sigma_{new})$ cannot be computed analytically because of the intractability of the integrals in the moment constraints.

• Match the moments between $q(\pi)$ and $P(\pi|y,\Delta x^n)$: $\int \pi^n q(\pi)\,d\pi=\int \pi^n P(\pi|y,\Delta x^n)\,d\pi$ and $\int (\pi^n)^2 q(\pi)\,d\pi=\int (\pi^n)^2 P(\pi|y,\Delta x^n)\,d\pi$, $n=1,2,\ldots,N$. Fortunately, we can solve these moment constraints with closed-form integrals and obtain the posterior parameters $(\alpha^n_{new},\beta^n_{new})$, $\forall n=1,2,\ldots,N$, accordingly.

### 4.2  Generalized Bayesian Moment Matching

Inspired by the Bayesian approximation method proposed by Weng and Lin (2011), which extends Stein's lemma (Woodroofe, 1989), we propose to estimate the posterior parameters $(\mu_{new},\Sigma_{new})$ of the approximate posterior $q(w)$ by differential operations instead of integral operations. The BMM algorithm is thereby extended to the general situation where the likelihood function is twice differentiable.

Theorem 1.
Assume $f(w)$ is the marginalized likelihood of one brain dynamic preference and is (almost everywhere) twice differentiable. Upon updating with this preference, the posterior parameters $(\mu_{new},\Sigma_{new})$ of the weight $w$ can be estimated as
$\mu_{new}\approx\mu+\Sigma\times\left.\frac{d\log f(w)}{dw}\right|_{w=\mu},$
4.2a
$\Sigma_{new}\approx\Sigma+\Sigma\times\left.\frac{d^2\log f(w)}{dw\,dw^T}\right|_{w=\mu}\times\Sigma.$
4.2b

We set $w=\mu$ since we expect the posterior density of $w$ to be concentrated around $\mu$ (Weng & Lin, 2011). See the appendix for the detailed proof of theorem 1.

In the following, we resort to the generalized Bayesian moment matching (GBMM) method to estimate the posterior parameters. We take the brain dynamic preference $(x_t^n,x_{t+1}^n)$ at the shaking state $y\in\mathcal{Y}_1$ as an example; the equations can easily be extended to brain dynamic preferences at the steady state.

We first update the hyperparameters $(\mu,\Sigma)$ of $w$ and then the hyperparameters $(\alpha^n,\beta^n)$ of $\pi^n$, $\forall n=1,2,\ldots,N$. To update $w$, we integrate out $\pi^n$ to obtain the marginalized likelihood function $f(w)$:
$f(w)=\int P(y|w,\pi,\Delta x^n)\,\mathrm{Beta}(\pi|\alpha,\beta)\,d\pi=\frac{\alpha^n}{\alpha^n+\beta^n}\,\sigma(y\,w^T\Delta x^n)+\frac{\beta^n}{\alpha^n+\beta^n}\,\sigma(-y\,w^T\Delta x^n).$
According to equation 4.2a, we can update $μ$ as follows:
$\mu_{new}\approx\mu+\Sigma\times\left.\frac{d\log f(w)}{dw}\right|_{w=\mu}=\mu+(A-a)\times\Sigma\times\Delta x^n,$
4.3
where $A=\frac{\alpha^n}{\alpha^n+\beta^n e^{-\mu^T\Delta x^n}}$ and $a=\frac{1}{1+e^{-\mu^T\Delta x^n}}$. According to equation 4.2b, we can update $\Sigma$ as follows:
$\Sigma_{new}\approx\Sigma+\Sigma\times\left.\frac{d^2\log f(w)}{dw\,dw^T}\right|_{w=\mu}\times\Sigma\approx\Sigma+\kappa I+\left[A(1-A)-a(1-a)\right]\times\Sigma\times\Delta x^n(\Delta x^n)^T\times\Sigma,$
4.4
where $\kappa$ is a small positive value ensuring a positive-definite covariance matrix and $I$ is the identity matrix.
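The two analytic updates above can be sketched as follows (Python with NumPy; the jitter value for $\kappa$ and the convention of folding the label $y$ into $\Delta x^n$ are our own assumptions for the sketch, not choices documented in the letter):

```python
import numpy as np

def update_w(mu, Sigma, dx, alpha_n, beta_n, kappa=1e-6):
    """One-step posterior update of w via eqs. 4.3 and 4.4 for a shaking-state
    preference from channel n (label y assumed folded into dx)."""
    e = np.exp(-float(mu @ dx))
    A = alpha_n / (alpha_n + beta_n * e)      # reliability-weighted term
    a = 1.0 / (1.0 + e)                       # sigmoid evaluated at w = mu
    mu_new = mu + (A - a) * (Sigma @ dx)      # eq. 4.3
    Sdx = Sigma @ dx
    Sigma_new = (Sigma + kappa * np.eye(len(mu))
                 + (A * (1 - A) - a * (1 - a)) * np.outer(Sdx, Sdx))  # eq. 4.4
    return mu_new, Sigma_new
```

With a symmetric reliability prior ($\alpha^n=\beta^n$), $A=a$ and the mean is left unchanged: a channel with no accumulated evidence about its polarity contributes no directional update.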
Similarly, to update $\pi$, we first integrate out $w$ to obtain the marginalized likelihood $f(\pi^n)$, $\forall n=1,2,\ldots,N$, for each preference:
$f(\pi^n)=\int P(y|w,\pi^n,\Delta x^n)\,\mathcal{N}(w|\mu,\Sigma)\,dw=\pi^n\,\mathbb{E}_{\mathcal{N}(w|\mu,\Sigma)}[\sigma(y\,w^T\Delta x^n)]+(1-\pi^n)\,\mathbb{E}_{\mathcal{N}(w|\mu,\Sigma)}[\sigma(-y\,w^T\Delta x^n)].$
4.5
Letting $R_1=\mathbb{E}_{\mathcal{N}(w|\mu,\Sigma)}[\sigma(y\,w^T\Delta x^n)]$, we calculate $R_1$ by the second-order Taylor approximation of $\sigma(y\,w^T\Delta x^n)$ at $\mu$. Then we have $R_2=\mathbb{E}_{\mathcal{N}(w|\mu,\Sigma)}[\sigma(-y\,w^T\Delta x^n)]=1-R_1$, and the normalization constant $P(y|\Delta x^n)$ can be represented as
$R=P(y|\Delta x^n)=\int f(\pi^n)\,\mathrm{Beta}(\pi^n|\alpha^n,\beta^n)\,d\pi^n=\frac{\alpha^n R_1+\beta^n R_2}{\alpha^n+\beta^n}.$
4.6
According to Bayes' theorem, the posterior distribution of $\pi^n$ is $P(\pi^n|y,\Delta x^n)=\frac{f(\pi^n)\,\mathrm{Beta}(\pi^n|\alpha^n,\beta^n)}{R}$; the moments $\mathbb{E}[\pi^n]$ and $\mathbb{E}[(\pi^n)^2]$ with regard to $P(\pi^n|y,\Delta x^n)$ can be computed as
$\mathbb{E}_{P(\pi^n|y,\Delta x^n)}[\pi^n]=\frac{R_1(\alpha^n+1)\alpha^n+R_2\alpha^n\beta^n}{R\,(\alpha^n+\beta^n+1)(\alpha^n+\beta^n)},$
4.7a
$\mathbb{E}_{P(\pi^n|y,\Delta x^n)}[(\pi^n)^2]=\frac{\alpha^n(\alpha^n+1)\left[R_1(\alpha^n+2)+R_2\beta^n\right]}{R\,(\alpha^n+\beta^n+2)(\alpha^n+\beta^n+1)(\alpha^n+\beta^n)},$
4.7b
where $n=1,2,\ldots,N$ (see the appendix for detailed derivations). Then we can update the hyperparameters $(\alpha^n,\beta^n)$ of $\pi^n$:
$\alpha^n_{new}=\frac{\left(\mathbb{E}[\pi^n]-\mathbb{E}[(\pi^n)^2]\right)\mathbb{E}[\pi^n]}{\mathbb{E}[(\pi^n)^2]-\left(\mathbb{E}[\pi^n]\right)^2},$
4.8a
$\beta^n_{new}=\frac{\left(\mathbb{E}[\pi^n]-\mathbb{E}[(\pi^n)^2]\right)\left(1-\mathbb{E}[\pi^n]\right)}{\mathbb{E}[(\pi^n)^2]-\left(\mathbb{E}[\pi^n]\right)^2},$
4.8b
where we omit the subscript $P(πn|y,Δxn)$ of the expectation operator for simplicity.
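A compact sketch of the reliability update (Python with NumPy; the second-order Taylor term for $R_1$ follows the text, while folding the label $y$ into $\Delta x^n$ and the function name are our own assumptions):

```python
import numpy as np

def update_pi(alpha_n, beta_n, mu, Sigma, dx):
    """One-step posterior update of pi_n via eqs. 4.5-4.8."""
    m = float(mu @ dx)
    s = float(dx @ Sigma @ dx)          # variance of w^T dx under q(w)
    sig = 1.0 / (1.0 + np.exp(-m))
    # R1 = E[sigma(w^T dx)] by second-order Taylor expansion around mu;
    # sigma''(z) = sigma(z)(1 - sigma(z))(1 - 2 sigma(z))
    R1 = sig + 0.5 * sig * (1.0 - sig) * (1.0 - 2.0 * sig) * s
    R2 = 1.0 - R1
    R = (alpha_n * R1 + beta_n * R2) / (alpha_n + beta_n)          # eq. 4.6
    ab = alpha_n + beta_n
    E1 = (R1 * (alpha_n + 1) * alpha_n + R2 * alpha_n * beta_n) \
         / (R * (ab + 1) * ab)                                     # eq. 4.7a
    E2 = alpha_n * (alpha_n + 1) * (R1 * (alpha_n + 2) + R2 * beta_n) \
         / (R * (ab + 2) * (ab + 1) * ab)                          # eq. 4.7b
    var = E2 - E1 ** 2
    alpha_new = (E1 - E2) * E1 / var                               # eq. 4.8a
    beta_new = (E1 - E2) * (1.0 - E1) / var                        # eq. 4.8b
    return alpha_new, beta_new
```

When a preference carries no directional evidence ($\mu^T\Delta x^n=0$, so $R_1=R_2=0.5$), the matched posterior coincides with the prior, as one would expect.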

### 4.3  Online GBMM for the Calibration of SWORE

According to this analysis, we summarize online GBMM for SWORE in algorithm 1. Notably, both the weight update and the channel reliability update follow analytic rules (see equations 4.3, 4.4, 4.8a, and 4.8b). As a result of this efficient posterior updating procedure, online GBMM naturally enables SWORE to handle streaming preferences.

## 5  Online Mental Fatigue Evaluation

In this section, we apply the SWORE model (see equation 3.6) to perform online mental fatigue monitoring. First, we introduce data augmentation tricks to address the low data volume in the online scenario. Then we propose to maintain a brain dynamic table (BDtable), which sequentially stores the representative EEG signals. Finally, we summarize the entire framework for online mental fatigue evaluation.

### 5.1  Blank-Out Noise Model for Data Augmentation

Due to the limited size of available trials, the learning model is prone to overfitting during training. Therefore, we adopt a data augmentation trick: we replace the original EEG signals with $T$ corrupted versions drawn from a predefined corrupting distribution $P(\Delta\tilde{x}|\Delta x^n)$.$^6$ For simplicity, we focus on the blank-out noise model (a.k.a. dropout) as the corrupting distribution, which randomly omits subsets of neurons (or features); more precisely,
$P(\Delta\tilde{x}_l|\Delta x_l;\theta)=\begin{cases}\theta, & \Delta\tilde{x}_l=0,\\ 1-\theta, & \Delta\tilde{x}_l=\Delta x_l,\end{cases}$
5.1
where $\Delta\tilde{x}_l\in\{0,\Delta x_l\}$, $\forall l=1,2,\ldots,d$, and $d$ is the feature dimension.

Note that each dimension of the input $\Delta x^n$ is corrupted independently. Equation 5.1 is also a promising technique for breaking up the complex coadaptations caused by high correlation among different dimensions of the EEG signals (in either the time or frequency domain). Since the presence of any particular dimension is unreliable, a dimension cannot rely on other specific dimensions to correct its mistakes; it must perform well in the wide variety of contexts provided by the other dimensions.

Therefore, adding data augmentation (see equation 5.1) before line 5 of algorithm 1 can improve the generalization of our SWORE in more complex situations (see Figure 10).
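A minimal sketch of the blank-out corruption (Python with NumPy; the $\theta$ and $T$ values, the seed, and the function name are illustrative assumptions):

```python
import numpy as np

def blank_out(dx, theta=0.2, T=5, rng=None):
    """Draw T corrupted copies of dx via the blank-out model (eq. 5.1):
    each dimension is independently zeroed with probability theta."""
    rng = np.random.default_rng(0) if rng is None else rng
    keep = rng.random((T, dx.shape[0])) >= theta   # kept with prob 1 - theta
    return keep * dx                               # dropped entries become 0

# four corrupted copies of a 16-dimensional difference vector
copies = blank_out(np.ones(16), theta=0.3, T=4)
```

Each corrupted copy is then fed to the updating procedure exactly as if it were an observed preference difference.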

### 5.2  Online Reservoir Sampling for BDtable

Our SWORE model requires brain dynamics-related preferences, constructed from the current EEG signals and previously observed ones, for each update. Accordingly, a brain dynamic table (BDtable) is introduced to store the EEG signals, which helps calibrate our evaluations and guide the model updating process. Considering the requirement of high computational efficiency in online applications, the BDtable should provide a good summary of previous EEG signals.

Since no prior knowledge about each subject is available, we propose to build the BDtable with random sampling: each element of the BDtable is uniformly sampled from the EEG signals seen so far. In particular, reservoir sampling is proven to meet this requirement (Vitter, 1985). It is carried out to sequentially maintain the BDtable following algorithm 2, where $S$ denotes the size of the BDtable.
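Algorithm 2 is not reproduced here, but the classical reservoir sampling it follows (Vitter, 1985) can be sketched in a few lines of pure Python (names are illustrative):

```python
import random

def reservoir_update(table, item, t, S, rng=random.Random(0)):
    """One step of reservoir sampling: after seeing t items (1-indexed),
    `table` holds a uniform random sample of size S. In the BDtable
    setting, `item` would be an (EEG-signal, RT) pair."""
    if len(table) < S:
        table.append(item)          # fill the reservoir first
    else:
        j = rng.randint(0, t - 1)   # replace a slot with probability S / t
        if j < S:
            table[j] = item
    return table

# Stream 100 trials through a size-10 reservoir.
bd_table = []
for t, trial in enumerate(range(100), start=1):
    reservoir_update(bd_table, trial, t, S=10)
```

After any number of trials, every item seen so far has the same probability $S/t$ of residing in the table, which is exactly the uniform summary the BDtable needs.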

### 5.3  The Framework for Online Mental Fatigue Evaluation

Assume the SWORE model $\{w, \pi_{1:N}\}$ and the BDtable $\{x_i^{1:N}, RT_i\}_{i=1:S}$ have been updated to time $t-1$ following algorithms 1 and 2, respectively. Online mental fatigue monitoring refers to predicting $RT_t$ from the EEG signals $x_t^{1:N}$, extracted at time $t$, using the up-to-date SWORE model and BDtable.

As stated in section 3.1, the relative ordinal structure of RT is revealed through brain-dynamics-related preferences, which are maximally preserved by our SWORE model $\{w, \pi_{1:N}\}$. The relative ordinal structure of RT is consistent with that of $w^T x$ according to the definition of the SWORE model,
$RT_i > RT_j \iff \begin{cases} w^T x_i^n > w^T x_j^n, & \pi_n > 0.5, \\ w^T x_i^n < w^T x_j^n, & \pi_n < 0.5, \end{cases}$
or we can simply formulate it as
$RT_i > RT_j \iff \mathrm{sgn}(\pi_n - 0.5)\, w^T x_i^n > \mathrm{sgn}(\pi_n - 0.5)\, w^T x_j^n,$
5.2
where $i, j$ denote the indices of different trials and $n$ denotes the channel index.
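A minimal sketch of the sign rule in equation 5.2 (pure Python; the function name is ours):

```python
def predicted_order(w, xi, xj, pi_n):
    """Predict the RT order of trials i and j from a single channel:
    sgn(pi_n - 0.5) flips the comparison for channels that correlate
    negatively with RT. Returns +1 if RT_i > RT_j is predicted, else -1."""
    s = 1.0 if pi_n > 0.5 else -1.0
    score_i = s * sum(wl * xl for wl, xl in zip(w, xi))
    score_j = s * sum(wl * xl for wl, xl in zip(w, xj))
    return 1 if score_i > score_j else -1

# The same pair of trials, judged by a positive and a negative channel.
order_pos = predicted_order([1.0, 0.0], [2.0, 0.0], [1.0, 0.0], pi_n=0.9)  # 1
order_neg = predicted_order([1.0, 0.0], [2.0, 0.0], [1.0, 0.0], pi_n=0.1)  # -1
```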
Then we compare the newly collected EEG signals $x_t^{1:N}$ to the reference EEG signals stored in the BDtable following equation 5.2 and derive $N$ full ranking lists over $S+1$ trials, one per channel:
$\begin{aligned} &\mathrm{sgn}(\pi_1 - 0.5)\, w^T \times \{x_t^1, x_1^1, x_2^1, \dots, x_S^1\}, && n = 1,\\ &\mathrm{sgn}(\pi_2 - 0.5)\, w^T \times \{x_t^2, x_1^2, x_2^2, \dots, x_S^2\}, && n = 2,\\ &\qquad \vdots \\ &\mathrm{sgn}(\pi_N - 0.5)\, w^T \times \{x_t^N, x_1^N, x_2^N, \dots, x_S^N\}, && n = N. \end{aligned}$
5.3
Figure 3:

Online mental fatigue evaluation.


Let $\mathrm{Sort}(x_t^n)$ denote the ranking position of the estimated $RT_t$ for the EEG signals $x_t^n$ in terms of the $n$th channel, relative to the EEG signals stored in the BDtable. The ranking position of the estimated $RT_t$ over all $N$ channels is derived by aggregating the results from all channels while accounting for channel reliability:
$\mathrm{Sort}(x_t^{1:N}) = \frac{\sum_{n=1}^{N} |2\pi_n - 1| \times \mathrm{Sort}(x_t^n)}{\sum_{k=1}^{N} |2\pi_k - 1|}.$
5.4
Correspondingly, we sort the $S$ reaction times $\{RT_i\}_{i=1:S}$ stored in the BDtable to derive the full ranking list. Following the consistency of the relative ordinal structure between RTs and brain-dynamics-related preferences (see equation 5.2), the $RT_t$ corresponding to the newly recorded EEG signals $x_t^{1:N}$ can be estimated as the average of the stored RTs whose ranking positions are close to $\mathrm{Sort}(x_t^{1:N})$.
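The rank aggregation of equations 5.3 and 5.4 and the subsequent coarse RT estimate can be sketched as follows (pure Python; `estimate_rt` picks the single nearest-ranked RT, a simplification of averaging the nearby ones):

```python
def rank_position(score_t, ref_scores):
    """Ranking position (0 = smallest) of the new trial's score among the
    S reference scores, as produced channel-wise by equation 5.3."""
    return sum(1 for s in ref_scores if s < score_t)

def aggregate_rank(per_channel_ranks, pis):
    """Equation 5.4: reliability-weighted average of the per-channel
    ranking positions, with weights |2*pi_n - 1|."""
    weights = [abs(2 * p - 1) for p in pis]
    return sum(w * r for w, r in zip(weights, per_channel_ranks)) / sum(weights)

def estimate_rt(agg_rank, rts):
    """Coarse RT estimate: the stored RT whose rank is nearest to the
    aggregated rank (a simplification of averaging the nearby RTs)."""
    ranked = sorted(rts)
    idx = min(max(int(round(agg_rank)), 0), len(ranked) - 1)
    return ranked[idx]

# One channel's rank for a new score against four reference scores.
pos = rank_position(1.2, [0.5, 1.0, 1.5, 2.0])  # 2
# Two channels: one reliable (pi = 0.95), one nearly uninformative (pi = 0.55).
agg = aggregate_rank([4, 1], pis=[0.95, 0.55])
rt_hat = estimate_rt(agg, rts=[0.6, 0.7, 0.9, 1.4, 2.0])
```

The near-random channel ($\pi = 0.55$) barely moves the aggregate, so the estimate is dominated by the reliable channel, which is the intended effect of the $|2\pi_n - 1|$ weights.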

The framework of online mental fatigue evaluation is summarized in Figure 3. In the first $S$ trials, we build the BDtable with the $S$ EEG signals and their corresponding RTs and then initialize SWORE. For each newly collected EEG signal $x_t^{1:N}$, the SWORE model conducts an indirect mental fatigue evaluation by giving a coarse estimate of $RT_t$ following equation 5.4. When the reaction time $RT_t$ becomes available, we calibrate the SWORE model following algorithm 1 and update the BDtable online following algorithm 2.

### 5.4  Complexity Analysis of the Framework

In this section, we analyze the space complexity and computational complexity of our framework. In particular, let $d$, $S$, and $N$ denote the dimension of the feature vector, the size of the BDtable, and the number of channels, respectively. Meanwhile, we use $T$ to denote the number of data augmentations in equation 5.1 and $M$ to denote the number of sequential trials.

The storage of the online system consists of two parts: $O(d+N)$ for the model parameters $(\mu, \Sigma, \{\alpha_n, \beta_n\}_{n=1}^{N})$ and $O(SNd)$ for the BDtable $\{x_i^{1:N}, RT_i\}_{i=1:S}$. Therefore, the overall space complexity is $O(SNd)$.

We analyze the complexity of the online system from the aspects of prediction and calibration, respectively:

• Prediction complexity. The prediction consists of two steps: equations 5.3 and 5.4. The computational complexity is $O(SNd)$ for equation 5.3 and $O(SN\log S)$ for equation 5.4. Therefore, the overall computational complexity of prediction for $M$ sequential trials is $O(MSN(d + \log S))$.

• Calibration complexity. According to algorithm 1, the most time-consuming step is the matrix multiplication in equation 4.4, with $O(d^3)$ time complexity. Therefore, the overall time complexity for $M$ calibration steps is $O(MTSNd^3)$, which decreases to $O(MTSNd)$ if a diagonal covariance matrix is adopted.

According to this analysis, the proposed online fatigue monitoring system is both space and time efficient, since both complexities are linear in each factor.

## 6  Numerical Experiments

In this section, we first introduce the experimental setup of mental fatigue monitoring. Then we explore the reliability of our SWORE model in online mental fatigue evaluation tasks. We also analyze the parameter sensitivity and the model uncertainty of SWORE with regard to the proposed online GBMM algorithm.

### 6.1  Experiment Setup

#### 6.1.1  Data Collection

This letter uses the EEG data introduced in Huang, Jung, and Makeig (2009). Forty healthy male adults aged 20 to 30 years were recruited to participate in a sustained-attention driving experiment in a virtual driving simulation environment (see Figure 4). All subjects participated in the experiment for 90 minutes, beginning between 1:00 p.m. and 2:00 p.m. At the beginning of the experiment, a 5-minute pretest was performed to ensure that every subject understood the instructions and did not suffer from simulator-induced nausea. The experimental paradigm simulated a nighttime driving situation on a four-lane highway, and lane changing was randomly triggered to make the car drift from the original cruising lane toward the left or the right. Each participant was instructed to quickly compensate by steering the wheel. A complete trial in this study, including a 1 s baseline, deviation onset, response onset, and response offset, is shown in Figure 4. EEG signals were recorded simultaneously. The next trial occurred within an interval of 5 s to 10 s after the completion of the current trial, in which the subject had to drive back to the centerline of the third car lane. If a subject fell asleep during the experiment, there was no feedback to alert him. For each trial $t$, the 10 s EEG signals $\{x_{n,t}\}_{n=1}^{N}$ from $N$ ($=$ 33) different EEG channels before the deviation onset were recorded, and the corresponding reaction time $RT_t$ was collected.

Figure 4:

Event-related lane-departure driving paradigm.


A wired EEG cap with 33 Ag/AgCl electrodes, including 30 EEG electrodes, 2 reference electrodes (A1 and A2), and 1 vehicle position channel (VP), was used to record the electrical activity of the brain from the scalp during the driving task. The EEG electrodes were placed according to a modified international 10–20 system. The contact impedance between all electrodes and the skin was kept below 5 kΩ. The EEG recordings, amplified by the Scan SynAmps2 Express system (Compumedics Ltd., VIC, Australia), were digitized at 500 Hz (resolution: 16 bits). Before data analysis, the raw EEG data were preprocessed. First, we used a digital bandpass (1–50 Hz) zero-phase FIR filter (the eegfilt.m routine from the EEGLAB toolbox) to remove the power line noise and low-frequency drift. Then the signals were downsampled to 250 Hz to reduce the volume of data. Finally, we manually removed artifacts such as random and persistent disturbances from body motion, eye movement, eye blinking, muscle activity, EEG channel malfunction, and environmental noise.

#### 6.1.2  Data Preprocessing and Preferences Construction

Following Huang, Pal, Chuang, and Lin (2015) and Pan et al. (2020), we preprocessed the EEG signals as follows. Considering the time delay among channels in the time domain, Fourier transforms (Welch, 1967) were applied to transform the EEG time series into the frequency domain. Further, to reduce computational overhead, only the EEG power within 0 to 30 Hz was retained, which is considered the most relevant to the RTs (Huang et al., 2015).
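As a rough stand-in for this feature extraction step (a plain DFT periodogram in pure Python rather than the Welch estimate used in the letter; names are ours):

```python
import math

def band_power(signal, fs, f_lo=0.0, f_hi=30.0):
    """Power spectrum of one EEG epoch restricted to [f_lo, f_hi] Hz,
    computed with a plain O(n^2) DFT for illustration.
    Returns a list of (frequency, power) pairs."""
    n = len(signal)
    out = []
    for k in range(n // 2 + 1):
        f = k * fs / n
        if f_lo <= f <= f_hi:
            re = sum(x * math.cos(2 * math.pi * k * t / n)
                     for t, x in enumerate(signal))
            im = -sum(x * math.sin(2 * math.pi * k * t / n)
                      for t, x in enumerate(signal))
            out.append((f, (re * re + im * im) / n))
    return out

# A pure 10 Hz tone sampled at 250 Hz concentrates its power in one bin.
sig = [math.sin(2 * math.pi * 10 * t / 250) for t in range(250)]
spectrum = band_power(sig, fs=250)
peak_freq = max(spectrum, key=lambda fp: fp[1])[0]
```

In practice a windowed Welch estimate would replace the raw periodogram, but the restriction to the 0–30 Hz band is the same.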

Two types of preferences were constructed following Pan et al. (2020), both from pairwise RT comparisons $(RT_0, RT_1)$. The shaking-state preferences $Y_1$ consist of the pairs $(RT_0 \prec RT_1)$ whose RTs differ by more than the margins governed by $\tau_1$ and $\tau_2$, and vice versa for $RT_0 > RT_1$. The steady-state preferences $Y_2$ are constructed analogously with the looser margins governed by $\tau_3$ and $\tau_4$. Note that $\tau_1 > \tau_3 > 0$ and $\tau_2 > \tau_4 > 1$ control the sensitivity of the mental fatigue evaluation. We empirically set $\tau_1 = 0.15$, $\tau_2 = 1.2$, $\tau_3 = 0.1$, and $\tau_4 = 1.1$ for all participants in our experiment.

For scenarios where RT is not available, other well-studied power spectral features can be adopted as fatigue indicators. For example, dynamic time warping, used in Wang et al. (2018) and Bose et al. (2019), has proved to be consistent with RT and can also be adopted for constructing the brain dynamics preferences.

#### 6.1.3  Evaluation Metric

Since mental fatigue monitoring is formulated as an ordinal regression task, the efficacy of SWORE can be evaluated by the consistency between its prediction and RTs in terms of order maintenance. In particular, we adopted the Wilcoxon-Mann-Whitney statistics (Yan, Dodier, Mozer, & Wolniewicz, 2003) and calculated the prediction accuracy of each trial as follows:
$ACC = \frac{1}{S}\sum_{s=1}^{S} I(y_s = \hat{y}_s), \qquad \hat{y}_s = \mathrm{sgn}\!\left(\sum_{n=1}^{N} \hat{y}_s(n)\,\big[I(\pi_n > \kappa) - I(\pi_n < 1-\kappa)\big]\right),$
where $S$ is the number of EEG signals stored in the BDtable and $N$ is the number of channels. $I(\cdot)$ is an indicator that returns one if the argument holds and zero otherwise. $y_s$ is the ground-truth order between the new trial and the $s$th EEG signal in the BDtable (see equation 3.3), which is derived from RT or other fatigue indicators. $\hat{y}_s(n) = 2\sigma(w^T \Delta x_{s,n}) - 1$ is the predicted order for the brain dynamic preference $(x_0^n, x_1^n)_s$ from the $n$th channel: $+1$ denotes an up, and $-1$ denotes a down. We then derive the final order by aggregating the predictions of all $N$ channels by majority voting, considering the predictions from reliable channels only, where a channel is recognized as reliable if it satisfies $\pi_n > \kappa$ or $\pi_n < 1-\kappa$. $\kappa$ is set to 0.85 in this letter.
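A compact sketch of this metric (pure Python; names are ours) that applies the reliability gate and the majority vote:

```python
def prediction_accuracy(y_true, y_pred_per_channel, pis, kappa=0.85):
    """ACC over S BDtable pairs: y_true[s] in {+1, -1} is the ground-truth
    order against the s-th stored trial; y_pred_per_channel[s][n] is
    channel n's predicted order. Channels with pi in (1-kappa, kappa)
    are gated out; negatively reliable channels (pi < 1-kappa) vote
    with a flipped sign, as in the I(pi>k) - I(pi<1-k) term."""
    def sgn(v):
        return 1 if v > 0 else -1
    correct = 0
    for ys, preds in zip(y_true, y_pred_per_channel):
        vote = sum(p * ((pi > kappa) - (pi < 1 - kappa))
                   for p, pi in zip(preds, pis))
        correct += (sgn(vote) == ys)
    return correct / len(y_true)

# Three channels: reliable-positive, reliable-negative, uninformative.
acc = prediction_accuracy([1, -1], [[1, -1, 1], [-1, 1, -1]],
                          pis=[0.9, 0.1, 0.5])
```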

$ACC \in [0, 1]$ denotes the consistency between the prediction and the ordering of the reaction time: the higher, the better. $ACC = 1$ means the learning model correctly captures the level of current mental fatigue based on the reference EEG signals; $ACC = 0.5$ means it captures nothing about the current level; and $ACC = 0$ means it captures the level but in the reverse order. The ranking-based ACC metric is a relaxation of the mean absolute error (MAE): ACC is more robust than MAE to extreme RTs and to local perturbations around each RT. The optimum under MAE is also optimal for ACC, but not vice versa; therefore, achieving the optimum of ACC is much easier than that of MAE.

#### 6.1.4  Baselines

We consider only the data-driven mental fatigue evaluation approaches in the previous literature. Among the regression methods, support vector regression (SVR) (Bose et al., 2019) and neural network-based regression (Pan et al., 2020) have been verified to achieve superior performance. Since neural network-based regression requires very large training samples, we consider SVR (Bose et al., 2019), for which a small number of samples is sufficient. In terms of classification methods, CArank (Pan et al., 2020), which requires large numbers of training samples, is not suitable for our online scenario. Alternatively, we consider the support vector machine (SVM), random forest, and logistic ordinal regression (LOR). LOR is a special case of SWORE that models only the shaking state ($Y_1$).

All baselines (SVR, SVM, Random Forest, LOR) and our SWORE were implemented in Matlab. In particular, we adopt the RBF kernel for SVR and SVM following Bose et al. (2019). Since SVR, SVM, Random Forest, and LOR have no mechanism to evaluate the channel state beforehand, we simply concatenate the EEG signals from all channels as the feature vector, which achieved better performance than aggregating the outputs of different channels by majority voting. For all methods, we use the first 20 trials for pretraining. We fixed SVR, SVM, and Random Forest after pretraining since no efficient online calibration methods are available for them. LOR and SWORE can be efficiently calibrated online with the update strategy introduced in section 4. The size of the BDtable is set to 10 for LOR and SWORE, that is, $S = 10$ in algorithm 2. For a fair comparison, we calculate the prediction accuracy of all methods on each trial using the same dynamically updated BDtable. Note that two kinds of LOR are considered for better comparison: LOR, denoting LOR without online calibration, and online LOR, denoting LOR with online calibration.

### 6.2  Comparison with Offline Regression/Classification Methods

Following the online mental fatigue evaluation framework proposed in Figure 3, we explored the reliability of SWORE in the online monitoring scenario. Specifically, we leveraged the prerecorded 20 trials to pretrain the embryonic SVR, SVM, Random Forest, LOR, and SWORE, respectively. For SWORE, we randomly initialized $\mu$ in $[-10^{-2}, 10^{-2}]$ and $\Sigma$ in $[0, 10^{-4} \times I]$ and set $\alpha_n = \beta_n = 5$ according to our parameter sensitivity analysis in section 6.5. The data augmentation size $T$ is set to 1 during pretraining and 3 during online updating according to Figure 10, since these correspond to a data-sufficient and a data-insufficient scenario, respectively. The brain dynamics table size is fixed at 10. We then sequentially obtain the coarse estimate of RT for each new trial, determine the prediction accuracy, and update the SWORE model when the true RT becomes available. We ran our SWORE model following the procedure in Figure 3 100 times and recorded the prediction accuracy for each EEG signal accordingly.

Table 1:

Comparison of Average Prediction Accuracy.

| | Method | Average Prediction Accuracy |
| --- | --- | --- |
| Offline Regression | SVR | $69.1 \pm 0.36\%$ |
| Offline Classification | SVM | $67.4 \pm 0.26\%$ |
| | Random Forest | $67.7 \pm 0.39\%$ |
| | LOR | $63.1 \pm 0.35\%$ |
| Online Classification | LOR | $72.6 \pm 0.33\%$ |
| | SWORE | $\mathbf{76.0 \pm 0.30\%}$ |

Notes: We run each baseline independently 100 times and calculate the mean and 95% confidence interval. The best result is in bold.

From Table 1, we find that the offline regression method, SVR, achieves higher prediction accuracy than the offline classification methods, SVM, Random Forest, and LOR. This is consistent with the result in section 6.4 and makes sense, since a classification task overfits a small data set more easily than a regression task. We also find that online calibration is necessary for reliable mental fatigue evaluation: both online methods, SWORE and LOR, achieve significant improvements over the offline regression/classification methods, and our SWORE achieves the highest prediction performance among all baselines.

### 6.3  Online Mental Fatigue Evaluation on One Participant

According to the experimental results in section 6.2, the offline regression method, SVR, achieves superior prediction accuracy to the offline classification methods: SVM, Random Forest, and offline LOR. In the following, we only consider the comparison between the offline regression method, SVR, and online classification methods, online LOR and SWORE. In particular, we conducted more detailed comparisons following the experimental setting in section 6.2.

#### 6.3.1  PDF and CDF of Online Prediction Accuracy

We estimated the probability density function (PDF) and cumulative distribution function (CDF) regarding the prediction accuracy of each EEG signal (see Figure 5).

From Figure 5, we first observe that SWORE gives the most reliable evaluation ($76\%$ average prediction accuracy) with the lowest variance for any new EEG signal, compared to LOR ($72.6\%$) and SVR ($69.1\%$). Second, for half of the samples ($y = 0.5$ on the CDF), SWORE gives a prediction accuracy of more than $77\%$, compared to $75\%$ for LOR and $68\%$ for SVR. Third, SWORE gives a prediction accuracy of more than $x = 70\%$ for $65\%$ ($1 - y_3$) of samples. SVR and LOR are much worse: only $60\%$ ($1 - y_2$) and $48\%$ ($1 - y_1$) of samples, respectively, reach a prediction accuracy of more than $x = 70\%$.

Figure 5:

The PDF and CDF of online prediction accuracy with regard to each trial. We collected the result of only the first participant for a showcase.


Table 2:

$t$-Test at the $5%$ Significance Level.

| Equal Means without Assuming Equal Variances | Test Decision | $p$-value |
| --- | --- | --- |
| SWORE versus LOR | Reject | 2.38e-42 |
| SWORE versus SVR | Reject | 1.039e-153 |
| LOR versus SVR | Reject | 1.03e-37 |

To further support this claim, we conducted two-sample $t$-tests of the null hypothesis of equal means, without assuming equal variances, using the ttest2 function in Matlab. In particular, we tested every pair among SWORE, LOR, and SVR at the $5\%$ significance level. The comparison results are listed in Table 2.

All results in Table 2 indicate that the $t$-test rejects all three null hypotheses at the $5\%$ significance level without assuming equal variances: there is a significant difference between the results of any two of SWORE, LOR, and SVR.

#### 6.3.2  Showcase of Online Mental Fatigue Evaluation

To give an intuitive comparison, we calculated the prediction accuracy of each trial for one random experiment with three methods and then plotted the performance improvement of SWORE and SVR compared to LOR (see Figure 6).

Figure 6 shows that SWORE consistently achieves superior, or at least comparable, performance relative to LOR and SVR, which supports our claim that channel reliability indeed affects the efficacy of the learning model. The (regression-based) SVR suffers from higher generalization error on new EEG signals than the (classification-based) SWORE and LOR. SWORE achieves an average prediction accuracy of $80.6\%$ per trial in this random experiment, higher than that of LOR ($75\%$) and SVR ($70.5\%$).

Figure 6:

Online mental fatigue evaluation. We used the online prediction accuracy of LOR as the base and plotted the improvement of SVR and SWORE over LOR for each trial, respectively. Note the first 25 trials are used to pretrain each model. We collected the result of only the first participant for this example.


#### 6.3.3  Channel Reliability Estimation

Following our analysis in remark 3, the model parameter $\pi_n$ reveals the reliability of the $n$th channel, $n = 1, 2, \dots, N$. Therefore, we visualize the estimated $\pi_n$ for each channel in Figure 8. Note that the 33-channel EEG data in our experiment consist of 30 EEG channels, 2 reference channels (A1 and A2), and the vehicle position channel, VP. We also visualize the relative contribution of each channel via the 30-channel layout of Topoplot in Figure 7. According to our analysis in remark 5, both positive and negative channels are considered informative and contribute equally to SWORE. We therefore introduce a new metric, $R = 2|\pi - 0.5|$, to denote the contribution of each channel to the SWORE model. Like $\pi$, $R$ ranges between 0 and 1, but a higher value (closer to 1) indicates that the corresponding EEG channel is more informative.

From Figures 7 and 8, we find that the relative contributions of different channels differ, which is consistent with our motivation that different regions of the human brain perform different functions. We also find that the majority of channels ($29/33$) are considered reliable. All three known nonrelevant channels (A1, A2, and VP) can be automatically detected and removed during the learning process (Pan et al., 2020).
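The contribution metric $R$ and the reliability gate from section 6.1.3 can be sketched together as (pure Python; names are ours):

```python
def channel_reliability(pis, kappa=0.85):
    """Per-channel contribution R = 2*|pi - 0.5| and the reliable-channel
    split used in the letter: a channel is reliable iff pi > kappa or
    pi < 1 - kappa, so strongly negative channels also count."""
    R = [2 * abs(p - 0.5) for p in pis]
    reliable = [n for n, p in enumerate(pis) if p > kappa or p < 1 - kappa]
    return R, reliable

# A strongly positive, a strongly negative, and a near-random channel.
R, reliable = channel_reliability([0.95, 0.05, 0.55])
```

Both the first (positive) and second (negative) channels get $R = 0.9$ and pass the gate, while the near-random third channel ($R = 0.1$) is excluded.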

Figure 7:

Topoplot visualization of the channels' relative contribution.


Figure 8:

Channel reliability estimation and nonrelevant channel detection. The color bar denotes the estimated channel reliability. A channel with comparable values in the two rows is considered a task-nonrelevant channel. See remark 3 for more details. We collected the result of only the first participant for a showcase.


#### 6.3.4  Model Reliability with Fewer Channels

Fewer EEG channels are always preferable in the online scenario, as they mean lower computation cost, less storage, and minimal impact on drivers. Therefore, we explored the reliability of SWORE when fewer channels are available. According to Figure 8, we retrained SWORE with 5, 10, and 20 randomly selected reliable channels and compared them with the original SWORE model using all channels in Figure 9. We collected the prediction accuracy of all variants of SWORE using the same dynamically updated BDtable for a fair comparison.

Figure 9:

Performance comparison with reduced number of channels. SWORE-$K$ denotes we retrain SWORE with only $K$ randomly selected reliable channels following the estimated channel reliability shown in Figure 8. SWORE-all denotes SWORE trained with all channels. We collected the result of only the first participant as an example.


Figure 9 shows that all SWORE variants achieve comparable prediction accuracy in terms of the overall average for each trial. This indicates that the superior performance of our SWORE model is barely affected when fewer EEG channels are used and that the reliable channels detected by SWORE are trustworthy.

### 6.4  Ablation Study for Exploring the Efficacy of Our Contributions

In this letter, we have made contributions from different perspectives. Specifically, C1 denotes the robust multichannel aggregation method in equation 3.4; C2 denotes the gradient flattening in equation 3.5; C3 denotes the efficient online updating strategy (i.e., online GBMM) in section 4; and C4 denotes the blank-out noise model for data augmentation in equation 5.1.

To explore the efficacy of these four contributions, we introduce five new baselines: LOR w/o C3, LOR without online calibration; SWORE w/o C3, SWORE without online calibration; SWORE w/o C2&C4, SWORE without modeling the steady state and without data augmentation; SWORE w/o C2, SWORE without modeling the steady state; and SWORE w/o C4, SWORE without data augmentation. We ran all methods 100 times on the first participant and report the results in Table 3.

Table 3:

Comparison of Average Prediction Accuracy for Different SWORE Variants.

| Method | Calibration Option | Average Prediction Accuracy |
| --- | --- | --- |
| LOR | w/o C3 | $63.1 \pm 0.35\%$ |
| | Full | $72.6 \pm 0.33\%$ |
| SWORE | w/o C3 | $64.5 \pm 0.33\%$ |
| | w/o C2 | $75.7 \pm 0.30\%$ |
| | w/o C4 | $74.6 \pm 0.32\%$ |
| | w/o C2 & C4 | $74.1 \pm 0.32\%$ |
| | Full | $\mathbf{76.0 \pm 0.30\%}$ |

Notes: We run each baseline independently 100 times and calculate the mean and 95% confidence interval. The best result is in bold.

From Table 3, we find that online calibration (C3) is necessary for reliable mental fatigue evaluation: SWORE and LOR each achieve a significant improvement (around $10\%$) over their offline versions by adopting efficient online calibration. Comparing SWORE w/o C2&C4 to LOR, we find that the robust multichannel aggregation strategy (C1) enables higher prediction accuracy by automatically eliminating the task-nonrelevant channels. In addition, comparing SWORE w/o C2 and SWORE w/o C4 to SWORE w/o C2&C4 shows that the data augmentation (C4) and the gradient flattening (C2) are useful for improving model performance. Finally, the importance ranking of the four contributions on the first participant is C3 $>$ C1 $>$ C4 $>$ C2.

### 6.5  Offline Analysis of Parameter Sensitivity and Model Uncertainty

In this section, we explore the parameter sensitivity of SWORE with regard to the hyperparameters $(μ,Σ)$ and $(α,β)$. In particular, we generated the offline brain dynamic preferences as follows: the trials of each participant were randomly divided into two parts—$50%$ for training and $50%$ for testing—and offline brain dynamic preferences were constructed according to the pairwise comparisons between the RTs regardless of their sequential property.

#### 6.5.1  Sensitivity Analysis with Regard to Hyperparameters $(μ,Σ)$

For simplicity, we considered a diagonal covariance matrix here. Specifically, we randomly initialized $\mu$ in $[-10^{-a}, 10^{-a}]$ and $\Sigma$ in $[0, 10^{-b} I]$, with the values of $a$ and $b$ set within $\{0, 2, 4\}$. Further, we adopted a noninformative prior for $\pi_n$, namely $\alpha_n = \beta_n = 5$, to eliminate the effects of noisy channels. The data augmentation size $T$ is set to 1 since the training data are sufficient. The test performance of SWORE under the different parameter settings is presented in Table 4.

Table 4:

Test Accuracy (in %, the Larger the Better) with Regard to Hyperparameters $(\mu, \Sigma)$ and $(\alpha, \beta)$, with Dropout Rate $\theta = 0.5$ and Data Augmentation $T = 1$.

Columns 2–10: $(a, b)$, where $\mu \in [-10^{-a}, 10^{-a}]$ and $\Sigma = 10^{-b} I$, with $(\alpha, \beta)$ fixed to $(5, 5)$. Columns 11–19: $(\alpha, \beta)$, with $(\mu, \Sigma)$ fixed to $(10^{-2}, 10^{-4} I)$.

Test ACC | (0,0) (2,0) (4,0) (0,2) (2,2) (4,2) (0,4) (2,4) (4,4) | (1,1) (1,3) (1,5) (3,1) (3,3) (3,5) (5,1) (5,3) (5,5)
P1 50.00 50.00 79.89 77.23 78.29 78.75 50.00 78.71 79.28 78.48 78.26 78.29 78.71 78.52 78.37 78.64 78.83 78.71
P2 50.00 50.00 50.00 79.55 81.48 83.00 50.00 82.03 81.85 81.62 82.13 82.09 82.03 81.75 82.17 82.04 81.98 82.03
P3 50.00 50.00 50.00 85.32 84.54 84.29 70.07 82.89 83.14 83.47 83.47 83.47 82.89 83.02 83.55 83.55 83.72 82.89
P4 50.00 50.00 50.00 72.16 76.45 75.72 50.00 73.26 73.26 73.34 73.65 73.68 73.26 73.42 73.65 73.38 73.46 73.26
P5 50.00 50.00 50.00 85.34 85.52 85.21 50.00 85.07 85.00 85.17 85.14 85.13 85.07 85.06 85.17 85.06 85.07 85.07
P6 50.00 50.00 50.00 86.76 83.25 84.76 50.00 84.56 84.31 84.23 84.31 84.31 84.56 84.64 84.40 85.13 85.05 84.56
P7 50.00 50.00 50.00 75.44 75.21 75.10 50.00 75.18 75.31 75.19 75.12 75.19 75.18 75.22 75.10 75.19 75.23 75.18
P8 50.00 50.00 50.00 84.38 84.06 84.10 50.00 84.70 84.62 84.30 84.54 84.5 84.70 84.66 84.46 84.61 84.53 84.70
P9 50.00 50.00 50.00 83.12 83.01 83.12 50.00 83.26 83.46 83.25 82.86 82.87 83.26 83.22 83.00 82.85 83.14 83.26
P10 50.00 50.00 50.00 79.17 88.00 88.51 50.00 89.50 89.47 89.14 88.56 88.56 89.50 89.21 88.44 88.43 88.41 89.50
P11 50.00 50.00 50.00 80.45 75.99 76.39 50.00 77.49 77.42 77.06 76.14 76.14 77.49 77.29 76.26 77.11 77.09 77.49
P12 50.00 50.00 50.00 79.97 80.18 80.28 50.00 80.09 80.09 80.13 80.09 80.13 80.09 80.13 80.09 79.94 79.94 80.09
P13 50.00 50.00 50.00 81.09 80.75 80.75 50.00 81.52 81.52 81.33 81.23 81.23 81.52 81.33 81.14 81.04 81.18 81.52
P14 50.00 50.00 50.00 50.00 78.58 78.65 50.00 80.03 80.07 79.70 80.00 80.03 80.03 79.70 80.03 79.46 79.49 80.03
P15 50.00 50.00 50.00 89.58 89.92 89.86 50.00 89.93 89.97 89.95 89.95 89.95 89.93 89.90 89.95 90.04 89.97 89.93
P16 50.00 50.00 50.00 73.02 72.75 72.59 50.00 72.41 72.17 72.44 72.37 72.37 72.41 72.42 72.33 72.50 72.42 72.41
P17 50.00 50.00 50.00 50.00 76.99 77.63 50.00 78.05 78.09 78.09 77.94 77.79 78.05 78.24 77.45 77.86 77.75 78.05
P18 50.00 50.00 50.00 50.00 78.03 85.38 50.00 89.36 93.52 88.38 87.82 87.79 89.36 88.88 88.00 87.31 87.92 89.36
P19 50.00 50.00 50.00 77.97 77.94 77.80 50.00 77.92 77.89 77.69 77.62 77.61 77.92 77.92 77.83 77.68 77.89 77.92
P20 50.00 50.00 50.00 80.78 79.96 79.80 50.00 80.32 80.48 80.29 80.29 80.28 80.32 80.29 80.30 80.33 80.34 80.32
P21 50.00 50.00 50.00 50.00 69.92 73.51 50.00 75.74 78.23 79.13 78.89 78.90 75.74 74.38 79.05 79.29 79.37 75.74
P22 50.00 50.00 50.00 78.42 77.96 78.08 50.00 78.05 78.22 77.96 77.95 77.96 78.05 77.99 77.98 77.95 77.96 78.05
P23 50.00 50.00 50.00 84.72 84.34 84.47 50.00 84.66 84.58 84.55 84.55 84.55 84.66 84.64 84.55 84.80 84.69 84.66
P24 50.00 50.00 50.00 80.08 80.12 80.11 50.00 80.04 80.09 79.98 79.99 79.99 80.04 80.01 79.98 80.12 80.09 80.04
P25 50.00 50.00 50.00 82.12 82.18 82.26 50.00 82.17 82.39 82.18 82.20 82.21 82.17 82.31 81.99 82.18 81.99 82.17
P26 50.00 50.00 50.00 86.71 86.52 86.55 50.00 86.61 86.63 86.61 86.61 86.61 86.61 86.60 86.61 86.59 86.60 86.61
P27 50.00 50.00 50.00 81.17 77.35 82.60 50.00 82.71 83.20 82.79 82.94 82.93 82.71 82.77 82.86 82.97 83.00 82.71
P28 50.00 50.00 50.00 85.36 64.69 85.40 50.00 85.34 83.73 85.37 85.34 85.29 85.34 85.31 85.33 85.23 85.32 85.34
P29 50.00 50.00 50.00 84.12 84.06 84.13 50.00 83.87 83.86 83.83 83.85 83.85 83.87 83.85 83.86 83.85 83.84 83.87
P30 50.00 50.00 50.00 50.00 82.30 84.32 50.00 84.27 84.40 84.07 84.08 84.14 84.27 84.19 84.03 84.08 84.08 84.27
P31 50.00 50.00 50.00 82.28 83.80 83.33 50.00 83.55 83.60 83.55 83.53 83.52 83.55 83.57 83.52 83.51 83.53 83.55
P32 50.00 50.00 50.00 84.54 86.02 85.19 50.00 85.69 86.69 85.66 86.62 86.56 85.69 85.65 86.54 86.46 86.51 85.69
P33 50.00 65.27 50.00 80.05 80.59 80.62 50.00 80.90 81.34 80.83 81.00 80.98 80.90 80.83 80.79 81.26 81.18 80.90
P34 50.00 50.00 69.92 87.27 86.98 87.37 50.00 87.65 87.65 87.47 87.65 87.65 87.65 87.59 87.62 87.40 87.47 87.65
P35 50.00 50.00 50.00 74.24 75.32 74.28 50.00 74.77 74.95 74.81 74.77 74.74 74.77 74.76 74.69 74.90 74.82 74.77
P36 50.00 50.00 50.00 86.17 85.58 85.42 50.00 85.55 85.58 85.50 85.55 85.55 85.55 85.52 85.50 85.47 85.52 85.55
P37 50.00 50.00 50.00 90.96 89.81 90.25 50.00 90.20 90.64 89.81 89.43 89.43 90.20 90.03 89.49 89.98 89.92 90.20
P38 50.00 50.00 50.00 90.30 90.06 90.14 50.00 90.52 90.40 90.48 90.28 90.28 90.52 90.52 90.28 90.44 90.48 90.52
P39 50.00 50.00 50.00 85.09 84.65 84.65 50.00 84.90 84.98 84.94 84.90 84.90 84.90 84.98 84.98 84.68 84.79 84.90
P40 50.00 50.00 50.00 75.80 75.90 75.86 50.00 75.93 75.96 75.93 75.93 75.92 75.93 75.92 75.91 75.86 75.89 75.93

Notes: The best parameter settings are in gray. Some parameter settings do not consistently perform very well and may fail on some participants (marked in bold).

Table 4 shows that SWORE consistently performs very well, with testing accuracy greater than $70%$ on all participants under small initializations for $(μ,Σ)$. With large initializations, the SWORE model suffers from spurious overflow and underflow problems at each updating step, owing to the high-dimensional features ($L=492$) and the exponential operator (within the sigmoid function). Also, although the performance of SWORE differs slightly across participants, it is robust to the small initializations and shows comparable performance for the same participant under different initializations.

Figure 10:

The negative log-likelihood of brain dynamic preferences on the training and test data sets for different data augmentation sizes. Aug-N denotes that the data augmentation size $T$ is set to $N$.


#### 6.5.2  Sensitivity Analysis with Regard to Hyperparameters $(α,β)$

To explore the effects of the hyperparameters $(α_n,β_n)$ on the SWORE model, we randomly initialized $α_n$ and $β_n$ in $\{1,3,5\}$, respectively. We randomly initialized $μ$ in $[-10^{-2},10^{-2}]$ and $Σ$ in $[0,10^{-4}×I]$. The corrupting size $T$ was set to 1 as before. The performance of SWORE on the testing data is reported in Table 4.

It is worth noting that SWORE is insensitive to the initialization of the hyperparameters $(α,β)$: it achieves comparable performance for each participant and consistently performs very well on all 40 participants, regardless of the initialization of $(α,β)$.

#### 6.5.3  Sensitivity Analysis with Regard to Data Augmentation Size $T$

In light of Table 4, we randomly initialized $μ$ in $[-10^{-2},10^{-2}]$ and $Σ$ in $[0,10^{-4}×I]$, and we initialized the hyperparameters $(α,β)$ to $(5,5)$. We then collected the negative log-likelihood of brain dynamic preferences on the training and test data sets (see Figure 10) with the data augmentation size $T$ set to $\{0,1,3,5\}$, respectively. We show only the results of the first participant due to space constraints.

From Figure 10, we can observe the following: (1) The SWORE model is prone to overfitting on the original EEG signal, since the dimensions of the EEG signals (in either the time domain or the frequency domain) are closely related to each other (see section 5.1 for more details). (2) The feature corruption trick $(T=1)$ achieves the best performance compared to the other settings, including the data augmentation methods $(T>1)$; the larger the data augmentation size $T$, the worse the generalization performance of SWORE. (3) Interestingly, SWORE with data augmentation $(T>1)$ performs extremely well with only a few samples (less than $20%$ of the training data) but starts overfitting when updated with more samples.
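The feature corruption trick discussed above can be sketched as follows. This is a minimal dropout-style sketch: the drop probability, the zeroing scheme, and all names are illustrative assumptions, not the letter's exact corruption procedure.

```python
import numpy as np

def corrupt_features(x, T=1, drop_prob=0.1, rng=None):
    """Generate T corrupted copies of a feature vector x by randomly
    zeroing each dimension with probability drop_prob (a generic
    dropout-style corruption, assumed here for illustration)."""
    rng = np.random.default_rng(rng)
    mask = rng.random((T, x.shape[0])) >= drop_prob  # keep with prob 1 - drop_prob
    return mask * x  # shape (T, L)

x = np.ones(492)  # L = 492 feature dimensions, as in the letter
augmented = corrupt_features(x, T=3, drop_prob=0.1, rng=0)
print(augmented.shape)  # (3, 492)
```

Setting $T=1$ corresponds to a single corrupted copy per sample, while larger $T$ enlarges the corrupted data set accordingly.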

Figure 11:

Box plot of the prediction accuracy on the test data set. The $+$ symbol denotes outliers.


#### 6.5.4  Stability Analysis of the Online GBMM Algorithm

Here, we empirically analyze the stability of the online GBMM algorithm. Following our sensitivity analyses with regard to the hyperparameters $(μ,Σ)$ and $(α,β)$ (for both, see Table 4), we randomly initialized $μ$ in $[-10^{-2},10^{-2}]$ and $Σ$ in $[0,10^{-4}×I]$, and we initialized the hyperparameters $(α,β)$ to $(5,5)$. The corrupting size $T$ was set to 1. We then repeated the online GBMM algorithm on the training data 20 times and summarized the prediction accuracy on the test data (see Figure 11).

It can be observed from Figure 11 that the test accuracies of each participant are quite stable across different runs. In addition, SWORE consistently achieves high generalization performance (test accuracy above $80%$) on 26 of 40 participants with $95%$ confidence. Note that the performance for each participant could be further improved by designing brain dynamic preferences tailored to that participant.

### 6.6  Online Mental Fatigue Evaluation on Forty Participants

Following the online experiment setting in section 6.3, we explored the reliability of SWORE on 40 participants in the online monitoring scenario. Similarly, we leveraged the prerecorded 25 trials to pretrain the embryonic SWORE, LOR, and SVR models, respectively. Then we ran SWORE and the other baselines independently 100 times and report the average prediction accuracy for all 40 participants in Table 5.

Table 5:

Comparison of Average Prediction Accuracy on 40 Participants (in %).

ACC P1 P2 P3 P4 P5 P6 P7 P8
SVR 69.1 $±$ 0.36 76.9 $±$ 0.30 74.0 $±$ 0.31 70.4 $±$ 0.35 72.7 $±$ 0.34 70.6 $±$ 0.39 51.5 $±$ 0.47 76.6 $±$ 0.31
LOR 72.6 $±$ 0.33 78.3 $±$ 0.31 73.1 $±$ 0.37 73.8 $±$ 0.33 73.7 $±$ 0.36 74.5 $±$ 0.32 75.1 $±$ 0.33 74.8 $±$ 0.32
SWORE 76.0 $±$ 0.30 79.9 $±$ 0.29 76.5 $±$ 0.31 75.9 $±$ 0.31 76.7 $±$ 0.31 74.6 $±$ 0.33 78.3 $±$ 0.32 76.9 $±$ 0.30
ACC P9 P10 P11 P12 P13 P14 P15 P16
SVR 69.7 $±$ 0.36 71.0 $±$ 0.37 73.7 $±$ 0.37 74.4 $±$ 0.33 73.7 $±$ 0.32 72.0 $±$ 0.35 74.2 $±$ 0.36 45.8 $±$ 0.44
LOR 73.2 $±$ 0.35 73.3 $±$ 0.36 76.5 $±$ 0.32 78.3 $±$ 0.30 74.3 $±$ 0.34 72.6 $±$ 0.35 72.5 $±$ 0.35 76.3 $±$ 0.32
SWORE 75.7 $±$ 0.33 75.2 $±$ 0.32 78.2 $±$ 0.30 78.4 $±$ 0.31 74.1 $±$ 0.36 74.0 $±$ 0.32 74.0 $±$ 0.36 79.9 $±$ 0.31
ACC P17 P18 P19 P20 P21 P22 P23 P24
SVR 63.3 $±$ 0.41 65.6 $±$ 0.43 67.0 $±$ 0.38 68.8 $±$ 0.36 47.4 $±$ 0.45 63.7 $±$ 0.40 50.9 $±$ 0.38 75.3 $±$ 0.35
LOR 79.8 $±$ 0.29 74.4 $±$ 0.34 77.6 $±$ 0.32 73.7 $±$ 0.36 76.6 $±$ 0.34 73.1 $±$ 0.34 84.2 $±$ 0.27 84.6 $±$ 0.24
SWORE 80.0 $±$ 0.29 76.7 $±$ 0.31 79.0 $±$ 0.31 76.7 $±$ 0.35 76.4 $±$ 0.32 75.2 $±$ 0.35 84.0 $±$ 0.29 84.2 $±$ 0.25
ACC P25 P26 P27 P28 P29 P30 P31 P32
SVR 71.7 $±$ 0.34 72.8 $±$ 0.32 75.2 $±$ 0.33 69.3 $±$ 0.39 72.6 $±$ 0.36 79.0 $±$ 0.32 70.1 $±$ 0.39 39.1 $±$ 0.41
LOR 71.1 $±$ 0.37 76.8 $±$ 0.31 72.3 $±$ 0.35 76.3 $±$ 0.33 79.7 $±$ 0.30 78.9 $±$ 0.30 77.1 $±$ 0.33 80.8 $±$ 0.31
SWORE 75.2 $±$ 0.32 77.8 $±$ 0.29 74.0 $±$ 0.32 76.9 $±$ 0.31 80.0 $±$ 0.29 79.5 $±$ 0.28 77.5 $±$ 0.30 81.1 $±$ 0.28
ACC P33 P34 P35 P36 P37 P38 P39 P40
SVR 72.6 $±$ 0.35 59.3 $±$ 0.41 70.8 $±$ 0.35 75.5 $±$ 0.30 78.9 $±$ 0.29 74.0 $±$ 0.32 74.1 $±$ 0.35 56.1 $±$ 0.43
LOR 74.5 $±$ 0.33 80.2 $±$ 0.28 77.1 $±$ 0.31 74.9 $±$ 0.31 79.1 $±$ 0.28 79.4 $±$ 0.29 79.5 $±$ 0.29 76.5 $±$ 0.33
SWORE 77.7 $±$ 0.31 80.9 $±$ 0.29 77.3 $±$ 0.31 76.3 $±$ 0.31 79.8 $±$ 0.28 81.0 $±$ 0.28 80.6 $±$ 0.28 78.3 $±$ 0.32

Notes: We ran each baseline independently 100 times and calculated the mean and 95% confidence interval. The best results are in bold.

It can be observed from Table 5 that SWORE gives the most reliable evaluation, with the lowest variance, for new EEG signals, compared to LOR and SVR. In particular, SWORE achieves the highest average prediction accuracy on 34 of 40 participants and comparable results on the remaining participants. SWORE also achieves consistently reliable evaluations across participants: its average prediction accuracy is above $75%$ for 35 participants, compared with 7 participants for SVR and 22 for LOR. The non-online method, SVR, is not trustworthy, since it does not consider the nonstationary properties of brain dynamics; there are 7 participants—P7, P16, P21, P23, P32, P34, and P40—on which SVR achieves an average prediction accuracy below $60%$. LOR and SWORE, equipped with efficient online calibration strategies, consistently achieve an average prediction accuracy above $75%$ on these same participants.
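The mean $±$ 95% confidence interval entries in Table 5 can be computed from per-run accuracies with a short routine. The normal-approximation interval below is an assumption for illustration, since the letter does not state the exact interval construction; the synthetic accuracies are placeholders.

```python
import numpy as np

def mean_ci95(acc):
    """Mean and normal-approximation 95% confidence half-width for a
    list of per-run accuracies (a sketch; the exact CI construction
    in the letter is not specified)."""
    acc = np.asarray(acc, dtype=float)
    mean = acc.mean()
    half = 1.96 * acc.std(ddof=1) / np.sqrt(len(acc))
    return mean, half

rng = np.random.default_rng(0)
runs = rng.normal(76.0, 1.5, size=100)  # 100 independent runs (synthetic accuracies)
m, h = mean_ci95(runs)
print(f"{m:.1f} ± {h:.2f}")
```

With 100 runs, the half-width shrinks with $1/\sqrt{100}$, which is why the intervals in Table 5 are narrow relative to the per-run spread.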

## 7  Conclusion

This letter takes an initial step toward calibrating prediction models on nonstationary brain dynamics. We proposed the self-weight ordinal regression (SWORE) model with the brain dynamics table (BDtable) for online mental fatigue monitoring. SWORE aggregates the information from multiple noisy channels based on brain dynamic preferences, while the BDtable is used to calibrate the SWORE model online using a generalized Bayesian moment matching algorithm. Empirical results demonstrate that the proposed framework achieves significantly better performance than baseline approaches such as SVR and LOR. As a direction for future research, we plan to assess the feasibility of deploying the online mental fatigue monitoring system with EEG signals combined with other mental fatigue indicators.

## Appendix A: Proof for Theorem 1

Assume $f(w)$ is the marginalized likelihood of preference $y$ and is almost twice differentiable. Upon updating this preference, the posterior parameters $(\mu_{\mathrm{new}},\Sigma_{\mathrm{new}})$ of the weight $w$ can be estimated as
$$\mu_{\mathrm{new}} \approx \mu + \Sigma \times \left.\frac{d \log f(w)}{d w}\right|_{w=\mu}, \tag{A.1a}$$
$$\Sigma_{\mathrm{new}} \approx \Sigma + \Sigma \times \left.\frac{d^{2} \log f(w)}{d w\, d w^{T}}\right|_{w=\mu} \times \Sigma. \tag{A.1b}$$
Before presenting our proof, we introduce lemma 1 as a building block.
Lemma 1
(Weng & Lin, 2011). Let $z$ be a random vector where each entry is independent and $z_i \sim N(0,1)$, $i=1,2,\dots,L$. Suppose that $f(z)$ is the likelihood function and almost twice differentiable. Then the first- and second-order moments of the posterior distribution can be estimated as
$$E[z] = E\!\left[\frac{\nabla f(z)}{f(z)}\right], \tag{A.2a}$$
$$E[z_i z_j] = I_{ij} + E\!\left[\frac{\nabla^{2} f(z)}{f(z)}\right]_{ij}, \quad i,j=1,\dots,L, \tag{A.2b}$$
where $I_{ij}=1$ if $i=j$ and 0 otherwise, and $[\cdot]_{ij}$ indicates the $(i,j)$ component of a matrix.

Based on lemma 1, we give the detailed proof in the following.

Proof.
In terms of the posterior parameter $\mu_{\mathrm{new}}$, writing $w=\mu+\Sigma^{1/2}z$, we have
$$\mu_{\mathrm{new}} = E_w[w] = \mu + \Sigma^{1/2} E_z[z] \overset{①}{=} \mu + \Sigma^{1/2}\, E_z\!\left[\frac{\nabla f(\mu+\Sigma^{1/2} z)}{f(\mu+\Sigma^{1/2} z)}\right] \overset{②}{\approx} \mu + \Sigma^{1/2} \left.\frac{d \log f(\mu+\Sigma^{1/2} z)}{d z}\right|_{z=0} \overset{③}{=} \mu + \Sigma^{1/2}\, \frac{d w}{d z}\, \left.\frac{d \log f(w)}{d w}\right|_{w=\mu} = \mu + \Sigma \left.\frac{d \log f(w)}{d w}\right|_{w=\mu},$$
where ① follows equation A.2a of lemma 1. ② sets $z=0$; such a substitution is reasonable, as we expect the posterior density of $z$ to be concentrated at 0. ③ follows the chain rule with $dw/dz=\Sigma^{1/2}$.
In terms of the posterior parameter $\Sigma_{\mathrm{new}}$, we have
$$\begin{aligned}
\Sigma_{\mathrm{new}} &= \mathrm{Var}(w) = \Sigma^{1/2}\left(E_z[z z^{T}] - E_z[z]\, E_z[z]^{T}\right)\Sigma^{1/2} \\
&\overset{①}{=} \Sigma^{1/2}\left(I + E_z\!\left[\frac{\nabla^{2} f(\mu+\Sigma^{1/2} z)}{f(\mu+\Sigma^{1/2} z)}\right] - E_z\!\left[\frac{d \log f(\mu+\Sigma^{1/2} z)}{d z}\right] E_z\!\left[\frac{d \log f(\mu+\Sigma^{1/2} z)}{d z}\right]^{T}\right)\Sigma^{1/2} \\
&\overset{②}{\approx} \Sigma^{1/2}\left(I + \left.\frac{\nabla^{2} f(\mu+\Sigma^{1/2} z)}{f(\mu+\Sigma^{1/2} z)}\right|_{z=0} - \left.\frac{d \log f(\mu+\Sigma^{1/2} z)}{d z}\right|_{z=0} \left.\frac{d \log f(\mu+\Sigma^{1/2} z)}{d z^{T}}\right|_{z=0}\right)\Sigma^{1/2} \\
&\overset{③}{=} \Sigma + \Sigma^{1/2} \left.\frac{d^{2} \log f(\mu+\Sigma^{1/2} z)}{d z\, d z^{T}}\right|_{z=0} \Sigma^{1/2} \overset{④}{=} \Sigma + \Sigma \left.\frac{d^{2} \log f(w)}{d w\, d w^{T}}\right|_{w=\mu} \Sigma,
\end{aligned}$$
where ① follows equation A.2b of lemma 1. ② sets $z=0$; such a substitution is reasonable, as we expect the posterior density of $z$ to be concentrated at 0. ④ follows the chain rule. We give the proof for ③ as follows:
$$\left[\frac{d^{2}\log f(\mu+\Sigma^{1/2} z)}{d z\, d z^{T}}\right]_{ij} = \frac{\partial}{\partial z_j}\,\frac{\partial f/\partial z_i}{f} = \frac{\partial^{2} f/\partial z_i \partial z_j}{f} - \frac{(\partial f/\partial z_i)(\partial f/\partial z_j)}{f^{2}} = \left[\frac{\nabla^{2} f}{f}\right]_{ij} - \frac{\partial \log f}{\partial z_i}\,\frac{\partial \log f}{\partial z_j}.$$

## Appendix B: Second-Order Taylor Approximation for $E_{N(w|\mu,\Sigma)}[\sigma(w^{T}\Delta x_n)]$

Let $R_1 = E_{N(w|\mu,\Sigma)}[\sigma(w^{T}\Delta x_n)]$. The second-order Taylor approximation of $R_1$ at $\mu$ can be represented as follows:
$$R_1 = E_{N(w|\mu,\Sigma)}\left[\sigma(w^{T}\Delta x_n)\right] \approx \sigma(\mu^{T}\Delta x_n)\left\{1 + \frac{1}{2}\left[1-\sigma(\mu^{T}\Delta x_n)\right]\left[1-2\sigma(\mu^{T}\Delta x_n)\right]\Delta x_n^{T}\Sigma\,\Delta x_n\right\}.$$
Then we set $R_1 = \max(R_1, \kappa_2)$, where $\kappa_2$ is a small, positive value that ensures a positive $R_1$.
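The Taylor approximation above can be checked numerically against a Monte Carlo estimate. This sketch works with the scalar projection $\mu^{T}\Delta x_n$ and the scalar variance $\Delta x_n^{T}\Sigma\Delta x_n$; the numeric values are illustrative.

```python
import numpy as np

def taylor_R1(mu_dx, var_dx):
    """Second-order Taylor approximation of E[sigma(w^T dx)] from
    appendix B, in terms of mu_dx = mu^T dx and var_dx = dx^T Sigma dx."""
    s = 1.0 / (1.0 + np.exp(-mu_dx))
    return s * (1.0 + 0.5 * (1.0 - s) * (1.0 - 2.0 * s) * var_dx)

rng = np.random.default_rng(0)
mu_dx, var_dx = 0.3, 0.04
w_dx = rng.normal(mu_dx, np.sqrt(var_dx), size=200_000)  # samples of w^T dx
mc = (1.0 / (1.0 + np.exp(-w_dx))).mean()                # Monte Carlo estimate

print(abs(mc - taylor_R1(mu_dx, var_dx)) < 1e-3)
```

The factor $\tfrac{1}{2}\sigma(1-\sigma)(1-2\sigma)$ is half the second derivative of the sigmoid, which is exactly what a second-order expansion of $\sigma$ around the mean projection produces.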

## Appendix C: Posterior Moments of the Beta Distribution

$$\begin{aligned}
E[\pi_n] &= \int \pi_n\, P(\pi_n|y,\Delta x_n)\, d\pi_n = \int \pi_n\, \frac{P(y|\pi_n,\Delta x_n)\,\mathrm{Beta}(\pi_n|\alpha_n,\beta_n)}{R}\, d\pi_n \\
&= \frac{1}{R}\int \pi_n \left[\pi_n R_1 + (1-\pi_n) R_2\right] \mathrm{Beta}(\pi_n|\alpha_n,\beta_n)\, d\pi_n \\
&= \frac{R_1-R_2}{R}\int \pi_n^{2}\, \mathrm{Beta}(\pi_n|\alpha_n,\beta_n)\, d\pi_n + \frac{R_2}{R}\int \pi_n\, \mathrm{Beta}(\pi_n|\alpha_n,\beta_n)\, d\pi_n \\
&= \frac{R_1-R_2}{R}\cdot\frac{(\alpha_n+1)\,\alpha_n}{(\alpha_n+\beta_n+1)(\alpha_n+\beta_n)} + \frac{R_2}{R}\cdot\frac{\alpha_n}{\alpha_n+\beta_n} \\
&= \frac{R_1(\alpha_n+1)\,\alpha_n + R_2\,\alpha_n\beta_n}{R\,(\alpha_n+\beta_n+1)(\alpha_n+\beta_n)}.
\end{aligned}$$
$$\begin{aligned}
E[\pi_n^{2}] &= \int \pi_n^{2}\, P(\pi_n|y,\Delta x_n)\, d\pi_n = \frac{1}{R}\int \pi_n^{2}\left[\pi_n R_1 + (1-\pi_n) R_2\right]\mathrm{Beta}(\pi_n|\alpha_n,\beta_n)\, d\pi_n \\
&= \frac{R_1-R_2}{R}\int \pi_n^{3}\,\mathrm{Beta}(\pi_n|\alpha_n,\beta_n)\, d\pi_n + \frac{R_2}{R}\int \pi_n^{2}\,\mathrm{Beta}(\pi_n|\alpha_n,\beta_n)\, d\pi_n \\
&= \frac{R_1-R_2}{R}\cdot\frac{(\alpha_n+2)(\alpha_n+1)\,\alpha_n}{(\alpha_n+\beta_n+2)(\alpha_n+\beta_n+1)(\alpha_n+\beta_n)} + \frac{R_2}{R}\cdot\frac{(\alpha_n+1)\,\alpha_n}{(\alpha_n+\beta_n+1)(\alpha_n+\beta_n)} \\
&= \frac{\alpha_n(\alpha_n+1)\left[R_1(\alpha_n+2)+R_2\beta_n\right]}{R\,(\alpha_n+\beta_n+2)(\alpha_n+\beta_n+1)(\alpha_n+\beta_n)}.
\end{aligned}$$
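The closed-form moments above can be verified against direct numerical integration of the posterior. The values of $(\alpha_n,\beta_n,R_1,R_2)$ below are illustrative.

```python
import math
import numpy as np

def beta_posterior_moments(alpha, beta, R1, R2):
    """Closed-form first and second posterior moments of pi_n from
    appendix C, for likelihood pi*R1 + (1 - pi)*R2 under a
    Beta(alpha, beta) prior."""
    R = (R1 * alpha + R2 * beta) / (alpha + beta)  # normalizer
    ab = alpha + beta
    m1 = alpha * (R1 * (alpha + 1) + R2 * beta) / (R * (ab + 1) * ab)
    m2 = (alpha * (alpha + 1) * (R1 * (alpha + 2) + R2 * beta)
          / (R * (ab + 2) * (ab + 1) * ab))
    return m1, m2

# Numeric check by direct integration on a fine grid.
alpha, beta, R1, R2 = 5.0, 5.0, 0.8, 0.3
pi = np.linspace(0.0, 1.0, 200_001)
B = math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)
weights = pi**(alpha - 1) * (1 - pi)**(beta - 1) / B * (pi * R1 + (1 - pi) * R2)
weights /= weights.sum()  # discretized posterior P(pi | y, dx)

m1, m2 = beta_posterior_moments(alpha, beta, R1, R2)
print(abs((pi * weights).sum() - m1) < 1e-5,
      abs((pi**2 * weights).sum() - m2) < 1e-5)
```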

## Appendix D: The Updating Rules for the Hyperparameters $(\alpha_n^{\mathrm{new}},\beta_n^{\mathrm{new}})$

In terms of the sufficient moments with regard to the posterior distribution $q(\pi_n|\alpha_n^{\mathrm{new}},\beta_n^{\mathrm{new}})$, we have
$$E[\pi_n] = \int \pi_n\, q(\pi_n|\alpha_n^{\mathrm{new}},\beta_n^{\mathrm{new}})\, d\pi_n = \frac{\alpha_n^{\mathrm{new}}}{\alpha_n^{\mathrm{new}}+\beta_n^{\mathrm{new}}},$$
$$E[\pi_n^{2}] = \int \pi_n^{2}\, q(\pi_n|\alpha_n^{\mathrm{new}},\beta_n^{\mathrm{new}})\, d\pi_n = \frac{\alpha_n^{\mathrm{new}}(\alpha_n^{\mathrm{new}}+1)}{(\alpha_n^{\mathrm{new}}+\beta_n^{\mathrm{new}}+1)(\alpha_n^{\mathrm{new}}+\beta_n^{\mathrm{new}})}.$$
According to the above equations, we have
$$E[\pi_n]-E[\pi_n^{2}] = \frac{\alpha_n^{\mathrm{new}}\beta_n^{\mathrm{new}}}{(\alpha_n^{\mathrm{new}}+\beta_n^{\mathrm{new}}+1)(\alpha_n^{\mathrm{new}}+\beta_n^{\mathrm{new}})}, \qquad E[\pi_n^{2}]-(E[\pi_n])^{2} = \frac{\alpha_n^{\mathrm{new}}\beta_n^{\mathrm{new}}}{(\alpha_n^{\mathrm{new}}+\beta_n^{\mathrm{new}}+1)(\alpha_n^{\mathrm{new}}+\beta_n^{\mathrm{new}})^{2}}.$$
Then we have
$$\alpha_n^{\mathrm{new}} = \frac{\left(E[\pi_n]-E[\pi_n^{2}]\right)E[\pi_n]}{E[\pi_n^{2}]-(E[\pi_n])^{2}}, \qquad \beta_n^{\mathrm{new}} = \frac{\left(E[\pi_n]-E[\pi_n^{2}]\right)\left(1-E[\pi_n]\right)}{E[\pi_n^{2}]-(E[\pi_n])^{2}}.$$
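These updating rules invert the Beta moment equations, so plugging in the moments of a known Beta distribution should recover its parameters exactly. A minimal round-trip check (values illustrative):

```python
def beta_params_from_moments(m1, m2):
    """Moment-matching update from appendix D: recover the Beta
    parameters (alpha_new, beta_new) whose first two moments are
    m1 = E[pi] and m2 = E[pi^2]."""
    var = m2 - m1**2
    alpha_new = (m1 - m2) * m1 / var
    beta_new = (m1 - m2) * (1 - m1) / var
    return alpha_new, beta_new

# Round trip: the moments of Beta(3, 7) should give back (3, 7).
a, b = 3.0, 7.0
m1 = a / (a + b)                            # E[pi]
m2 = a * (a + 1) / ((a + b + 1) * (a + b))  # E[pi^2]
a_new, b_new = beta_params_from_moments(m1, m2)
print(a_new, b_new)  # ≈ 3.0, 7.0
```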

## Acknowledgments

I.W.T. is supported by ARC under grants DP180100106 and DP200101328. We thank Yinghua Yao, Peiyao Zhao, two anonymous reviewers, and the editor for helpful suggestions on this paper.

## Notes

1. We adopted SVR due to its nonlinear properties and superior generalization performance on small training data sets (Schölkopf, Smola, & Bach, 2018). SVR is implemented using LIBSVM with the parameter option -s 3 -t 2.

2. The $y$-axis is in log scale; the prediction discrepancy would be even more significant in linear scale.

3. We used the term "preference" intentionally to emphasize that brain dynamics keep changing with regard to human behavior, because the human brain prefers one decision over others.

4. $π_n→1^-$ denotes that $π_n$ approaches 1 from below, while $π_n→0^+$ denotes that $π_n$ approaches 0 from above.

5. $P(y|w,Δx_n)=E_{\mathrm{Beta}(π|α,β)}[P(y|w,π,Δx_n)]$.

6. Although the data augmentation procedure generates a corrupted data set of larger size, the final computational cost (scaling linearly with $T$) is acceptable, benefiting from the efficient updating rules.

7. The parameters $\{w,π_{1:N}\}$ are used to represent the SWORE model, since equation 3.6 is fully determined by $\{w,π_{1:N}\}$. We omit the subscript $t-1$ for convenience.

8. $Σ$ is simplified to a diagonal matrix in the experiments.

9. The data consist of 30 EEG channels, 2 reference channels, and 1 vehicle position channel. We did not eliminate the 3 non-EEG channels beforehand, to demonstrate that SWORE can automatically remove this kind of noninformative channel during training.

## References

Barachant, A., Bonnet, S., Congedo, M., & Jutten, C. (2012). Multiclass brain-computer interface classification by Riemannian geometry. IEEE Transactions on Biomedical Engineering, 59(4), 920–928.

Borghini, G., Astolfi, L., Vecchiato, G., Mattia, D., & Babiloni, F. (2014). Measuring neurophysiological signals in aircraft pilots and car drivers for the assessment of mental workload, fatigue and drowsiness. Neuroscience and Biobehavioral Reviews, 44, 58–75.

Bose, R., Wang, H., Dragomir, A., Thakor, N., Bezerianos, A., & Li, J. (2019). Regression based continuous driving fatigue estimation: Towards practical implementation. IEEE Transactions on Cognitive and Developmental Systems, 12, 323–331.

Chai, R., Naik, G. R., Nguyen, T. N., Ling, S. H., Tran, Y., Craig, A., & Nguyen, H. T. (2016). Driver fatigue classification with independent component by entropy rate bound minimization analysis in an EEG-based system. IEEE Journal of Biomedical and Health Informatics, 21(3), 715–724.

Colosio, M., Shestakova, A., Nikulin, V. V., Blagovechtchenski, E., & Klucharev, V. (2017). Neural mechanisms of cognitive dissonance (revised): An EEG study. Journal of Neuroscience, 37, 5074–5083.

Congedo, M., Barachant, A., & Bhatia, R. (2017). Riemannian geometry for EEG-based brain-computer interfaces: A primer and a review. Brain-Computer Interfaces, 4(3), 155–174.

Cui, Y., & Wu, D. (2017). EEG-based driver drowsiness estimation using convolutional neural networks. In Proceedings of the International Conference on Neural Information Processing (pp. 822–832). Berlin: Springer.

Cui, Y., Xu, Y., & Wu, D. (2019). EEG-based driver drowsiness estimation using feature weighted episodic training. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 27(11), 2263–2273.

Dornhege, G., del R. Millán, J., Hinterberger, T., McFarland, D. J., & Müller, K.-R. (2007). Improving human performance in a real operating environment through real-time mental workload detection. In G. Dornhege, J. del R. Millán, T. Hinterberger, D. J. McFarland, & K.-R. Müller (Eds.), Toward brain-computer interfacing (pp. 409–422). Cambridge, MA: MIT Press.

Fallahi, M., Motamedzade, M., Heidarimoghadam, R., Soltanian, A. R., & Miyake, S. (2016). Effects of mental workload on physiological and subjective responses during traffic density monitoring: A field study. Applied Ergonomics, 52, 95–103.

Gharagozlou, F., Saraji, G. N., Mazloumi, A., Nahvi, A., … Samavati, M. (2015). Detecting driver mental fatigue based on EEG alpha power changes during simulated driving. Iranian Journal of Public Health, 44(12), 1693.

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. Cambridge, MA: MIT Press.

Graimann, B., Allison, B., & Pfurtscheller, G. (2009). Brain-computer interfaces: A gentle introduction. In B. Graimann, G. Pfurtscheller, & B. Allison (Eds.), Brain-computer interfaces (pp. 1–27). Berlin: Springer.

Gurudath, N., & Riley, H. B. (2014). Drowsy driving detection by EEG analysis using wavelet transform and $k$-means clustering. Procedia Computer Science, 34, 400–409.

Homan, R. W., Herman, J., & Purdy, P. (1987). Cerebral location of international 10–20 system electrode placement. Electroencephalography and Clinical Neurophysiology, 66(4), 376–382.

Huang, C.-S., Pal, N. R., Chuang, C.-H., & Lin, C.-T. (2015). Identifying changes in EEG information transfer during drowsy driving by transfer entropy. Frontiers in Human Neuroscience, 9, 570.

Huang, R.-S., Jung, T.-P., & Makeig, S. (2009). Tonic changes in EEG power spectra during simulated driving. In Proceedings of the International Conference on Foundations of Augmented Cognition (pp. 394–403). Berlin: Springer.

Jagannath, M., & Balasubramanian, V. (2014). Assessment of early onset of driver fatigue using multimodal fatigue measures in a static simulator. Applied Ergonomics, 45(4), 1140–1147.

Jaini, P., Chen, Z., Carbajal, P., Law, E., Middleton, L., Regan, K., … Poupart, P. (2017). Online Bayesian transfer learning for sequential data modeling. In Proceedings of the International Conference on Learning Representations.

Jap, B. T., Lal, S., Fischer, P., & Bekiaris, E. (2009). Using EEG spectral components to assess algorithms for detecting fatigue. Expert Systems with Applications, 36(2), 2352–2359.

Kar, S., Bhagat, M., & Routray, A. (2010). EEG signal analysis for the assessment and quantification of driver's fatigue. Transportation Research Part F: Traffic Psychology and Behaviour, 13(5), 297–306.

Lal, S. K., Craig, A., Boord, P., Kirkup, L., & Nguyen, H. (2003). Development of an algorithm for an EEG-based driver fatigue countermeasure. Journal of Safety Research, 34(3), 321–328.

Laurent, F., Valderrama, M., Besserve, M., Guillard, M., Lachaux, J.-P., Martinerie, J., & Florence, G. (2013). Multimodal information improves the rapid detection of mental fatigue. Biomedical Signal Processing and Control, 8(4), 400–408.

Li, G., Li, B., Wang, G., Zhang, J., & Wang, J. (2017). A new method for human mental fatigue detection with several EEG channels. Journal of Medical and Biological Engineering, 37(2), 240–247.

Lin, C.-T., Chang, C.-J., Lin, B.-S., Hung, S.-H., Chao, C.-F., & Wang, I.-J. (2010). A real-time wireless brain–computer interface system for drowsiness detection. IEEE Transactions on Biomedical Circuits and Systems, 4(4), 214–222.

Lin, C.-T., Tsai, S.-F., & Ko, L.-W. (2013). EEG-based learning system for online motion sickness level estimation in a dynamic vehicle environment. IEEE Transactions on Neural Networks and Learning Systems, 24(10), 1689–1700.

Liu, Y.-T., Lin, Y.-Y., Wu, S.-L., Chuang, C.-H., & Lin, C.-T. (2016). Brain dynamics in predicting driving fatigue using a recurrent self-evolving fuzzy neural network. IEEE Transactions on Neural Networks and Learning Systems, 27(2), 347–360.

Müller, K.-R., Tangermann, M., Dornhege, G., Krauledat, M., Curio, G., & Blankertz, B. (2008). Machine learning for real-time single-trial EEG-analysis: From brain-computer interfacing to mental state monitoring. Journal of Neuroscience Methods, 167(1), 82–90.

Nguyen, T., Ahn, S., Jang, H., Jun, S. C., & Kim, J. G. (2017). Utilization of a combined EEG/NIRS system to predict driver drowsiness. Scientific Reports, 7, 43933.

Palanivel Rajan, S., & Dinesh, T. (2015). Systematic review on wearable driver vigilance system with future research directions. International Journal of Applied Engineering Research, 10(1), 627–632.

Pan, Y., Tsang, I. W., Singh, A. K., Lin, C.-T., & Sugiyama, M. (2020). Stochastic multichannel ranking with brain dynamics preferences. Neural Computation, 32(8), 1499–1530.

Ratcliff, R., Philiastides, M. G., & Sajda, P. (2009). Quality of evidence for perceptual decision making is indexed by trial-to-trial variability of the EEG. Proceedings of the National Academy of Sciences, 106(16), 6539–6544.

Raykar, V. C., Yu, S., Zhao, L. H., Valadez, G. H., Florin, C., Bogoni, L., & Moy, L. (2010). Learning from crowds. Journal of Machine Learning Research, 11, 1297–1322.

Resalat, S. N., & Saba, V. (2015). A practical method for driver sleepiness detection by processing the EEG signals stimulated with external flickering light. Signal, Image and Video Processing, 9(8), 1751–1757.

Richer, R., Zhao, N., Amores, J., Eskofier, B. M., & Paradiso, J. A. (2018). Real-time mental state recognition using a wearable EEG. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 5495–5498). Piscataway, NJ: IEEE.

Sahayadhas, A., Sundaraj, K., & Murugappan, M. (2012). Detecting driver drowsiness based on sensors: A review. Sensors, 12(12), 16937–16953.

Sauvet, F., Bougard, C., Coroenne, M., Lely, L., Van Beers, P., Elbaz, M., … Chennaoui, M. (2014). In-flight automatic detection of vigilance states using a single EEG channel. IEEE Transactions on Biomedical Engineering, 61(12), 2840–2847.

Schölkopf, B., Smola, A. J., & Bach, F. (2018). Learning with kernels: Support vector machines, regularization, optimization, and beyond. Cambridge, MA: MIT Press.

Soon, C. S., Brass, M., Heinze, H.-J., & Haynes, J.-D. (2008). Unconscious determinants of free decisions in the human brain. Nature Neuroscience, 11(5), 543.

Teplan, M. (2002). Fundamentals of EEG measurement. Measurement Science Review, 2(2), 1–11.

Van Cutsem, J., Marcora, S., De Pauw, K., Bailey, S., Meeusen, R., & Roelands, B. (2017). The effects of mental fatigue on physical performance: A systematic review. Sports Medicine, 47(8), 1569–1588.

Vitter, J. S. (1985). Random sampling with a reservoir. ACM Transactions on Mathematical Software, 11(1), 37–57.

Wang, H., Dragomir, A., Abbasi, N. I., Li, J., Thakor, N. V., & Bezerianos, A. (2018). A novel real-time driving fatigue detection system based on wireless dry EEG. Cognitive Neurodynamics, 12(4), 365–376.

Wang, S., Zhang, Y., Wu, C., Darvas, F., & Chaovalitwongse, W. A. (2015). Online prediction of driver distraction based on brain activity patterns. IEEE Transactions on Intelligent Transportation Systems, 16(1), 136–150.

Wei, C.-S., Lin, Y.-P., Wang, Y.-T., Lin, C.-T., & Jung, T.-P. (2018). A subject-transfer framework for obviating inter- and intra-subject variability in EEG-based drowsiness detection. NeuroImage, 174, 407–419.

Welch, P. (1967). The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms. IEEE Transactions on Audio and Electroacoustics, 15(2), 70–73.

Weng, R. C., & Lin, C.-J. (2011). A Bayesian approximation method for online ranking. Journal of Machine Learning Research, 12, 267–300.

Woodroofe, M. (1989). Very weak expansions for sequentially designed experiments, linear models. Annals of Statistics, 17(3), 1087–1102.

Xu, J., Min, J., & Hu, J. (2018). Real-time eye tracking for the assessment of driver fatigue. Healthcare Technology Letters, 5(2), 54–58.

Yan, L., Dodier, R. H., Mozer, M., & Wolniewicz, R. H. (2003). Optimizing classifier performance via an approximation to the Wilcoxon-Mann-Whitney statistic. In Proceedings of the 20th International Conference on Machine Learning (pp. 848–855). Washington, DC: AAAI Press.

Yarkoni, T., Barch, D. M., Gray, J. R., Conturo, T. E., & Braver, T. S. (2009). BOLD correlates of trial-by-trial reaction time variability in gray and white matter: A multistudy fMRI analysis. PLOS One, 4(1), e4257.

Zhou, K., Xue, G.-R., Zha, H., & Yu, Y. (2008). Learning to rank with ties. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 275–282). New York: ACM.