How does the human brain encode visual object categories? Our understanding of this has advanced substantially with the development of multivariate decoding analyses. However, conventional electroencephalography (EEG) decoding predominantly uses the mean neural activation within the analysis window to extract category information. Such temporal averaging overlooks the within-trial neural variability that is suggested to provide an additional channel for the encoding of information about the complexity and uncertainty of the sensory input. The richness of temporal variabilities, however, has not been systematically compared with the conventional mean activity. Here we compare the information content of 31 variability-sensitive features against the mean of activity, using three independent highly varied data sets. In whole-trial decoding, the classical event-related potential (ERP) components of P2a and P2b provided information comparable to those provided by original magnitude data (OMD) and wavelet coefficients (WC), the two most informative variability-sensitive features. In time-resolved decoding, the OMD and WC outperformed all the other features (including the mean), which were sensitive to limited and specific aspects of temporal variabilities, such as their phase or frequency. The information was more pronounced in the theta frequency band, previously suggested to support feedforward visual processing. We concluded that the brain might encode the information in multiple aspects of neural variabilities simultaneously such as phase, amplitude, and frequency rather than mean per se. In our active categorization data set, we found that more effective decoding of the neural codes corresponds to better prediction of behavioral performance. Therefore, the incorporation of temporal variabilities in time-resolved decoding can provide additional category information and improved prediction of behavior.

How does the brain encode information about visual object categories? This question has been studied for decades using different neural recording techniques, including invasive neurophysiology (Hung, Kreiman, Poggio, & DiCarlo, 2005) and electrocorticography (ECoG); Majima et al., 2014; Watrous, Deuker, Fell, & Axmacher, 2015; Rupp et al., 2017; Miyakawa et al., 2018; Liu, Agam, Madsen, & Kreiman, 2009), as well as noninvasive neuroimaging methods such as functional magnetic resonance imaging (fMRI; Haxby et al., 2001), magnetoencephalography (MEG; Contini, Wardle, & Carlson, 2017; Carlson, Tovar, Alink, & Kriegeskorte, 2013) and electroencephalography (EEG; Kaneshiro, Guimaraes, Kim, Norcia, & Suppes, 2015; Simanova, Van Gerven, Oostenveld, & Hagoort, 2010) or a combination of them (Cichy, Pantazis, & Oliva, 2014). There has been great success in reading out or decoding neural representations of semantic object categories from neuroimaging data. However, it is still unclear if the conventional decoding analyses effectively detect the complex neural codes. Critically, one potential source of neural codes in high-temporal-resolution data (e.g., EEG) can be the “within-trial/window temporal variability” of EEG signals, which is generally ignored through temporal averaging in decoding. The use of such summarized “mean” activity can hide the true spatiotemporal dynamics of neural processes such as object category encoding, which is still debated in cognitive neuroscience (Grootswagers, Robinson, & Carlson, 2019; Majima et al., 2014; Karimi-Rouzbahani, Bagheri, & Ebrahimpour, 2017b; Isik, Meyers, Leibo, & Poggio, 2014; Cichy et al., 2014; Karimi-Rouzbahani, 2018).

Here, we quantitatively compare the information content and the temporal dynamics of a large set of features from EEG time series, each sensitive to a specific aspect of within-trial temporal variability. We then evaluate the relevance of these features by measuring how well each one predicts behavioral performance. Sensory neural codes are multiplexed structures containing information on different timescales and about different aspects of the sensory input (Panzeri, Brunel, Logothetis, & Kayser, 2010; Wark, Fairhall, & Rieke, 2009; Gawne, Kjaer, & Richmond, 1996). Previous animal studies have shown that the brain encodes the sensory information not only in the neural firing rates (i.e., average number of neural spikes within specific time windows), but also in more complex patterns of neural activity, such as millisecond-precise activity and phase (Kayser, Montemurro, Logothetis, & Panzeri, 2009; Victor, 2000; Montemurro, Rasch, Murayama, Logothetis, & Panzeri, 2008). It was shown that stimulus contrast was represented by latency coding at a temporal precision of about 10 ms, whereas the stimulus orientation and the spatial frequency were encoded at a coarser temporal precision (30 ms and 100 ms, respectively; Victor, 2000). It was shown that spike rates on 5 ms to 10 ms timescales carried complementary information to the phase of firing relative to low-frequency (1–8 Hz) local field potentials (LFPs) about epoch of naturalistic movie (Montemurro et al., 2008). Therefore, the temporal patterns and variabilities of neural activity are enriched platforms of neural codes.

Recent computational and experimental studies have proposed that neural variability provides a separate and additional channel to the mean activity for the encoding of general aspects of the sensory information—for example, its “uncertainty” and “complexity” (Orbán, Berkes, Fiser, & Lengyel, 2016; Garrett, Epp, Kleemeyer, Lindenberger, & Polk, 2020). Specifically, uncertainty about the stimulus features (e.g., orientations of lines in the image) was directly linked to neural variability in monkeys' visual area (Orbán et al., 2016) and human EEG (Kosciessa, Lindenberger, & Garrett, 2021): wider inferred range of possible feature combinations in the input stimulus corresponded to wider distribution of neural responses. This could be applied to both within- and across-trial variability (Orbán et al., 2016). Moreover, temporal variability was directly related to the complexity of input images: higher neural variability for house (i.e., more varied) versus face (i.e., less varied) images (Garrett et al., 2020) and provided a reliable measure of perceptual performance in behavior (Waschke, Tune, & Obleser, 2019). The uncertainty- and complexity-dependent modulation of neural variability, which is linked to the category of input information, has been suggested to facilitate neural energy saving and adaptive and effective encoding of the sensory inputs in changing environments (Garrett et al., 2020; Waschke, Kloosterman, Obleser, & Garrett, 2021).

Despite the richness of information encoded by neural variabilities, the unclear transformation of such neuronal codes into EEG activity has led to divergent approaches used for decoding information from EEG. For example, the information in neural firing rates might appear in phase patterns rather than amplitude of EEG oscillations (Ng, Logothetis, & Kayser, 2013). Generally three families of features have been extracted from EEG time series to detect neural codes from temporal variabilities (Waschke et al., 2021): variance-, frequency- and information theory-based features, each detecting specific aspects of variability. In whole-trial decoding, components of event-related potentials (ERPs) such as N1, P1, P2a, and P2b, which quantify time-specific variabilities of within-trial activation, have provided significant information about object categories (separately and in combination; Chan, Halgren, Marinkovic, & Cash, 2011; Wang, Xiong, Hu, Yao, & Zhang, 2012; Qin et al., 2016). Others successfully decoded information from more complex variance- and frequency-based features such as signal phase (Behroozi, Daliri, & Shekarchi, 2016; Watrous, Deuker, Fell, & Axmacher, 2015; Torabi, Jahromy, & Daliri, 2017; Wang, Wang, & Yu, 2018; Voloh, Oemisch, & Womelsdorf, 2020), signal power across frequency bands (Rupp et al., 2017; Miyakawa et al., 2018; Majima et al., 2014), time-frequency wavelet coefficients (Hatamimajoumerd & Talebpour, 2019; Taghizadeh-Sarabi, Daliri, & Niksirat, 2015), interelectrode temporal correlations (Karimi-Rouzbahani, Bagheri, & Ebrahimpour, 2017a), and information-based features (e.g., entropy; Joshi, Panigrahi, Anand, & Santhosh, 2018; Torabi et al., 2017; Stam, 2005). Therefore, the neural codes are generally detected from EEG activity using a wide range of features sensitive to temporal variability.

While insightful, previous studies have also posed new questions about the relative richness, temporal dynamics, and behavioral relevance of different features of neural variability. First, can the features sensitive to temporal variabilities provide additional category information to the conventional mean feature? While several of the above studies have compared multiple features (Chan et al., 2011; Taghizadeh-Sarabi et al., 2015; Torabi et al., 2016), none of them compared their results against the conventional mean activity, which is the dominant feature, especially in time-resolved decoding (Grootswagers, Wardle, & Carlson, 2017). This comparison will not only validate the richness of each feature of neural variability but will also show if the mean activity detects a large portion of the neural codes produced by the brain. We predicted that the informative neural variabilities, if properly decoded, should provide additional information to the mean activity, which overlook the analysis window.

Second, do the features sensitive to temporal variabilities evolve over similar time windows to the “mean” feature? Among all the studies mentioned above, only a few investigated the temporal dynamics of features, other than the mean in time-resolved decoding (Majima et al., 2014; Stewart, Nuthmann, & Sanguinetti, 2014; Karimi-Rouzbahani et al., 2017a), where the temporal evolution of information encoding is studied (Grootswagers et al., 2017). As distinct aspects of sensory information (e.g., contrast versus spatial frequency) are represented on different temporal scales (Victor, 2000; Montemurro et al., 2008) and different variability features are potentially sensitive to distinct aspects of variability, we might see differential temporal dynamics for different features.

Third, do the features sensitive to temporal variabilities explain the behavioral recognition performance more accurately than the mean feature? One important question, which was not covered in the above studies, was whether the extracted information was behaviorally relevant or just epiphenomenal to the experimental conditions. One way of validating the relevance of the extracted neural codes is to check if they could predict the relevant behavior (Williams, Dang, & Kanwisher, 2007; Grootswagers, Cichy, & Carlson, 2018; Woolgar, Dermody, Afshar, Williams, & Rich, 2019). We previously found that the decoding accuracies obtained from mean signal activations could predict the behavioral recognition performance (Ritchie, Tovar, & Carlson, 2015). However, it remains unknown whether (if at all) the information obtained from temporal variabilities can explain more variance of the behavioral performance. Our prediction was that as the more informative features access more of the potentially overlooked neural codes, they should also explain the behavioral performance more accurately.

In this study, we address the above questions to provide additional insights into what aspects of neural variabilities might reflect the neural codes more thoroughly and how we can extract them most effectively using multivariate decoding analyses.

The data sets used in this study and the code are available online at https://osf.io/wbvpn/. The EEG and behavioral data are available in Matlab .mat format and the code in Matlab .m format. All the open-source scripts used in this study were compared against other implementations of identical algorithms in simulations and used only if they produced identical results. All open-source implementation scripts of similar algorithms produced identical results in our simulations. To evaluate different implementations, we tested them using 1000 random (normally distributed with unit variance and zero mean) time series, each including 1000 samples.

2.1  Overview of Data Sets

We chose three previously published EEG data sets in this study which differed across a wide range of parameters including the recording setup (e.g., amplifier, number of electrodes, preprocessing steps), characteristics of the image-set (e.g., number of categories and exemplars within each category, colorfulness of images), and task (e.g., presentation length, order and the participants' task; see Table 1). All three data sets previously successfully provided object category information using multivariate analyses.

Table 1:

Details of the Three Data Sets Used in This Study.

Data Set# ElectrodesBandpass FilteringNotch Filtering# Object Categories# Stimulus RepetitionStimulus Presentation TimeStimulus Size (Periphery)TaskParticipants' AccuracyParticipants' Age (median)Participants' Gender
Karimi-Rouzbahani et al., 2017a  31 0.03–200 Hz 50 Hz 50 ms 213.5 (0.78.8Color matching (passive) 94.68% 22.1 7 male 3 female 
Karimi-Rouzbahani, Vahab, Ebrahimpour, & Menhaj, 2019  31 0.03–200 Hz 50 Hz 900 ms 8× 8 (0) Object category detection (active) 94.65% 26.4 6 male 4 female 
Kaneshiro et al., 2015  128 0.03–50 Hz No 12 500 ms 7.0× 6.5 (0) No task (fixation) N/A 30.5 7 male 3 female 
Data Set# ElectrodesBandpass FilteringNotch Filtering# Object Categories# Stimulus RepetitionStimulus Presentation TimeStimulus Size (Periphery)TaskParticipants' AccuracyParticipants' Age (median)Participants' Gender
Karimi-Rouzbahani et al., 2017a  31 0.03–200 Hz 50 Hz 50 ms 213.5 (0.78.8Color matching (passive) 94.68% 22.1 7 male 3 female 
Karimi-Rouzbahani, Vahab, Ebrahimpour, & Menhaj, 2019  31 0.03–200 Hz 50 Hz 900 ms 8× 8 (0) Object category detection (active) 94.65% 26.4 6 male 4 female 
Kaneshiro et al., 2015  128 0.03–50 Hz No 12 500 ms 7.0× 6.5 (0) No task (fixation) N/A 30.5 7 male 3 female 

2.1.1  Data Set 1

We previously collected data set 1 while participants were briefly (i.e., 50 ms) presented with gray-scale images from four synthetically generated 3D object categories (Karimi-Rouzbahani et al., 2017a). The objects underwent systematic variations in scale, positional periphery, in-depth rotation, and lighting conditions, which made perception difficult, especially in extreme variation conditions. Randomly ordered stimuli were presented in consecutive pairs (see Figure 1, top row). The participants' task was unrelated to object categorization; they pressed one of two predetermined buttons to indicate if the fixation dots, superimposed on the first and second stimuli, were the same or a different color (two-alternative forced choice).
Figure 1:

Paradigms of the data sets used in this study. Data set 1 (top row) presented two consecutive object images, each with a fixation dot. The participant's task was to indicate if the fixation dots were the same or different colors across the image pairs (passive task). Data set 2 (middle row) presented objects from the target and nontarget categories in sequences of 12 images. The participant's task was to indicate, for each image, if it was from the target or nontarget category (active task). Data set 3 (bottom row), presented sequences of object images from six categories. Participants did not have any specific tasks except for looking at the center of the image (no overt task). More details about the data sets in the relevant references are provided in Table 1.

Figure 1:

Paradigms of the data sets used in this study. Data set 1 (top row) presented two consecutive object images, each with a fixation dot. The participant's task was to indicate if the fixation dots were the same or different colors across the image pairs (passive task). Data set 2 (middle row) presented objects from the target and nontarget categories in sequences of 12 images. The participant's task was to indicate, for each image, if it was from the target or nontarget category (active task). Data set 3 (bottom row), presented sequences of object images from six categories. Participants did not have any specific tasks except for looking at the center of the image (no overt task). More details about the data sets in the relevant references are provided in Table 1.

2.1.2  Data Set 2

We collected data set 2 in an active categorization experiment, in which participants pressed a button if the presented object image was from a target category (go/no-go), which was cued at the beginning of each block of 12 stimuli (Karimi-Rouzbahani, Vahab, Ebrahimpour, & Menhaj, 2019; see Figure 1, middle row). The object images, which were cropped from photographs, were part of the well-established benchmark image set for object recognition developed by Kiani, Esteky, Mirpour, and Tanaka (2007). This image set has been previously used to extract object category information from both human and monkey brain using MEG (Cichy et al., 2014), fMRI (Cichy et al., 2014; Kriegeskorte et al., 2008), and single-cell electrophysiology (Kriegeskorte et al., 2008; Kiani et al., 2007).

2.1.3  Data Set 3

We also used another data set (data set 3), which was not collected in our lab. This data set was collected by Kaneshiro et al. (2015) on six sessions for each participant. We used the first session only because it could represent the whole data set (the next sessions were repetition of the same stimuli to increase the signal-to-noise ratio) and we preferred to avoid a potential effect of extended familiarity with the stimuli on neural representations. The EEG data were collected during passive viewing (participants had no task but to keep fixating on the central fixation cross; see Figure 1, bottom row) of six categories of objects with stimuli chosen from Kiani et al. (2007) as explained above. We used a preprocessed (i.e., bandpass-filtered in the range 0.03 to 50 Hz) version of the data set, which was available online.1

All three data sets were collected at a sampling rate of 1000 Hz. For data sets 1 and 2, only the trials that led to correct responses by participants were used in the analyses. Each data set consisted of data from 10 participants. Each object category in each data set included 12 exemplars. To make the three data sets as consistent as possible, we preprocessed them differently from their original papers. Specifically, the bandpass filtering range of data set 3 was 0.03 to 50 Hz, and we did not have access to the raw data to increase the upper cutting frequency to 200 Hz. Data sets 1 and 2 were bandpass-filtered in the range 0.03 to 200 Hz before the data were split into trials. We also applied 50 Hz notch filters to data sets 1 and 2 to remove line noise. Next, we generated different versions of the data by bandpass-filtering the data in delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), beta (12–16 Hz), and gamma (16–200 Hz) bands to see if there is any advantage for the suggested theta or delta frequency bands (Watrous et al., 2015; Behroozi et al., 2016; Wang, Wang, & Yu, 2018). We used finite-impulse-response (FIR) filters with 12 dB roll-off per octave for bandpass-filtering of data sets 1 and 2 and when evaluating the sub-bands of the three data sets. All the filters were applied before splitting the data into trials.

We did not remove artifacts (e.g., eye related and movement related) from the signals, as we and others have shown that sporadic artifacts have minimal effect in multivariate decoding (Grootswagers et al., 2017). To increase signal-to-noise ratios in the analyses, each unique stimulus was presented to the participants 3, 6, and 12 times in data sets 1, 2, and 3, respectively. Trials were defined in the time window from 200 ms before to 1000 ms after the stimulus onset to cover most of the range of event-related neural activations. The average prestimulus (-200–0 ms relative to the stimulus onset) signal amplitude was removed from each trial of the data. For more information about each data set, see Table 1 and the references to their original publications.

2.2  Features

EEG signals are generated by inhibitory and excitatory postsynaptic potentials of cortical neurons. These potentials extend to the scalp surface and are recorded through electrodes as amplitudes of voltage in units of microvolts. Researchers have been using different aspects of these voltage recordings to obtain meaningful information about human brain processes. The main focus of this study is to compare the information content of features that are sensitive to temporal variabilities of neural activations against the mean of activity within the analysis window, which is conventionally used in decoding analysis (Grootswagers et al., 2017). Below we explain the mathematical formulas for each feature used in this study. We also provide brief information about potential underlying neural mechanisms that can lead to the information content provided by each feature.

We classified the features into five classes based on their mathematical similarity to simplify the presentation of the results and their interpretations: moment, complexity, ERP, frequency domain, and multivalued features. However, the classification of the features is not strict, and the features might be classified based on other criteria and definitions. For example, complexity itself has different definitions (Tononi & Edelman, 1998), such as degree of randomness or degrees of freedom in a large system of interacting elements. There are also recent studies that split the variability features into the three categories of variance-, frequency- and information theory-based categories (Waschke et al., 2021). Therefore, each definition may exclude or include some of our features in the class. It is of note that we used only the features that were previously used to decode categories of evoked potentials from EEG signals through multivariate decoding analysis. Nonetheless, there are definitely other features available, especially those extracted from EEG time series collected during long-term monitoring of human neural representations in health and disorder (Fulcher & Jones, 2017). In presenting the features' formulas, we avoided repeating the terms from the first feature to the last one. Therefore, readers might need to go back a few steps or features to find the definitions of the terms. Note that in this study, the analyses are performed in either 1000 ms time windows (i.e. number of samples used for feature extraction: N=1000) in the whole-trial analysis or 50 ms time windows (N=50) in time-resolved analysis.

2.2.1  Moment Features

These features are the most straightforward and intuitive ones from which we might be able to extract information about neural processes. Mean, variance, skewness, and kurtosis are the first to fourth moments of EEG time series and can provide information about the shape of the signals and their deviation from stationarity which is the case in evoked potentials (Rasoulzadeh et al., 2017; Wong Galka, Yamashita, & Ozaki, 2006). These moments have been shown to be able to differentiate visually evoked responses (Pouryzdian & Erfanian, 2010; Alimardani, Cho, Boostani, & Hwang, 2018). The second to fourth moments are also categorized as variance-based features in recent studies (Waschke et al., 2021).

Mean. Mean amplitude of an EEG signal changes in proportion to the neural activation of the brain. It is by far the most common feature of the recorded neural activations used in analyzing brain states and cognitive processes in both univariate and multivariate analyses (Vidal et al., 2010; Hebart & Baker, 2018; Grootswagers et al., 2017; Karimi-Rouzbahani et al., 2019). In EEG, brain activation is reflected as the amplitude of the recorded voltage across each electrode and the reference electrode at specific time points. To calculate the mean feature, the first moment in statistics, the sample mean is calculated for each recorded EEG time series as
x¯=1Nt=1Nxt,
(2.1)
where x¯ is the mean of the N time samples contained in the analysis window and xt refers to the amplitude of the recorded sample at time point t. N can be as small as unity as in the case of time-resolved EEG analysis (Grootswagers et al., 2017) or so large that it can cover the whole trial in whole-trial analysis. Accordingly, we set N=1000 (1000 ms) and N=50 (50 ms) for the whole-trial and time-resolved decoding analyses, respectively.
Median. Compared to the mean feature, the median is less susceptible to outliers (e.g., spikes) in the time series, which might not come from neural activations but rather from artifacts caused by, for example, the recording hardware, preprocessing, or eye-blinks. The median is calculated as
Median(X)=XN2ifNisevenXN-12+XN+122ifNisodd,
(2.2)
where X is the ordered values of samples in the time series xt for t=1,,N.
Variance. The variance of an EEG signal is one of the simplest indicators showing how much the signal is deviated from stationarity, that is, from its original baseline statistical properties (Wong et al., 2006). It is a measure of signal variabilities (within trial here), has been shown to decline upon the stimulus onset potentially as a result of neural coactivation, and has provided information about object categories in a recent EEG decoding study (Karimi-Rouzbahani et al., 2017a). Variance is calculated as
σ2=1Nt=1N(xt-x¯)2.
(2.3)
Skewness. While variance is silent about the direction of the deviation from the mean, skewness, the third signal moment, measures the degree of asymmetry in the signal's probability distribution. In symmetric distribution (i.e., when samples are symmetric around the mean), skewness is zero. Positive and negative skewness indicates right- and left-ward tailed distribution, respectively. As the visually evoked ERP responses usually tend to be asymmetrically deviated in either a positive or negative direction, even after baseline correction (Mazaheri & Jensen, 2008), we assume that skewness should provide information about the visual stimulus if each category modulates the deviation of the samples differentially. Skewness is calculated as
γ1=1Nt=1Nxt-x¯σ3.
(2.4)
Kurtosis. Kurtosis reflects the degree of “tailedness” or “flattedness” of the signal's probability distribution. Accordingly, the more heaviness there is in the tails, the less value of the kurtosis and vice versa. Based on previous studies, Kurtosis has provided distinct representations corresponding to different classes of visually evoked potentials (Alimardani et al., 2018; Pouryzdian & Erfanian, 2010). We test to see if it plays a more generalized role in information coding (e.g., coding of semantic aspects of visual information) as well. It is the fourth standardized moment of the signal, defined as
Kurt=1Nt=1Nxt-x¯σ4.
(2.5)

2.2.2  Complexity Features

There can potentially be many cases in which simple moment statistics such as mean, median, variance, skewness, and kurtosis, which rely on distributional assumptions, provide equal values for distinct time series (e.g., series A: 10, 20, 10, 20, 10, 20, 10, 20 versus series B: 20, 20, 20, 10, 20, 10, 10, 10) for both of which the five features provide equal results. Therefore, we need more complex and possibly nonlinear measures that can detect subtle but meaningful temporal patterns from time series. The analysis of nonlinear signal features has recently been growing, following the findings showing that EEG reflects weak but significant nonlinear structures (Stam, 2005; Stêpieñ, 2002). Importantly, many studies have shown that the complexity of EEG time series can significantly alter during cognitive tasks such as visual (Bizas et al., 1999) and working memory tasks (Sammer, 1999; Stam, 2000). Therefore, it was necessary to evaluate the information content of nonlinear features for our decoding of object categories. As mentioned above, the grouping of these nonlinear features as “complexity” here is not strict, and the features included in this class are those that capture complex and nonlinear patterns across time series. Although the accurate detection of complex and nonlinear patterns generally needs more time samples compared to linear patterns (Procaccia, 1988), it has been shown that nonlinear structures can be detected from short EEG time series as well (i.e., through fractal dimensions; Preissl, Lutzenberger, Pulvermüller, & Birbaumer, 1997). Nonetheless, we extract these features from both time-resolved (50 samples) and whole-trial data (1000 samples) to ensure we do not miss potential information represented in longer temporal scales.

Lempel-Ziv complexity (LZ Cmplx). Lempel-Ziv complexity measures the complexity of time series (Lempel & Ziv, 1976). Basically, the algorithm counts the number of unique sub-sequences within a larger binary sequence. Accordingly, a sequence of samples with a certain regularity does not lead to a large LZ complexity. However, the complexity generally grows with the length of the sequence and its irregularity. In other words, it measures the generation rate of new patterns along a digital sequence. In a comparative work, it was shown that compared to many other frequency metrics of time series (e.g., noise power, stochastic variability), LZ complexity has the unique feature of providing a scalar estimate of the bandwidth of time series and the harmonic variability in quasi-periodic signals (Aboy, Hornero, Abásolo, & Álvarez, 2006). It is widely used in biomedical signal processing and has provided successful results in the decoding of visual stimuli from neural responses in primary visual cortices (Szczepański, Amigó, Wajnryb, & Sanchez-Vives, 2003). We used the code by Quang Thai2 implemented based on “exhaustive complexity,” which is considered to provide the lower limit of the complexity as explained by Lempel and Ziv (1976). We used the signal median as a threshold to convert the signals into binary sequences for the calculation of LZ complexity. The LZ complexity provided a single value for each signal time series.

Fractal dimension. In signal processing, fractal is an indexing technique that provides statistical information about the complexity of time series. A higher fractal value indicates more complexity for a sequence as reflected in more nesting of repetitive sub-sequences at all scales. Fractal dimensions are widely used to measure two important attributes: self-similarity and the shape of irregularity. A growing set of studies has been using fractal analyses for the extraction of information about semantic object categories (such as living and nonliving categories of visual objects; Ahmadi-Pajouh, Ala, Zamanian, Namazi, & Jafari, 2018; Torabi et al., 2017), as well as simple checkerboard patterns (Namazi, Ala, & Bakardjian, 2018) from visually evoked potentials. In this study, we implemented two of the common methods for the calculation of fractal dimensions of EEG time series, which have been previously used to extract information about object categories as explained below. We used the implementations by Jesús Monge Álvarez3 for fractal analysis.

In Higuchi's fractal dimension (Higuchi FD; Higuchi, 1988), a set of sub-sequences xkm is generated in which k and m refer to the step size and initial value, respectively. Then the length of this fractal dimension is calculated as
Lkm=i=1N-mk|x(m+ik)-x(m+(i-1).k)|N-1N-mk.kk,
(2.6)
where N-1N-mk.k is the normalization factor The length of the fractal curve at step size of k is calculated by averaging k sets of Lkm. Finally the resultant average will be proportional to k-D where D is the fractal dimension. We set the free parameter of k equal to half the length of signal time series in the current study.
We also calculated fractal dimension using Katz's method (Katz FD; Katz, 1988) as it showed a significant amount of information about object categories in a previous study (Torabi et al., 2017). The fractal dimension (D) is calculated as
D=log10Lalog10da=log10rlog10dL+log10r,
(2.7)
where L and a refer to the sum and average of the consecutive signal samples, respectively. Also d refers to the maximum distance between first sample the ith sample of the signal, which has the maximum distance from the first sample as
L=i=2N|xi-xi-1|,
(2.8)
d=max(distance(1,i)),
(2.9)
r=L/a.
(2.10)
Hurst exponent. The Hurst exponent (Hurst Exp) is widely used to measure long-term memory in time-dependent random variables such as biological time series (Racine, 2011). In other words, it measures the degree of interdependence across samples in the time series and operates like an autocorrelation function over time. Hurst values between 0.5 and 1 suggest the consecutive appearance of high signal values on large timescales while values between 0 and 0.5 suggest frequent switching between high and low signal values. Values around 0.5 suggest no specific patterns among samples of a time series. It is defined as an asymptotic behavior of a rescaled range as a function of the time span of the time series defined as
Emax(z1,z2,,zN)-min(z1,z2,,zN)1Nt=1N(xt-x¯)2=C.NHasN,
(2.11)
zt=i=1tyi;t=1,,N,
(2.12)
yt=xt-x¯,
(2.13)
where E is the expected value, C is a constant and H is the Hurst exponent (Racine, 2011) We used the open-source implementation of the algorithm,4 which has also been used previously for the decoding of object category information in EEG (Torabi et al., 2017).

Entropy. Entropy can measure the perturbation in time series (Waschke et al., 2021). A higher value for entropy suggests a higher irregularity in the given time series. Precise calculation of entropy usually requires a considerable number of samples and is also sensitive to noise. Here we used two methods for the calculation of entropy, each of which has advantages over the other.

Approximate entropy (Apprx Ent) was initially developed to be used for medical data analysis (Pincus & Huang, 1992), such as heart rate, and then was extended to other areas such as brain data analysis. It has the advantage of requiring a low computational power, which makes it perfect for real-time applications on low sample sizes (<50). However, the quality of this entropy method is impaired on lower lengths of the data. This metric detects changes in episodic behavior, which are not represented by peak occurrences or amplitudes (Pincus & Huang, 1992). We used an open-source code5 for calculating approximate entropy. We set the embedded dimension and the tolerance parameters to 2% and 20% of the standard deviation of the data, respectively, to roughly follow a previous study (Shourie, Firoozabadi, & Badie, 2014), which compared approximate entropy in visually evoked potentials and found differential effects across artist versus nonartist participants when looking at paintings.

Sample entropy (Sample Ent), a refinement of the approximate entropy, is frequently used to calculate the regularity of biological signals (Richman & Moorman, 2000). Basically, it is the negative natural logarithm of the conditional probability that two sequences (subset of samples) that are similar for m points remain similar at the next point. A lower sample entropy also reflects a higher self-similarity in the time series. It has two main advantages to the approximate entropy: it is less sensitive to the length of the data and is simpler to implement. However, it does not focus on self-similar patterns in the data. We used the Matlab entropy function for the extraction of this feature, which has already provided category information in a previous study (Torabi et al., 2017). (See Richman & Moorman, 2000, and Subha, Joseph, Acharya, & Lim, 2010, for the details of the algorithm.)

Autocorrelation. Autocorrelation (Autocorr) determines the degree of similarity between the samples of a given time series and a time-lagged version of the same series. It detects periodic patterns in signals, which is an integral part of EEG time series. Therefore, following recent successful attempts in decoding neural information using the autocorrelation function from EEG signals (Wairagkar, Zoulias, Oguntosin, Hayashi, & Nasuto, 2016), we evaluated the information content of the autocorrelation function in decoding visual object categories. As neural activations reflect many repetitive patterns across time, the autocorrelation function can quantify the information contents of those repetitive patterns. Autocorrelation is calculated as
R(τ)=1(N-τ)σ2t=1N-τ(xt-x¯)(xt+τ-x¯),
(2.14)
where τ indicates the number of lags in samples of the shifted signal. A positive value for autocorrelation indicates a strong relationship between the original time series and its shifted version, whereas a negative autocorrelation refers to an opposite pattern between them. Zero autocorrelation indicates no relationship between the original time series and its shifted version. In this study, we extracted autocorrelations for 30 consecutive lags ([τ=1,2,,30]) and used their average in classification. Note that each lag refers to 1 ms as the data were sampled at 1000 Hz.
Hjorth parameters. These are descriptors of statistical properties of signals introduced by Hjorth (1970). These parameters are widely used in EEG signal analysis for feature extraction across a wide set of applications including visual recognition (Joshi et al., 2018; Torabi et al., 2017). These features consist of activity, mobility, and complexity as defined below. As the activity parameter is equivalent to the signal variance, which we already Hjorth complexity (Hjorth Cmp) determines the variation in time series' frequency by quantifying the similarity between the signal and a pure sine wave leading to a value of one in case of perfect match In other words, values around one suggest lower complexity for a signal. It is calculated as
Complexity=Mobility(dxtdt)Mobility(xt).
(2.15)
Hjorth mobility (Hjorth Mob) determines the proportion of standard deviation of the power spectrum as is calculated below, where var refers to the signal variance:
Mobility=vardxtdtvar(xt).
(2.16)

2.2.3  ERP Components (N1, P1, P2a, and P2b)

An ERP is a measured brain response to a specific cognitive, sensory, or motor event that provides an approach to studying the correlation between the event and neural processing. According to the latency and amplitude, ERP is split into specific subwindows called components. Here, we extracted ERP components by calculating the mean of signals in specific time windows to obtain the P1 (80–120 ms), N1 (120–200 ms), P2a (150–220 ms), and P2b (200–275 ms) components, which were shown previously to provide significant amounts of information about visual object and face processing in univariate (Rossion et al., 2000; Rousselett, Husk, Bennett, & Sekuler, 2007) and multivariate analyses (Chan et al., 2011; Jadidi, Zargar, & Moradi, 2016; Wang et al., 2012). As these components are calculated in limited and specific time windows, in the whole-trial analysis, they reflect the mean of activity in their specific time windows, rather than the whole post-stimulus window. They will be also absent from time-resolved analyses by definition.

2.2.4  Frequency-Domain Features

Neural variability is commonly analyzed in frequency domain by calculating spectral power across frequency bands. Specifically, as data transformation from time to frequency domain is almost lossless using Fourier transform, oscillatory power basically reflects frequency-specific variance (with the total power reflecting the overall variance of the time series; Waschke et al., 2021). Motivated by previous studies showing signatures of object categories in the frequency domain (Behroozi et al., 2016; Rupp et al., 2017; Iranmanesh & Rodriguez-Villegas, 2017; Joshi et al., 2018; Jadidi et al., 2016) and the representation of temporal codes of visual information in the frequency domain (Eckhorn et al., 1988), we also extracted frequency-domain features to see if they could provide additional category-related information to time-domain features. It is of note that while the whole-trial analysis allows us to compare our results with previous studies, the evoked EEG potentials are generally nonstationary (i.e., their statistical properties change along the trial) and potentially dominated by low-frequency components. Therefore, the use of time-resolved analysis, which looks at more stationary subwindows of the signal (e.g., 50 samples here), will allow us to detect subtle high-frequency patterns of neural codes.

Signal power (Signal Pw). Power spectrum density (PSD) represents the intensity or the distribution of the signal power into its constituent frequency components. This feature was motivated by previous studies showing associations between aspects of visual perception and power in certain frequency bands (Rupp et al., 2017; Behroozi et al., 2016; Majima et al., 2014). According to the Fourier analysis, signals can be broken into their constituent frequency components or a spectrum of frequencies in a specific frequency range. Here, we calculated signal power using the PSD as in
S˜xx(w)=(Δt)2Tn=1Nxne-iwnΔt2,
(2.17)
where xn=xnΔt is signal sampled at a rate of T=1Δt and w is the frequency at which the signal power is calculated. As signal power is a relatively broad term, including the whole power spectrum of the signal, we also extracted a few more parameters from the signal frequency representation to see what specific features in the frequency domain (if any) can provide information about object categories.
Mean frequency (Mean Freq). Motivated by the successful application of mean and median frequencies in the analysis of EEG signals and their relationship to signal components in the time domain (Intrilligator & Polich, 1995; Abootalebi, Moradi, & Khalilzadeh, 2009), we extracted these two features from the signal power spectrum to obtain a more detailed insight into the neural dynamics of category representations. Mean frequency is the average of the frequency components available in a signal. Assume a signal consisting of two frequency components of f1 and f2. The mean frequency of this signal is fmean=f1+f22. Generally the mean normalized (by the intensity) frequency is calculated using the following formula,
fmean=i=0nlifii=0nli,
(2.18)
where n is the number of splits of the PSD and fi and li are the frequency and the intensity of the PSD in its ith slot, respectively It was calculated using Matlab meanfreq function.

Median frequency (Med Freq). This is the median normalized frequency of the power spectrum of a time-domain signal. It is calculated similar to the signal median in the time domain; however, here the values are the power intensity in different frequency bins of the PSD. This feature was calculated using Matlab medfreq function.

Power and phase at median frequency (Pw MdFrq and Phs MdFrq). Interestingly, apart from the median frequency itself, which reflects the frequency aspect of the power spectrum, the power and phase of the signal at the median frequency have also been shown to be informative about aspects of human perception (Joshi et al., 2018; Jadidi et al., 2016). Therefore, we also calculated the power and phase of the frequency-domain signals at the median frequency as features.

Average frequency (Avg Freq). Evoked potentials show a few positive and negative peaks after the stimulus onset, and they might show deviation in the positive or negative directions depending on the information content (Mazaheri & Jensen, 2008). Therefore, we also evaluated the average (zero-crossing) frequency of the ERPs by counting the number of times the signal swapped signs during the trial. Note that each trial is baselined according to the average amplitude of the same trial in the last 200 ms immediately before the stimulus onset. We calculated the average frequency on the poststimulus time window.

Spectral edge frequency (SEF 95%). This is a common feature used in monitoring the depth of anesthesia and stages of sleep using EEG (Iranmanesh & Rodriguez-Villegas, 2017). It measures the frequency that covers X percent of the PSD. X is usually set in the range of 75% to 95%. Here we set X to 95%. Therefore, this reflects the frequency observed in a signal that covers 95% of a signal power spectrum.

2.2.5  Multivalued Features

The main hypothesis of this study is that we can potentially obtain more information about object categories as well as behavior if we take into account the temporal variability of neural activity within the analysis window (i.e., trial) rather than averaging the samples as in conventional decoding analyses. While the above variability-sensitive features return a single value from each individual time series (analysis window), a more flexible feature would allow as many informative patterns to be detected from an individual time series. Therefore, we extracted other features, which provide more than one value per analysis window, so that we can select the most informative values from across electrodes and time points simultaneously (see “Dimensionality reduction” below). We also included the original magnitude data as our reference feature, so that we know how much (if at all) our feature extraction and selection procedures improved decoding.

Interelectrode correlation (Cross Corr). Following up on recent studies that have successfully used interarea correlation in decoding object category information from EEG activations (Majima et al., 2014; Karimi-Rouzbahani et al., 2017a; Tafreshi, Daliri, & Ghodousi, 2019), we extracted interelectrode correlation to measure the similarity between pairs of signals—here, from different pairs of electrodes. This feature of correlated variability quantifies covariability of neural activations across pairs of electrodes. Although closer electrodes tend to provide more similar (and therefore correlated) activation, compared to further electrodes (Hacker, Snyder, Pahwa, Corbetta, & Leuthardt, 2017), the interelectrode correlation can detect correlations that are functionally relevant and are not explained by the distance (Karimi-Rouzbahani et al., 2017a). This feature detects similarities in temporal patterns of fluctuations across time between pairs of signals, which is calculated as
Rxy=1Nσxσyt=1N(xt-x¯)(yt-y¯),
(2.19)
where x and y refer to the signals obtained from electrodes x and y, respectively. We calculated the cross-correlation between each electrode and all the other electrodes to form a cross-correlation matrix. Accordingly, we initially obtained all the unique possible pairwise interelectrode correlations (465, 465, and 8128 unique values for data sets 1, 2, and 3, respectively), which were then reduced in dimension using PCA to the equal number of dimensions obtained for single-valued features.
Wavelet transform (wavelet). Recent studies have shown remarkable success in decoding object categories using the wavelet transformation of the EEG time series (Taghizadeh-Sarabi et al., 2015; Torabi et al., 2017). Considering the time- and frequency-dependent nature of ERPs, wavelet transform seems to be a reasonable choice as it provides a time-frequency representation of signal components. It determines the primary frequency components and their temporal position in time series. The transformation passes the signal time series through digital filters (Guo, Rivero, Seoane, & Pazos, 2009; see equation 2.20) using the convolution operator, each of which adjusted to extract a specific frequency (scale) at a specific time as in
yn=x*g=k=-+xkgn-k,
(2.20)
where g is the digital filter and * is the convolution operator. This filtering procedure is repeated for several rounds (levels) filtering low- (approximations) and high-frequency (details) components of the signal to provide more fine-grained information about the constituent components of the signal. This can lead to coefficients that can potentially discriminate signals evoked by different conditions. Following up on a previous study (Taghizadeh-Sarabi et al., 2015) and to make the number of wavelet features comparable in number to signal samples, we used detail coefficients at five levels, D1,,D5, as well as the approximate coefficients at level 5, A5. This led to 1015 and 57 features in the full trial and in the 50 ms sliding time windows, respectively. We used the Symlet2 basis function for our wavelet transformations as implemented in Matlab.
Hilbert transform (Hilb Amp and Hilb Phs). Hilbert transform provides amplitude and phase information about the signal and has recently shown successful results in decoding visual letter information from ERPs (Wang et al., 2018). The phase component of the Hilbert transform can qualitatively provide the spatial information obtained from the wavelet transform, leading to their similarity evaluating neuronal synchrony (Le Van Quyen et al., 2001). However, it is still unclear which method can detect category-relevant information from the nonstationary ERP components more effectively. Hilbert transform is described as a mapping function that receives a real signal xt (as defined above), and upon convolution with the function 1πt produces another function of a real variable H(x)(t) as
H(x)(t)=1n-+xτt-τdτ,
(2.21)
where H(x)(t) is a frequency-domain representation of the signal xt, which has simply shifted all the components of the input signal by π2. Accordingly, it produces one amplitude and one phase component per sample in the time series. In the current study, Hilbert transform was applied on 1000 and 50 samples in the whole-trial and time-resolved analysis, respectively. We used the amplitude and phase components separately to discriminate object categories in the analyses.

Amplitude and phase locking (Amp Lock and Phs Lock). Although interelectrode correlated variability (Cross Corr), which is interpreted as inter-area connectivity, has successfully provided object category information (Majima et al., 2014; Karimi-Rouzbahani et al., 2017a), previous studies suggested that neural communication is realized through amplitude and phase locking and coupling (Bruns, Eckhorn, Jokeit, & Ebner, 2000; Siegel, Donner, & Engel, 2012; Engel, Gerloff, Hilgetag, & Nolte, 2013). More recently, researchers have quantitatively shown that amplitude and phase locking detect distinct signatures of neural communication across time and space from neural activity (Siems & Siegel, 2020; Mostame & Sadaghiani, 2020). Therefore, in line with recent studies, which successfully decoded object categories using inter-area-correlated variability as neural codes (Tafreshi et al., 2019), we extracted amplitude and phase locking as two major connectivity features that might contain object category information as well. Briefly, amplitude locking refers to the coupling between the envelopes of two signals (electrodes) and reflects the correlation of activation amplitude. To estimate the amplitude locking between two signals, we extracted the envelopes of the two signals using Hilbert transform (Gabor, 1946; explained below), then estimated the Pearson correlation between the two resulting envelopes as amplitude locking.

Phase locking refers to the coupling between the phases of two signals and measures the synchronization of rhythmic oscillation cycles. To measure phase locking, we used one of the simplest implementations, the phase-locking value (PLV), which includes minimal mathematical assumptions (Bastos & Schoffellen, 2016), calculated as
PLV=1Ni=1NeΔΦi,
(2.22)
where N is the number of trials and ΔΦ is the phase difference between the signals to electrode pairs. As we used multivariate decoding without any trial averaging, N was equal to one here. The calculation of amplitude and phase locking was performed on all electrode pairs leading to 465 and 8128 unique numbers for the 31- (data sets 1 and 2) and 128-electrode (data set 3) data sets before dimension reduction was performed.

Original magnitude data (Orig Mag). We also used the poststimulus original magnitude data (1000 or 50 samples for the whole-trial and sliding time windows, respectively) to decode object category information without any feature extraction. This provided a reference to compare the information content of the mean and variability features to see if the former provided any extra information.

2.3  Multivariate Decoding

We used multivariate decoding to extract information about object categories from our EEG data sets. Basically, multivariate decoding, which has been dominating neuroimaging studies recently (Haynes & Rees, 2006; Grootswagers et al., 2017; Hebart & Baker, 2018), measures the cross-condition dissimilarity or contrast to quantify information content in neural representations. We used linear discriminant analysis (LDA) classifiers in multivariate analysis to measure the information content across all possible pairs of object categories within each data set. Specifically, we trained and tested the classifiers on animal versus car, animal versus face, animal versus plane, car versus plane, face versus car and plane versus face categories, then averaged the six decoding results and reported them for each participant.

The LDA classifier has been shown to be robust when decoding object categories from M/EEG (Grootswagers et al., 2017, 2019), has provided higher decoding accuracies than Euclidean distance and correlation-based decoding methods (Carlson et al., 2013), and was around 30 times faster to train in our initial analyses compared to the more complex classifier of support vector machines (SVM). We ran our initial analysis and found similar results for the LDA and SVM and used LDA to save the time. We used a 10-fold cross-validation procedure in which we trained the classifier on 90% of the data and tested it on the left-out 10% of the data, repeating the procedure 10 times until all trials from the pair of categories participated once in the training and once in the testing of the classifiers. We repeated the decoding across all possible pairs of categories within each data set, which were 6, 6, and 15 pairs for data sets 1, 2, and 3, which consisted of 4, 4, and 6 object categories, respectively. Finally, we averaged the results across all combinations and reported them as the average decoding for each participant.

In the whole-trial analyses, we extracted the above-mentioned features from the 1000 data samples after the stimulus onset (from 1 to 1000 ms). In the time-resolved analyses, we extracted the features from 50 ms sliding time windows in steps of 5 ms across the time course of the trial (-200 to 1000 ms relative to the stimulus onset time). Therefore, in time-resolved analyses, the decoding rates at each time point reflect the results for the 50 ms window around the time point, from -25 to +24 ms relative to the time point. Time-resolved analyses allowed us to evaluate the evolution of object category information across time as captured by different features.

2.4  Dimensionality Reduction

The multivalued features (e.g. interelectrode correlation, wavelet, Hilbert amplitude and phase, amplitude and phase locking, and original magnitude data) resulted in more than a single feature value per trial per sliding time window. This could provide higher decoding values compared to the decoding values obtained from single-valued features merely because of including a higher number of features. Moreover, when the features outnumber the observations (trials here), the classification algorithm can overfit to the data (Hart, Stork, & Duda, 2000). Therefore, to obtain comparable decoding accuracies across single-valued and multivalued features and to avoid potential overfitting of classifier to the data we used principal component analysis (PCA) to reduce the dimension of the data in multivalued features. Accordingly, we reduced the number of the values in the multivalued features to one per time window per trial, which equaled the number of values for the single-valued features.

To avoid potential leakage of information from testing to training (Pulini, Kerr, Loo, & Lenartowicz, 2019), we applied the PCA algorithm on the training data (folds) only and used the training PCA parameters (eigenvectors and means) for both training and testing sets for dimension reduction in each cross-validation run separately. We applied the dimension-reduction procedure only on the multivalued features. Note that we did not reduce the dimension of the neural space (columns in the dimension-reduced data matrix) to below the number of electrodes “e” (opposite of Hatamimajoumerd, Talebpour, & Mohsenzadeh, 2020) as we were interested in qualitatively comparing our results with the vast literature currently using multivariate decoding with all sensors (Grootswagers et al., 2017; Hebart & Baker, 2018). Also, we did not aim at finding more than one feature per trial, per time window, as we wanted to compare the results of multivalued features with those of single-valued features, which had only a single value per trial, per time window.

One critical point here is that we applied the PCA on the concatenated data from all electrodes and values obtained from each individual feature (e.g., wavelet coefficients in wavelet) within each analysis window (e.g., 50 ms in time-resolved decoding). Therefore, for the multivalued features, the “e” selected dimensions, were the most informative spatial and temporal patterns detected across both electrodes and time samples. Therefore, it could be the case that within a given time window, two of the selected dimensions were from the same electrode (because two elements from the same electrode were more informative than the other electrode), which would lead to some electrodes not having any representatives among the selected dimensions. This is in contrast to the single-valued features (e.g., mean) from which we obtained only one value per analysis window per electrode, limiting the features to only the spatial patterns within the analysis window, rather than both spatial and temporal patterns.

2.5  Statistical Analyses

2.5.1  Bayes Factor Analysis

As in our previous studies (Grootswagers et al., 2019; Robinson, Grootswagers, & Carlson, 2019), to determine the evidence for the null and the alternative hypotheses, we used Bayes analyses as implemented by Bart Krekelberg based on Rouder, Morey, Speckman, and Province (2012). We used standard rules of thumb for interpreting levels of evidence (Lee & Wagenmakers, 2005; Dienes, 2014): Bayes factors of >10 and <1/10 were interpreted as strong evidence for the alternative and null hypotheses, respectively, and >3 and <1/3 were interpreted as moderate evidence for the alternative and null hypotheses, respectively. We considered the Bayes factors that fell between 3 and 1/3 as suggesting insufficient evidence either way. In the whole-trial decoding analyses, we asked whether there was a difference between the decoding values obtained from all possible pairs of features and also across frequency bands within every feature. Accordingly, we performed the Bayes factor analysis and calculated the Bayes factors as the probability of the data under an alternative (i.e., difference) relative to the null (i.e., no difference) hypothesis between all possible pairs of features and also across frequency bands within every feature and data set separately. The same procedure was used to evaluate evidence for difference (i.e., alternative hypothesis) or no difference (i.e., null hypothesis) in the maximum and average decoding accuracies and the time of maximum and above-chance decoding accuracies across features for each data set separately.

We also evaluated evidence for the alternative of above-chance decoding accuracy versus the null hypothesis of no difference from chance. For that purpose, we performed Bayes factor analysis between the distribution of actual accuracies obtained and a set of 1000 accuracies obtained from random permutation of class labels across the same pair of conditions (null distribution) on every time point (or only once for the whole-trial analysis), for each feature and data set separately. No correction for multiple comparisons was performed when using Bayes factors as they are much more conservative than frequentist analysis in providing false claims with confidence (Gelman & Tuerlinckx, 2000; Gelman, Hill, & Yajima, 2012). The reason for the lower susceptibility of Bayesian analysis compared to classical statistics is the use of priors, which, if chosen properly (here using the data-driven approach developed by Rouder et al. (2012), significantly reduce the chance of making type I (false-positive) errors.

The priors for all Bayes factor analyses were determined based on Jeffrey-Zellner-Siow priors (Jeffreys, 1998; Zellner & Siow, 1980), which are from the Cauchy distribution based on the effect size that is initially calculated in the algorithm (Rouder et al., 2012). The priors are data driven and have been shown to be invariant with respect to linear transformations of measurement units (Rouder et al., 2012), which reduces the chance of being biased toward the null or alternative hypotheses.

2.5.2  Random Permutation Testing

To evaluate the significance of correlations between decoding accuracies and behavioral reaction times, we calculated the percentage of the actual correlations that were higher (when positive) or lower (when negative) than a set of 1000 randomly generated correlations. These random correlations were generated by randomizing the order of participants' data in the behavioral reaction time vector (null distribution) for every time point and feature separately. The true correlation was considered significant if it surpassed 95% of the randomly generated correlations in the null distribution in either positive or negative directions (p<0.05) and the p-values were corrected for multiple comparisons across time using Matlab's mafdr function which works based on fix rejection region (Storey, 2002).

To check the information content of different features of the EEG activity about object categories, we performed multivariate pattern decoding on both the whole-trial as well as time-resolved data. The whole-trial analysis was aimed at providing results comparable to previous studies, most of which performed whole-trial analysis. The time-resolved analysis, however, was the main focus of our study and allowed us to check the information and temporal dynamics of variability-based neural codes as captured by different features. In Figures 2 and 3, we present a summary of the results with emphasis on the comparison between the time-specific ERP components, the most informative features detecting neural variability (Wavelet and Orig Mag), and the conventional mean feature, which ignores potential information in neural variabilities. The complete comparison between the 32 features are provided in supplementary materials but briefly explained in this letter.
Figure 2:

Whole-trial decoding of object categories in the three data sets across the broad band and different frequency bands (A) with their Bayesian analyses (B). The results are presented only for features of mean, ERP components, wavelet, and Orig Mag. For full results including other features, see Supplementary Figures 1 and 2. (A) The black horizontal dashed lines on the top panels refer to chance-level decoding. Thick bars show the average decoding across participants (error bars standard error across participants). Bayes factors are shown in the bottom panel of each graph. Filled circles show moderate to strong evidence for either hypothesis, and empty circles indicate insufficient evidence. They show the results of Bayes factor analysis when evaluating the difference from chance-level decoding. (B) Top panel: Bayes matrices compare the decoding results within each frequency band across features separated by data sets. Bottom panel: Bayes matrices compare decoding results across different frequency bands and data set separately. Colors indicate different levels of evidence for existing difference (moderate 3 < BF < 10, orange; strong BF > 10, yellow), no difference (moderate 0.1 < BF < 0.3, light blue; strong BF < 0.1, dark blue) or insufficient evidence (1 < BF < 3 green; 0.3 < BF < 1 cyan) for either hypotheses. For example, for data set 1, there is strong evidence for higher decoding values for the N1 feature in the theta and alpha bands than in the gamma band, as indicated by the red box.

Figure 2:

Whole-trial decoding of object categories in the three data sets across the broad band and different frequency bands (A) with their Bayesian analyses (B). The results are presented only for features of mean, ERP components, wavelet, and Orig Mag. For full results including other features, see Supplementary Figures 1 and 2. (A) The black horizontal dashed lines on the top panels refer to chance-level decoding. Thick bars show the average decoding across participants (error bars standard error across participants). Bayes factors are shown in the bottom panel of each graph. Filled circles show moderate to strong evidence for either hypothesis, and empty circles indicate insufficient evidence. They show the results of Bayes factor analysis when evaluating the difference from chance-level decoding. (B) Top panel: Bayes matrices compare the decoding results within each frequency band across features separated by data sets. Bottom panel: Bayes matrices compare decoding results across different frequency bands and data set separately. Colors indicate different levels of evidence for existing difference (moderate 3 < BF < 10, orange; strong BF > 10, yellow), no difference (moderate 0.1 < BF < 0.3, light blue; strong BF < 0.1, dark blue) or insufficient evidence (1 < BF < 3 green; 0.3 < BF < 1 cyan) for either hypotheses. For example, for data set 1, there is strong evidence for higher decoding values for the N1 feature in the theta and alpha bands than in the gamma band, as indicated by the red box.

Figure 3:

Time-resolved decoding of object categories from the three data sets for three of the target features (A) and their extracted timing and amplitude parameters (B–E). (A) Top section in each panel shows the decoding accuracies across time, and the bottom section shows the Bayes factor evidence for the difference of the decoding accuracy compared to chance-level decoding. The solid lines show the average decoding across participants and the shaded area the standard error across participants. The horizontal dashed lines on the top panel refer to chance-level decoding. Filled circles in the Bayes factors show moderate to strong evidence for either difference or no difference from chance level or across features, and empty circles indicate insufficient evidence for either hypotheses. (B) Timing and amplitude parameters extracted from the time-resolved accuracies in panel A. (B–E) Left: The maximum and average decoding accuracies, the time of maximum, and the first above-chance decoding. The horizontal dashed lines refer to chance-level decoding. Thick bars show the average across participants (error bars standard error across participants). Bottom sections on panels B and C show the Bayes factor evidence for the difference of the decoding accuracy compared to chance-level decoding. Right: Matrices compare the parameters obtained from different features. Different levels of evidence for existing difference (moderate 3 < BF < 10, orange; strong BF > 10, yellow), no difference (moderate 0.1 < BF < 0.3, light blue; strong BF < 0.1, dark blue), or insufficient evidence (1 < BF < 3 green; 0.3 < BF < 1 cyan) for either hypotheses. Filled circles in the Bayes factors show moderate to strong evidence for either hypothesis, and open circles indicate insufficient evidence. Single and double stars indicate moderate and strong evidence for difference between the parameters obtained from decoding curves of the three features.

Figure 3:

Time-resolved decoding of object categories from the three data sets for three of the target features (A) and their extracted timing and amplitude parameters (B–E). (A) Top section in each panel shows the decoding accuracies across time, and the bottom section shows the Bayes factor evidence for the difference of the decoding accuracy compared to chance-level decoding. The solid lines show the average decoding across participants and the shaded area the standard error across participants. The horizontal dashed lines on the top panel refer to chance-level decoding. Filled circles in the Bayes factors show moderate to strong evidence for either difference or no difference from chance level or across features, and empty circles indicate insufficient evidence for either hypotheses. (B) Timing and amplitude parameters extracted from the time-resolved accuracies in panel A. (B–E) Left: The maximum and average decoding accuracies, the time of maximum, and the first above-chance decoding. The horizontal dashed lines refer to chance-level decoding. Thick bars show the average across participants (error bars standard error across participants). Bottom sections on panels B and C show the Bayes factor evidence for the difference of the decoding accuracy compared to chance-level decoding. Right: Matrices compare the parameters obtained from different features. Different levels of evidence for existing difference (moderate 3 < BF < 10, orange; strong BF > 10, yellow), no difference (moderate 0.1 < BF < 0.3, light blue; strong BF < 0.1, dark blue), or insufficient evidence (1 < BF < 3 green; 0.3 < BF < 1 cyan) for either hypotheses. Filled circles in the Bayes factors show moderate to strong evidence for either hypothesis, and open circles indicate insufficient evidence. Single and double stars indicate moderate and strong evidence for difference between the parameters obtained from decoding curves of the three features.

3.1  Can the Features Sensitive to Temporal Variabilities Provide Additional Category Information to the Conventional Mean Feature?

To answer this first question, we compared decoding accuracies in the whole-trial time span (0 to 1000 ms relative to stimulus onset) across all features and for each data set separately (see the complete results in Supplementary Figures 1 and 2 and summary results in Figure 2, black bars). There was not enough (BF > 3) evidence for above-chance decoding for majority of features (e.g., moment features, complexity, and frequency-domain features; see Supplementary Figure 1, black bars and their Bayesian analyses). However, consistently across the three data sets, there was moderate (3 < BF < 1 0) or strong (BF > 10) evidence for above-chance decoding for all ERP components (N1, P1, P2a, and P2b), wavelet coefficients (Wavelet), and original magnitude data (Orig Mag), which were either targeted at specific time windows within the trial (ERPs) or could detect temporal variabilities within the trial (Wavelet and Orig Mag; see Figure 2A, black bars).

Importantly, in all three data sets, there was moderate (3 < BF < 10) or strong (BF > 10) evidence that ERP components of N1 and P2a provided higher decoding values than the mean (see Figure 2B, black boxes in Bayes matrices). There was also strong evidence (BF > 10) that the wavelet and Orig Mag features outperformed the mean feature in data sets 2 and 3 (see Figure 2B, blue boxes in Bayes matrices). This shows that simply using the earlier ERP components of N1 and P2a can provide more information than using the mean activity across the whole trial. This was predictable, as the mean across the whole trial simply ignores within-trial temporally specific information. Interestingly, even ERPs were outperformed by wavelet and Orig Mag features in data set 3 (but not the opposite across the three data sets; see Figure 2B, violet boxes in Bayes matrices). This suggests that even further targeting the most informative elements (wavelet) and/or data samples (Orig Mag) within the trial can lead to improved decoding. Note that the wavelet and Orig Mag features provided the most informative temporal patterns and samples on the dimension reduction procedure applied on their extracted features (see section 2).

Following previous observations about the advantage of delta (Watrous et al., 2015; Behroozi et al., 2016) and theta (Wang et al., 2018) frequency bands, we compared the information content in the delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), beta (12–16 Hz), gamma (16–200 Hz), and broad frequency bands. We predicted the domination of theta frequency band, following suggestions about the domination of the theta frequency band in feedforward visual processing (Bastos et al., 2015). For our top-performing ERP, wavelet, and Orig Mag features, we saw consistent domination of theta followed by the alpha frequency band (see Figure 2A). Interestingly, for the ERP components, the decoding in the theta band even outperformed the broad band (BF > 3 for P2b), which contained the whole frequency spectrum. Note that as opposed to previous suggestions (Karakaş, Erzengin, & Başar, 2000), the domination of the theta frequency band in ERP components could not be trivially predicted by their timing relative to the stimulus onset. If this was the case here, the P2b component (200–275 ms) should have elicited its maximum information in the delta (0.5–4 Hz) and theta (4–8 Hz), rather than the theta and alpha (8–12 Hz) frequency bands. For the mean feature, the delta band provided the highest information level comparable to the level of the broad band activity. This confirms that broad band whole-trial mean activity, reflects the general trend of the signal (low-frequency component).

Together, we observed that the features that are targeted at informative windows of the trial (ERP components), and those sensitive to informative temporal variabilities (wavelet and Orig Mag) could provide additional category information to the conventionally used mean of activity. We observed that the theta frequency band, which has been suggested to support feedforward information flow, is also dominant in our data sets, which are potentially dominated by feedforward processing of visual information during object perception. Next, we compare the temporal dynamics of information encoding across our features.

3.2  Do the Features Sensitive to Temporal Variabilities Evolve over Similar Time Windows to the Mean Feature?

One main insight that EEG decoding can provide is to reveal the temporal dynamics of cognitive processes. However, the mean activity, which has dominated the literature (Grootswagers et al., 2017), might hide or distort the true temporal dynamics as it ignores potentially informative temporal variabilities (codes) within the analysis window. Therefore, we systematically compared the information content of a large set of features that are sensitive to temporal variabilities using time-resolved decoding (50 ms sliding time windows in steps of 5 ms; see the rationale for choosing the 50 ms windows in Supplementary Figure 3A). By definition, we do not have the time-resolved decoding results for the ERP components here.

Before presenting the time-resolved decoding results, to validate the results and suggestions made about our whole-trial decoding (see Figure 2), we performed two complementary analyses. First, we checked to see if the advantage of the theta-to-broad-band decoding in the whole-trial analysis (see Figure 2) could generalize to time-resolved decoding: we observed the same effect in the (variability-sensitive) wavelet feature (in many time points especially for data set 2; BF > 3), but not in the (variability-insensitive) mean feature (see Supplementary Figure 3B). This could possibly be explained by the smoothing (low-pass filtering) effect of the mean feature making both theta and broad band data look like low-frequency data. Next, we used the spatiotemporal specificity of classifier weights and time-resolved decoding to see if theta band information would show a feedforward trend on the scalp to support our earlier suggestion. Visual inspection suggests information spread from the posterior to the anterior parts of the head (e.g., as in feedforward models of visual processing; (Karimi-Rouzbahani et al., 2017c; see Supplementary Figure 4), supporting the role of theta -band activity in feedforward processing. Despite these observations, we used broad band signals in the following analyses to be able to compare our results with previous studies, which generally used the broad band activity.
Figure 4:

Correlation between the decoding accuracies and behavioral reaction times for data set 2 (other data sets did not have an active object recognition or detection task). (A) Top section in each panel shows the (Spearman's) correlation coefficient obtained from correlating the decoding values and the reaction times for each feature separately. Correlation curves were obtained from the data of all participants. Bottom section shows positively or negatively significant (p<0.05; filled circles) or nonsignificant (p> 0.05; open circles) correlations as evaluated by random permutation of the variables in correlation. (B) Correlation between each of the amplitude and timing parameters of time-resolved decoding (i.e., maximum and average decoding accuracy and time of first and maximum decoding) with the average time-resolved correlations calculated from panel for the set of N=28 features. The slanted line shows the best linear fit to the distribution of the data.

Figure 4:

Correlation between the decoding accuracies and behavioral reaction times for data set 2 (other data sets did not have an active object recognition or detection task). (A) Top section in each panel shows the (Spearman's) correlation coefficient obtained from correlating the decoding values and the reaction times for each feature separately. Correlation curves were obtained from the data of all participants. Bottom section shows positively or negatively significant (p<0.05; filled circles) or nonsignificant (p> 0.05; open circles) correlations as evaluated by random permutation of the variables in correlation. (B) Correlation between each of the amplitude and timing parameters of time-resolved decoding (i.e., maximum and average decoding accuracy and time of first and maximum decoding) with the average time-resolved correlations calculated from panel for the set of N=28 features. The slanted line shows the best linear fit to the distribution of the data.

Time-resolved decoding analyses showed that for all features, including the complexity features, which were suggested to need large sample sizes (Procaccia, 1988), there was moderate (3 < BF < 10) or strong (BF > 10) evidence for above-chance decoding at some time points consistently across the three data sets (see Supplementary Figure 5A). However, all features showed distinct temporal dynamics to each other and across data sets. The dissimilarities between data sets could be driven by many data set–specific factors, including duration of image presentation (Carlson et al., 2013). However, there were also similarities between the temporal dynamics of different features. For example, the time points of first strong (BF > 10) evidence for above-chance decoding ranged from 75 ms to 195 ms (see Supplementary Figures 5A and 5E), and the decoding values reached their maxima in the range between 150 ms and 220 ms (see Supplementary Figures 5A and 5D) across features. This is consistent with many decoding studies showing the temporal dynamics of visual processing in the brain (Isik et al., 2014; Cichy et al., 2014; Karimi-Rouzbahani, Woolgar, & Rich, 2021). There was no feature that consistently preceded or followed other features to suggest the existence of very early or late neural codes (see Supplementary Figures 5D and 5E). There was more information decoded from features of mean, median, variance, and several multivalued features, especially wavelet and Orig Mag, compared to other features across the three data sets (see Supplementary Figure 5A). The mentioned features dominated other features in terms of both average and maximum decoding accuracies (see Supplementary Figures 5B and 5C). A complementary analysis suggested a potential overlap between the neural codes that different features detected (see Supplementary Figure 6).

We then directly compared the mean and the most informative variability-sensitive features (wavelet and Orig Mag). Consistently across the data sets, there was moderate (3 < BF < 10) or strong (BF > 10) evidence for higher decoding obtained by wavelet and Orig Mag compared to the mean feature on time points before 200 ms poststimulus onset (see Figure 3A). After 200 ms, this advantage sustained (data set 3), disappeared (data set 1), or turned into disadvantage (data set 2). Except for few very short continuous intervals, during which wavelet provided higher decoding values compared to Orig Mag, the two features provided almost the same results (see Figure 3, yellow dots on bottom panels). Comparing the parameters of the decoding curves, we found moderate (3 < BF < 10) or strong (BF > 10) evidence for higher maximum decoding for the wavelet and Orig Mag features than the mean feature in data sets 1 and 3 (see Figure 3B). There was also moderate (3 < BF < 10) evidence for higher maximum decoding accuracy for wavelet versus Orig Mag (see Figure 3B). There was also strong (BF > 10) evidence for higher average decoding accuracy for the wavelet and Orig Mag features over the mean feature in data set 3 (see Figure 3C). There was also moderate (3 < BF < 10) evidence for higher maximum decoding for wavelet versus Orig Mag in data sets 2 and 3. These results show that the wavelet feature provides the highest maximum (in data set 3) and average (in data sets 2 and 3) decoding accuracies among the three features, followed by the Orig Mag feature. The measures of maximum and average decoding accuracies were calculated in the poststimulus onset (0–1000 ms) for each participant separately. We also compared the timing parameters of the decoding curves (i.e., the time to the first above chance and maximum decoding relative to stimulus onset) obtained for the three features (see Figures 3D and 3E), but found insufficient evidence (0.3 < BF < 3) for their difference.

Together, these results suggest that the inclusion of temporal variabilities of activity can provide additional information about object categories to what is conventionally obtained from the mean of activity. Note that the advantage of wavelet and Orig Mag features cannot be explained by the size or dimensionality of the feature space, as the numbers of dimensions were equalized across features. Importantly, however, the decoding of information from temporal variabilities did not lead to different temporal dynamics of information decoding. This can be explained by either the common cognitive processes producing the decoded neural codes (i.e., object categorization), the overlap between the information (neural codes) detected by our features, or a combination of both.

3.3  Do the Features Sensitive to Temporal Variabilities Explain the Behavioral Recognition Performance More Accurately than the Mean Feature?

Although we observed an advantage for the features that were sensitive to temporal variability (e.g. wavelet) over other, more summarized features (e.g. mean), this can all be a by-product of more flexibility (e.g. inclusion of both temporal and spatial codes) in the former over the latter, and not read out by downstream neurons that support behavior. To validate the behavioral relevance of the detected neural codes we calculated the correlation between the decoding accuracies of features and the reaction times of participants (Vidaurre, Myers, Stokes, Nobre, & Woolrich, 2019; Ritchie et al., 2015). Participants' reaction times in object recognition have been previously shown to be predictable from decoding accuracy (Ritchie et al., 2015). We expected to observe negative correlations between the features' decoding accuracies and participants' reaction times in the poststimulus span (Ritchie et al., 2015). This suggests that greater separability between neural representations of categories might lead to categorizing them faster in behavior, supporting that the decoded neural codes might be used by neurons that drive behavior. We used only data set 2 in this analysis, as it was the only data set with an active object detection task, so relevant reaction times were available. The (Spearman's rank-order) correlations were calculated across the time course of the trials between the 10-dimensional vector of neural decoding accuracies obtained on every time point and the 10-dimensional vector of behavioral reaction times, both obtained from the group of 10 participants (Cichy et al., 2014). This resulted in a single correlation value for each time point for the whole group of participants.

All features except Katz FD showed negative trends after the stimulus onset (see Figure 4A). The correlations showed more sustained negative values for the multivalued versus single-valued features (p< 0.05). There were also larger negative peaks (generally <-0.5) for multivalued features, especially wavelet compared to other features (generally >-0.5). Specifically, while higher-order moment features (variance, skewness, and kurtosis), as well as many complexity features, showed earlier negative peaks at around 150 ms, mean, median, frequency-domain features, and multivalued features showed later negative peaks after 300 ms. Therefore, the multivalued features, especially wavelet, which were sensitive to temporal variabilities of the signals, showed the most sustained and significant correlations to behavior.

Visual inspection suggests that features that provided a higher decoding accuracy (e.g., wavelet, Figure 3) also did better at predicting behavioral performance (e.g., wavelet, Figure 4). To quantitatively see if such a relationship exists, we calculated the correlation between parameters of the decoding curves (introduced in Figures 3B to 3D) and the average correlation to behavior obtained by the same features (see Figure 4A). Specifically, we used the average decoding and maximum decoding accuracies, which we hypothesized to predict average correlation to behavior. The rationale behind this hypothesis was that more effective decoding of neural codes, as reflected in higher average decoding and maximum decoding accuracies (see Figure 3), should facilitate better prediction of behavior by detecting subtle but overlooked behavior-relevant neural codes. As a control, we also evaluated the time of first above-chance decoding and time of maximum decoding accuracies, which we hypothesized not to correlate with average correlation to behavior. Our rationale behind this prediction was that we already observed relatively similar temporal dynamics for more and less informative features of neural activity (see Figures 3D and 3E), suggesting that all those features detect some aspects of the codes produced by similar neural mechanisms.

To obtain the parameter of average correlation to behavior, we simply averaged the correlation to behavior in the poststimulus time span for each feature separately (see Figure 4A). Results showed that (see Figure 4B) while the temporal parameters of time of first above-chance decoding and time of maximum decoding (our control parameters) failed to predict the level of average correlation to behavior (r=0.24, p=0.21, and r=0.17, p=0.38, respectively), the parameters of maximum decoding and average decoding accuracies significantly (r=-0.69 and r=-0.71 respectively, with p< 0.0001; Pearson's correlation) predicted the average correlation to behavior. Note the difference between the Spearman's correlation to behavior calculated in Figure 4A and the correlations reported in Figure 4B. While the Spearman's correlation to behavior is obtained by correlating the time-resolved decoding rates and corresponding reaction times across participants, the average correlation to behavior is calculated by correlating the poststimulus average of the former correlations and their corresponding decoding parameters across features rather than participants. This result suggests that the more effective the decoding of the neural codes, the better the prediction of behavior. This is not a trivial result; higher decoding values for the more informative features do not necessarily lead to higher correlation to behavior, as “correlation” normalizes the absolute values of input variables.4

Temporal variability of neural activity has been suggested to provide an additional channel to the mean of activity for the encoding of several aspects of the input sensory information. This includes complexity (Garrett et al., 2020), uncertainty (Orbán et al., 2016), and variance (Hermundstad et al., 2014) of the input information. It is suggested that the brain optimizes the neuronal activation and variability to avoid overactivation (energy loss) for simple, familiar, and less informative categories of sensory inputs. For example, face images, which have less variable compositional features, evoked less variable responses in fMRI compared to house images, which were more varied, even in a passive viewing task (Garrett et al., 2020). This automatic and adaptive modulation of neural variability can result in more effective and accurate encoding of the sensory inputs in changing environments–for example, by suppressing uninformative neuronal activation for less varied (more familiar) stimuli such as face versus house images (Garrett et al., 2020). Despite the recent evidence about the richness of information in temporal variability, which is modulated by the category of the sensory input (Garrett et al., 2020; Orbán et al., 2016; Waschke et al., 2021), the majority of EEG studies still ignore variability in decoding. Specifically, they generally either extract variability (e.g., entropy and power) from the whole-trial activity (e.g., for brain-computer interface) or use the simple mean (average) magnitude data within subwindows of the trial (e.g., for time-resolved decoding; Grootswagers et al., 2017). The former can miss the informative within-trial variabilities and fluctuations of the trial in the highly dynamical and nonstationary evoked potentials. The latter may overlook the informative variabilities within the sliding time windows as a result of temporal averaging.

Here, we quantified the advantage of the features sensitive to temporal variabilities over the conventional mean activity. In whole-trial analysis, we observed that the features, which targeted informative subwindows and samples of the trial (e.g., ERP components, wavelet coefficients (wavelet), and original magnitude data (Orig Mag), could provide more category information than the mean feature, which ignored temporal variabilities. Interestingly, ERP components (N1, P2a, and P2b) provided comparable results to those obtained by informative samples (Orig Mag) or wavelet transformation (except for data set 3). That could be the reason for the remarkable decoding results achieved in previous studies that used ERPs (Wang et al., 2012; Qin et al., 2016) and wavelet (Taghizadeh-Sarabi et al., 2015). These results also propose that we might not need to apply complex transformations (e.g., wavelet) on the data in whole-trial analysis (Taghizadeh-Sarabi et al., 2015), as comparable results can be obtained using simple ERP components or original magnitude data. However, inclusion of more dimensions of the features in decoding or combining them (Karimi Rouzbahani & Daliri, 2011; Qin et al., 2016) could potentially provide higher decoding accuracies for multivalued (e.g., wavelet; Taghizadeh-Sarabi et al., 2015) than ERP features (i.e., we equalized the dimensions across features here).

The wavelet and Original magnitude data not only outperformed all the variability-sensitive features, but also the conventional mean feature. Importantly, while features such as Hilbert phase and amplitude, phase- and amplitude-locking, and interelectrode correlations also had access to all the samples within the sliding analysis window, they failed to provide information comparable to the Wavelet and Orig Mag features. The reason for the success of the original magnitude data seems to be that it basically makes no assumptions about the shape or pattern of the potential neural codes, as opposed to Hilbert phase (Hilb Phs), amplitude (Hilb Amp), and correlated variability (Cross Corr), each of which is sensitive to one specific aspect of neural variability (phase, amplitude, correlation). The reason for the success of the wavelet feature seems to be its reasonable balance between flexibility in detecting potential neural codes contained in the amplitude, phase, and frequency/scale and a relatively lower susceptibility to noise as a result of filtering applied on different frequency bands (Guo et al., 2009). Together, these observations support the idea that neural codes are complex structures reflected in multiple aspects of EEG data such as amplitude, phase, and frequency/scale (Panzeri et al., 2010; Waschke et al., 2021).

The advantage of theta over broad band in our data (see Supplementary Figures 1 and 3) is consistent with previous monkey studies suggesting that theta and gamma frequency bands played major roles in feedforward processing of visual information in the brain (Bastos et al., 2015), which also seemed dominant here (see Supplementary Figure 4). One potential reason for the encoding of feedforward information in the theta band can be that bottom-up sensory signals transfer information about ongoing experiences, which might need to be stored in long-term memory for future use (Zheng & Colgin, 2015). Long-term memories are suggested to be encoded by enhanced long-lasting synaptic connections. The optimal patterns of activity that can cause such changes in synaptic weights were suggested to be successive theta cycles that carry contents in fast gamma rhythms (100 Hz; Larson, Wong, & Lynch, 1986). While direct correspondence between invasive versus noninvasive neural data remains unclear (Ng, Logothetis, & Kayser, 2013), this study provides additional evidence for the major role of the theta frequency band in human visual perception (Wang et al., 2012; Qin et al., 2016; Jadidi et al., 2016; Taghizadeh-Sarabi et al., 2015; Torabi et al., 2017). It also suggests that the BCI community might benefit from concentrating on specific frequency bands relevant to the cognitive or sensory processing undergoing in the brain, that is, investigating the theta band when stimulating the visual system.

One critical question for cognitive neuroscience has been whether (if at all) neuroimaging data can explain behavior (Williams et al., 2007; Ritchie et al., 2015; Woolgar et al., 2019; Karimi-Rouzbahani et al., 2019; Karimi-Rouzbahani, Ramezani, Woolgar, Rich, & Ghodrati, 2021). We extended this question by asking whether more optimal decoding of object category information can lead to better prediction of behavioral performance. We showed in data set 2 that this can be the case. Critically, here we observed for the same data set that there seems to be a linear relationship between the obtainable decoding accuracy and the explanatory power of the features. It implies that in order to bring neuroimaging observations closer to behavior, we might need to work on how we can read out the neural codes more effectively.

It has been suggested that neural variability is modulated not only by sensory information (as focused on here) but also by other top-down cognitive processes such as attention, expectation, memory, and task demands (Waschke et al. 2021). For example, attention decreased low-frequency neural variabilities/power (2–10 Hz, referred to as “desynchronization”) while increasing high-frequency neural variabilities/power (Wyart & Tallon-Baudry, 2009). Therefore, in the future, it will be interesting to know which features best detect the modulation of neural variability in other cognitive tasks. Moreover, it is interesting to know how (if at all) a combination of the features used in this study could provide any additional information about object categories or behavior. In other words, although all of the individual features evaluated here covered some variance of category object information, to detect the neural information more effectively, it might be helpful to combine multiple features using supervised and unsupervised methods (Karimi Rouzbahani & Daliri, 2011; Qin et al., 2016).

The cross-data set, large-scale analysis methods implemented in this study align with the growing trend toward meta-analysis in cognitive neuroscience. Recent studies have also adopted and compared several data sets to facilitate forming more rigorous conclusions about how the brain performs different cognitive processes such as sustained attention (Langner & Eickhoff, 2013) or working memory (Adam, Vogel, & Awh, 2020). Our results provide evidence supporting the idea that neural variability seems to be an additional channel for information encoding in EEG, which should not be simply ignored.

H.K.-R. was funded by the Royal Society's Newton International Fellowship (SUAI/059/G101116) and MRC Cognition and Brain Sciences Unit.

Abootalebi
,
V.
,
Moradi
,
M. H.
, &
Khalilzadeh
,
M. A.
(
2009
).
A new approach for EEG feature extraction in P300-based lie detection.
Computer Methods and Programs in Biomedicine
,
94
(
1
),
48
57
.
[PubMed]
Aboy
,
M.
,
Hornero
,
R.
,
Abásolo
,
D.
, &
Álvarez
,
D.
(
2006
).
Interpretation of the Lempel-Ziv complexity measure in the context of biomedical signal analysis.
IEEE Transactions on Biomedical Engineering
,
53
(
11
),
2282
2288
.
[PubMed]
Adam
,
K. C.
,
Vogel
,
E. K.
, &
Awh
,
E.
(
2020
).
Multivariate analysis reveals a generalizable human electrophysiological signature of working memory load.
Psychophysiology
,
57
(
12
), e13691.
Ahmadi-
Pajouh
,
M. A.
,
Ala
,
T. S.
,
Zamanian
,
F.
,
Namazi
,
H.
, &
Jafari
,
S.
(
2018
).
Fractal-based classification of human brain response to living and non-living visual stimuli.
Fractals
,
26
(
5
), 1850069.
Alimardani
,
F.
,
Cho
,
J. H.
,
Boostani
,
R.
, &
Hwang
,
H. J.
(
2018
).
Classification of bipolar ?1217 disorder and schizophrenia using steady-state visual evoked potential based features
.
IEEE Access
,
6
,
40379
40388
.
Bastos
,
A. M.
, &
Schoffelen
,
J. M.
(
2016
).
A tutorial review of functional connectivity analysis methods and their interpretational pitfalls.
Frontiers in Systems Neuroscience
,
9
, 175.
[PubMed]
Bastos
,
A. M.
,
Vezoli
,
J.
,
Bosman
,
C. A.
,
Schoffelen
,
J. M.
,
Oostenveld
,
R.
,
Dowdall
,
J. R.
, …
Fries
,
P.
(
2015
).
Visual areas exert feedforward and feedback influences through distinct frequency channels.
Neuron
,
85
(
2
),
390
401
.
[PubMed]
Behroozi
,
M.
,
Daliri
,
M. R.
, &
Shekarchi
,
B.
(
2016
).
EEG phase patterns reflect the representation of semantic categories of objects.
Medical and Biological Engineering and Computing
,
54
(
1
),
205
221
.
[PubMed]
Bizas
,
E.
,
Simos
,
P. G.
,
Stam
,
C. J.
,
Arvanitis
,
S.
,
Terzakis
,
D.
, &
Micheloyannis
,
S.
(
1999
).
EEG correlates of cerebral engagement in reading tasks.
Brain Topography
,
12
(
2
),
99
105
.
[PubMed]
Bruns
,
A.
,
Eckhorn
,
R.
,
Jokeit
,
H.
, &
Ebner
,
A.
(
2000
).
Amplitude envelope correlation detects coupling among incoherent brain signals.
Neuroreport
,
11
(
7
),
1509
1514
.
[PubMed]
Carlson
,
T.
,
Tovar
,
D. A.
,
Alink
,
A.
, &
Kriegeskorte
,
N.
(
2013
).
Representational dynamics of object vision: The first 1000 ms.
Journal of Vision
,
13
(
10
),
1
1
.
[PubMed]
Chan
,
A. M.
,
Halgren
,
E.
,
Marinkovic
,
K.
, &
Cash
,
S. S.
(
2011
).
Decoding word and category-specific spatiotemporal representations from MEG and EEG.
NeuroImage
,
54
(
4
),
3028
3039
.
[PubMed]
Cichy
,
R. M.
,
Pantazis
,
D.
, &
Oliva
,
A.
(
2014
).
Resolving human object recognition in space and time.
Nature Neuroscience
,
17
(
3
), 455.
[PubMed]
Contini
,
E. W.
,
Wardle
,
S. G.
, &
Carlson
,
T. A.
(
2017
).
Decoding the time-course of object recognition in the human brain: From visual features to categorical decisions.
Neuropsychologia
,
105
,
165
176
.
[PubMed]
Dienes
,
Z.
(
2014
).
Using Bayes to get the most out of non-significant results.
Frontiers in Psychology
,
5
, 781.
[PubMed]
Hart
,
P. E.
,
Stork
,
D. G.
, &
Duda
,
R. O.
(
2000
).
Pattern classification.
Hoboken, NJ
:
Wiley
.
Eckhorn
,
R.
,
Bauer
,
R.
,
Jordan
,
W.
,
Brosch
,
M.
,
Kruse
,
W.
,
Munk
,
M.
, &
Reitboeck
,
H. J.
(
1988
).
Coherent oscillations: A mechanism of feature linking in the visual cortex?
Biological Cybernetics
,
60
(
2
),
121
130
.
[PubMed]
Engel
,
A. K.
,
Gerloff
,
C.
,
Hilgetag
,
C. C.
, &
Nolte
,
G.
(
2013
).
Intrinsic coupling modes: Multiscale interactions in ongoing brain activity.
Neuron
,
80
(
4
),
867
886
.
[PubMed]
Fulcher
,
B. D.
, &
Jones
,
N. S.
(
2017
).
HCTSA: A computational framework for automated time-series phenotyping using massive feature extraction.
Cell Systems
,
1
(
5
),
527
531
.
Gabor
,
D.
(
1946
).
Theory of communication. Part 1: The analysis of information.
Journal of the Institution of Electrical Engineers–Part III: Radio and Communication Engineering
,
93
(
26
),
429
441
.
Garrett
,
D. D.
,
Epp
,
S. M.
,
Kleemeyer
,
M.
,
Lindenberger
,
U.
, &
Polk
,
T. A.
(
2020
).
Higher performers upregulate brain signal variability in response to more feature-rich visual input.
NeuroImage
,
217
, 116836.
Gawne
,
T. J.
,
Kjaer
,
T. W.
, &
Richmond
,
B. J.
(
1996
).
Latency: Another potential code for feature binding in striate cortex.
Journal of Neurophysiology
,
76
(
2
),
1356
1360
.
[PubMed]
Gelman
,
A.
,
Hill
,
J.
, &
Yajima
,
M.
(
2012
).
Why we (usually) don't have to worry about multiple comparisons.
Journal of Research on Educational Effectiveness
,
5
(
2
),
189
211
.
Gelman
,
A.
, &
Tuerlinckx
,
F.
(
2000
).
Type S error rates for classical and Bayesian single and multiple comparison procedures.
Computational Statistics
,
15
(
3
),
373
390
.
Grootswagers
,
T.
,
Cichy
,
R. M.
, &
Carlson
,
T. A.
(
2018
).
Finding decodable information that can be read out in behaviour.
NeuroImage
,
179
,
252
262
.
Grootswagers
,
T.
,
Robinson
,
A. K.
, &
Carlson
,
T. A.
(
2019
).
The representational dynamics of visual objects in rapid serial visual processing streams.
NeuroImage
,
188
,
668
679
.
[PubMed]
Grootswagers
,
T.
,
Wardle
,
S. G.
, &
Carlson
,
T. A.
(
2017
).
Decoding dynamic brain patterns from evoked responses: A tutorial on multivariate pattern analysis applied to time series neuroimaging data.
Journal of Cognitive Neuroscience
,
29
(
4
),
677
697
.
[PubMed]
Guo
,
L.
,
Rivero
,
D.
,
Seoane
,
J. A.
, &
Pazos
,
A.
(
2009
).
Classification of EEG signals using relative wavelet energy and artificial neural networks.
In Proceedings of the First ACM/SIGEVO Summit on Genetic and Evolutionary Computation (pp.
177
184
).
New York
:
ACM
.
Hacker
,
C. D.
,
Snyder
,
A. Z.
,
Pahwa
,
M.
,
Corbetta
,
M.
, &
Leuthardt
,
E. C.
(
2017
).
Frequency-specific electrophysiologic correlates of resting state fMRI networks.
NeuroImage
,
149
,
446
457
.
[PubMed]
Hart
,
P. E.
,
Stork
,
D. G.
, &
Duda
,
R. O.
(
2000
).
Pattern classification
.
Hoboken
:
Wiley
.
Hatamimajoumerd
,
E.
, &
Talebpour
,
A.
(
2019
).
A temporal neural trace of wavelet coefficients in human object vision: An MEG study.
Frontiers in Neural Circuits
,
13
, 20.
[PubMed]
Hatamimajoumerd
,
E.
,
Talebpour
,
A.
, &
Mohsenzadeh
,
Y.
(
2020
).
Enhancing multivariate pattern analysis for magnetoencephalography through relevant sensor selection.
International Journal of Imaging Systems and Technology
,
30
(
2
),
473
494
.
Haxby
,
J. V.
,
Gobbini
,
M. I.
,
Furey
,
M. L.
,
Ishai
,
A.
,
Schouten
,
J. L.
, &
Pietrini
,
P.
(
2001
).
Distributed and overlapping representations of faces and objects in ventral temporal cortex.
Science
,
293
(
5539
),
2425
2430
.
[PubMed]
Haynes
,
J. D.
, &
Rees
,
G.
(
2006
).
Decoding mental states from brain activity in humans.
Nature Reviews Neuroscience
,
7
(
7
),
523
534
.
[PubMed]
Hebart
,
M. N.
, &
Baker
,
C. I.
(
2018
).
Deconstructing multivariate decoding for the study of brain function.
NeuroImage
,
180
,
4
18
.
[PubMed]
Hermundstad
,
A. M.
,
Briguglio
,
J. J.
,
Conte
,
M. M.
,
Victor
,
J. D.
,
Balasubramanian
,
V.
, &
Tkačik
,
G.
(
2014
).
Variance predicts salience in central sensory processing.
eLife
,
3
, e03722.
[PubMed]
Higuchi
,
T.
(
1988
).
Approach to an irregular time series on the basis of the fractal theory.
Physica D: Nonlinear Phenomena
,
31
(
2
),
277
283
.
Hjorth
,
B.
(
1970
).
EEG analysis based on time domain properties.
Electroencephalography and Clinical Neurophysiology
,
29
(
3
),
306
310
.
[PubMed]
Hung
,
C. P.
,
Kreiman
,
G.
,
Poggio
,
T.
, &
DiCarlo
,
J. J.
(
2005
).
Fast readout of object identity from macaque inferior temporal cortex.
Science
,
310
(
5749
),
863
866
.
[PubMed]
Intriligator
,
J.
, &
Polich
,
J.
(
1995
).
On the relationship between EEG and ERP variability.
International Journal of Psychophysiology
,
20
(
1
),
59
74
.
[PubMed]
Iranmanesh
,
S.
, & Rodriguez-
Villegas
,
E.
(
2017
).
An ultralow–power sleep spindle detection system on chip.
IEEE Transactions on Biomedical Circuits and Systems
,
11
(
4
),
858
866
.
[PubMed]
Isik
,
L.
,
Meyers
,
E. M.
,
Leibo
,
J. Z.
, &
Poggio
,
T.
(
2014
).
The dynamics of invariant object recognition in the human visual system.
Journal of Neurophysiology
,
111
(
1
),
91
102
.
Jadidi
,
A. F.
,
Zargar
,
B. S.
, &
Moradi
,
M. H.
(
2016, November
).
Categorizing visual objects; Using ERP components.
In
Proceedings of the 2016 23rd Iranian Conference on Biomedical Engineering and 2016 1st International Iranian Conference on Biomedical Engineering
(pp.
159
164
).
Piscataway, NJ
:
IEEE
.
Jeffreys
,
H.
(
1998
).
The theory of probability
.
Oxford: Oxford University Press
.
Joshi
,
D.
,
Panigrahi
,
B. K.
,
Anand
,
S.
, &
Santhosh
,
J.
(
2018
).
Classification of targets and distractors present in visual hemifields using time-frequency domain EEG features.
Journal of Healthcare Engineering
,
2018
.
[PubMed]
Kaneshiro
,
B.
,
Guimaraes
,
M. P.
,
Kim
,
H. S.
,
Norcia
,
A. M.
, &
Suppes
,
P.
(
2015
).
A representational similarity analysis of the dynamics of object processing using single-trial EEG classification.
PLOS One
,
10
(
8
), e0135697.
[PubMed]
Karakaş
,
S.
,
Erzengin
,
Ö. U.
, &
Başar
,
E.
(
2000
).
The genesis of human event-related responses explained through the theory of oscillatory neural assemblies.
Neuroscience Letters
,
285
(
1
),
45
48
.
[PubMed]
Karimi-
Rouzbahani
,
H.
(
2018
).
Three-stage processing of category and variation information by entangled interactive mechanisms of peri-occipital and peri-frontal cortices.
Scientific Reports
,
8
(
1
),
1
22
.
Karimi-
Rouzbahani
,
H.
,
Bagheri
,
N.
, &
Ebrahimpour
,
R.
(
2017a
).
Average activity, but not variability, is the dominant factor in the representation of object categories in the brain.
Neuroscience
,
346
,
14
28
.
[PubMed]
Karimi-
Rouzbahani
,
H.
,
Bagheri
,
N.
, &
Ebrahimpour
,
R.
(
2017b
).
Hard-wired feed-forward visual mechanisms of the brain compensate for affine variations in object recognition.
Neuroscience
,
349
,
48
63
.
[PubMed]
Karimi-
Rouzbahani
,
H.
,
Bagheri
,
N.
, &
Ebrahimpour
,
R.
(
2017c
).
Invariant object recognition is a personalized selection of invariant features in humans, not simply explained by hierarchical feed-forward vision models.
Scientific Reports
,
7
(
1
),
1
24
.
Karimi
Rouzbahani
,
H.
, &
Daliri
,
M. R.
(
2011
).
Diagnosis of Parkinson's disease in human using voice signals.
Basic and Clinical Neuroscience
,
2
(
3
),
12
20
.
Karimi-
Rouzbahani
,
H.
,
Ramezani
,
F.
,
Woolgar
,
A.
,
Rich
,
A.
, &
Ghodrati
,
M.
(
2021
).
Perceptual difficulty modulates the direction of information flow in familiar face recognition.
NeuroImage
,
233
, 117896.
Karimi-
Rouzbahani
,
H.
,
Vahab
,
E.
,
Ebrahimpour
,
R.
, &
Menhaj
,
M. B.
(
2019
).
Spatiotemporal analysis of category and target-related information processing in the brain during object detection.
Behavioural Brain Research
,
362
,
224
239
.
Karimi-
Rouzbahani
,
H.
,
Woolgar
,
A.
, &
Rich
,
A. N.
(
2021
).
Neural signatures of vigilance decrements predict behavioural errors before they occur.
eLife
,
10
, e60563.
[PubMed]
Katz
,
M. J.
(
1988
).
Fractals and the analysis of waveforms.
Computers in Biology and Medicine
,
18
(
3
),
145
156
.
[PubMed]
Kayser
,
C.
,
Montemurro
,
M. A.
,
Logothetis
,
N. K.
, &
Panzeri
,
S.
(
2009
).
Spike-phase coding boosts and stabilizes information carried by spatial and temporal spike patterns.
Neuron
,
61
(
4
),
597
608
.
[PubMed]
Kiani
,
R.
,
Esteky
,
H.
,
Mirpour
,
K.
, &
Tanaka
,
K.
(
2007
).
Object category structure in response patterns of neuronal population in monkey inferior temporal cortex.
Journal of Neurophysiology
,
97
(
6
),
4296
4309
.
[PubMed]
Kosciessa
,
J. Q.
,
Lindenberger
,
U.
, &
Garrett
,
D. D.
(
2021
).
Thalamocortical excitability modulation guides human perception under uncertainty.
Nature Communications
,
12
(
1
),
1
15
.
Kriegeskorte
,
N.
,
Mur
,
M.
,
Ruff
,
D. A.
,
Kiani
,
R.
,
Bodurka
,
J.
,
Esteky
,
H.
, …
Bandettini
,
P. A.
(
2008
).
Matching categorical object representations in inferior temporal cortex of man and monkey.
Neuron
,
60
(
6
),
1126
1141
.
[PubMed]
Langner
,
R.
, &
Eickhoff
,
S. B.
(
2013
).
Sustaining attention to simple tasks: A meta-analytic review of the neural mechanisms of vigilant attention.
Psychological Bulletin
,
139
(
4
), 870.
[PubMed]
Larson
,
J.
,
Wong
,
D.
, &
Lynch
,
G.
(
1986
).
Patterned stimulation at the theta frequency is optimal for the induction of hippocampal long-term potentiation.
Brain Research
,
368
(
2
),
347
350
.
[PubMed]
Le Van Quyen
,
M.
,
Foucher
,
J.
,
Lachaux
,
J. P.
,
Rodriguez
,
E.
,
Lutz
,
A.
,
Martinerie
,
J.
, &
Varela
,
F. J.
(
2001
).
Comparison of Hilbert transform and wavelet methods for the analysis of neuronal synchrony.
Journal of Neuroscience Methods
,
111
(
2
),
83
98
.
Lee
,
M. D.
, &
Wagenmakers
,
E. J.
(
2005
).
Bayesian statistical inference in psychology: Comment on Trafimow (2003).
Psychological Review
,
112
(
3
),
662
668
.
[PubMed]
Lempel
,
A.
, &
Ziv
,
J.
(
1976
).
On the complexity of finite sequences.
IEEE Transactions on Information Theory
,
22
(
1
),
75
81
.
Liu
,
H.
,
Agam
,
Y.
,
Madsen
,
J. R.
, &
Kreiman
,
G.
(
2009
).
Timing, timing, timing: Fast decoding of object information from intracranial field potentials in human visual cortex.
Neuron
,
62
(
2
),
281
290
.
[PubMed]
Majima
,
K.
,
Matsuo
,
T.
,
Kawasaki
,
K.
,
Kawai
,
K.
,
Saito
,
N.
,
Hasegawa
,
I.
, &
Kamitani
,
Y.
(
2014
).
Decoding visual object categories from temporal correlations of ECoG signals.
NeuroImage
,
90
,
74
83
.
Mazaheri
,
A.
, &
Jensen
,
O.
(
2008
).
Asymmetric amplitude modulations of brain oscillations generate slow evoked responses.
Journal of Neuroscience
,
28
(
31
),
7781
7787
.
[PubMed]
Miyakawa
,
N.
,
Majima
,
K.
,
Sawahata
,
H.
,
Kawasaki
,
K.
,
Matsuo
,
T.
,
Kotake
,
N.
, …
Hasegawa
,
I.
(
2018
).
Heterogeneous redistribution of facial subcategory information within and outside the face-selective domain in primate inferior temporal cortex.
Cerebral Cortex
,
28
(
4
),
1416
1431
.
[PubMed]
Montemurro
,
M. A.
,
Rasch
,
M. J.
,
Murayama
,
Y.
,
Logothetis
,
N. K.
, &
Panzeri
,
S.
(
2008
).
Phase-of-firing coding of natural visual stimuli in primary visual cortex.
Current Biology
,
18
(
5
),
375
380
.
[PubMed]
Mostame
,
P.
, &
Sadaghiani
,
S.
(
2020
).
Phase- and amplitude-coupling are tied by an intrinsic spatial organization but show divergent stimulus-related changes.
NeuroImage
,
219
, 117051.
[PubMed]
Namazi
,
H.
,
Ala
,
T. S.
, &
Bakardjian
,
H.
(
2018
).
Decoding of steady-state visual evoked potentials by fractal analysis of the electroencephalographic (EEG) signal.
Fractals
,
26
(
6
), 1850092.
Ng
,
B. S. W.
,
Logothetis
,
N. K.
, &
Kayser
,
C.
(
2013
).
EEG phase patterns reflect the selectivity of neural firing.
Cerebral Cortex
,
23
(
2
),
389
398
.
Orbán
,
G.
,
Berkes
,
P.
,
Fiser
,
J.
, &
Lengyel
,
M.
(
2016
).
Neural variability and sampling-based probabilistic representations in the visual cortex.
Neuron
,
92
(
2
),
530
543
.
[PubMed]
Panzeri
,
S.
,
Brunel
,
N.
,
Logothetis
,
N. K.
, &
Kayser
,
C.
(
2010
).
Sensory neural codes using multiplexed temporal scales.
Trends in Neurosciences
,
33
(
3
),
111
120
.
[PubMed]
Pincus
,
S. M.
, &
Huang
,
W. M.
(
1992
).
Approximate entropy: Statistical properties and applications.
Communications in Statistics–Theory and Methods
,
21
(
11
),
3061
3077
.
Pouryazdian
,
S.
, &
Erfanian
,
A.
(
2009
).
Detection of steady-state visual evoked potentials for brain-computer interfaces using PCA and high-order statistics.
In World Congress on Medical Physics and Biomedical Engineering
(pp.
480
483
).
Berlin
:
Springer
.
Preissel
,
H.
,
Lutzenberger
,
W.
, Pulvermü
ller
,
F.
, &
Birbaumer
,
N.
(
1997
).
Fractal dimensions of short EEG time series in humans.
Neuroscience Letters
,
225
(
2
),
77
80
.
Procaccia
,
I.
(
1988
).
Universal properties of dynamically complex systems: The organization of chaos.
Nature
,
333
(
6174
),
618
623
.
Pulini
,
A. A.
,
Kerr
,
W. T.
,
Loo
,
S. K.
, &
Lenartowicz
,
A.
(
2019
).
Classification accuracy of neuroimaging biomarkers in attention-deficit/hyperactivity disorder: Effects of sample size and circular analysis.
Biological Psychiatry: Cognitive Neuroscience and Neuroimaging
,
4
(
2
),
108
120
.
[PubMed]
Qin
,
Y.
,
Zhan
,
Y.
,
Wang
,
C.
,
Zhang
,
J.
,
Yao
,
L.
,
Guo
,
X.
, …
Hu
,
B.
(
2016
).
Classifying four-category visual objects using multiple ERP components in single-trial ERP.
Cognitive Neurodynamics
,
10
(
4
),
275
285
.
[PubMed]
Racine
,
R.
(
2011
).
Estimating the Hurst exponent
.
Zurich
:
Mosaic Group
.
Rasoulzadeh
,
V. E. S. A. L.
,
Erkus
,
E. C.
,
Yogurt
,
T. A.
,
Ulusoy
,
I.
, &
Zergeroğlu
,
S. A.
(
2017
).
A comparative stationarity analysis of EEG signals.
Annals of Operations Research
,
258
(
1
),
133
157
.
Richman
,
J. S.
, &
Moorman
,
J. R.
(
2000
).
Physiological time-series analysis using approximate entropy and sample entropy.
American Journal of Physiology–Heart and Circulatory Physiology
,
278
(
6
),
H2039
2049
.
[PubMed]
Ritchie
,
J. B.
,
Tovar
,
D. A.
, &
Carlson
,
T. A.
(
2015
).
Emerging object representations in the visual system predict reaction times for categorization.
PLOS Computational Biology
,
11
(
6
), e1004316.
Robinson
,
A. K.
,
Grootswagers
,
T.
, &
Carlson
,
T. A.
(
2019
).
The influence of image masking on object representations during rapid serial visual presentation.
NeuroImage
,
197
,
224
231
.
Rossion
,
B.
,
Gauthier
,
I.
,
Tarr
,
M. J.
,
Despland
,
P.
,
Bruyer
,
R.
,
Linotte
,
S.
, &
Crommelinck
,
M.
(
2000
).
The N170 occipito-temporal component is delayed and enhanced to inverted faces but not to inverted objects: An electrophysiological account of face-specific processes in the human brain.
Neuroreport
,
11
(
1
),
69
72
.
[PubMed]
Rouder
,
J. N.
,
Morey
,
R. D.
,
Speckman
,
P. L.
, &
Province
,
J. M.
(
2012
).
Default Bayes factors for ANOVA designs.
Journal of Mathematical Psychology
,
56
(
5
),
356
374
.
Rousselet
,
G. A.
,
Husk
,
J. S.
,
Bennett
,
P. J.
, &
Sekuler
,
A. B.
(
2007
).
Single-trial EEG dynamics of object and face visual processing.
NeuroImage
,
36
(
3
),
843
862
.
[PubMed]
Rupp
,
K.
,
Roos
,
M.
,
Milsap
,
G.
,
Caceres
,
C.
,
Ratto
,
C.
,
Chevillet
,
M.
, …
Wolmetz
,
M.
(
2017
).
Semantic attributes are encoded in human electrocorticographic signals during visual object recognition.
NeuroImage
,
148
,
318
329
.
[PubMed]
Sammer
,
G.
(
1999
).
Working memory load and EEG-dynamics as revealed by point correlation dimension analysis.
International Journal of Psychophysiology
,
34
(
1
),
89
102
.
[PubMed]
Shourie
,
N.
,
Firoozabadi
,
M.
, &
Badie
,
K.
(
2014
).
Analysis of EEG signals related to artists and nonartists during visual perception, mental imagery, and rest using approximate entropy.
BioMed Research International
,
2014
, 764382.
Siegel
,
M.
,
Donner
,
T. H.
, &
Engel
,
A. K.
(
2012
).
Spectral fingerprints of large-scale neuronal interactions.
Nature Reviews Neuroscience
,
13
(
2
),
121
134
.
Siems
,
M.
, &
Siegel
,
M.
(
2020
).
Dissociated neuronal phase- and amplitude-coupling patterns in the human brain.
NeuroImage
,
209
, 116538.
[PubMed]
Simanova
,
I.
, Van
Gerven
,
M.
,
Oostenveld
,
R.
, &
Hagoort
,
P.
(
2010
).
Identifying object categories from event-related EEG: toward decoding of conceptual representations.
PLOS One
,
5
(
12
), e14465.
Stam
,
C. J.
(
2000
).
Brain dynamics in theta and alpha frequency bands and working memory performance in humans.
Neuroscience Letters
,
286
(
2
),
115
118
.
[PubMed]
Stam
,
C. J.
(
2005
).
Nonlinear dynamical analysis of EEG and MEG: Review of an emerging field.
Clinical Neurophysiology
,
116
(
10
),
2266
2301
.
[PubMed]
Stêpieñ
,
R. A.
(
2002
).
Testing for non-linearity in EEG signal of healthy subjects.
Acta neurobiologiae experimentalis
,
62
(
4
),
277
282
.
[PubMed]
Stewart
,
A. X.
,
Nuthmann
,
A.
, &
Sanguinetti
,
G.
(
2014
).
Single-trial classification of EEG in a visual object task using ICA and machine learning.
Journal of Neuroscience Methods
,
228
,
1
14
.
[PubMed]
Storey
,
J. D.
(
2002
).
A direct approach to false discovery rates.
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
,
64
(
3
),
479
498
.
Subha
,
D. P.
,
Joseph
,
P. K.
,
Acharya
,
R.
, &
Lim
,
C. M.
(
2010
).
EEG signal analysis: A survey.
Journal of Medical Systems
,
34
(
2
),
195
212
.
[PubMed]
Szczepański
,
J.
,
Amigó
,
J. M.
,
Wajnryb
,
E.
, & Sanchez-
Vives
,
M. V.
(
2003
).
Application of Lempel–Ziv complexity to the analysis of neural discharges.
Network: Computation in Neural Systems
,
14
(
2
),
335
350
.
[PubMed]
Tafreshi
,
T. F.
,
Daliri
,
M. R.
, &
Ghodousi
,
M.
(
2019
).
Functional and effective connectivity based features of EEG signals for object recognition.
Cognitive Neurodynamics
,
13
(
6
),
555
566
.
[PubMed]
Taghizadeh-
Sarabi
,
M.
,
Daliri
,
M. R.
, &
Niksirat
,
K. S.
(
2015
).
Decoding objects of basic categories from electroencephalographic signals using wavelet transform and support vector machines.
Brain Topography
,
28
(
1
),
33
46
.
Tononi
,
G.
, &
Edelman
,
G. M.
(
1998
).
Consciousness and complexity.
Science
,
282
(
5395
),
1846
1851
.
[PubMed]
Torabi
,
A.
,
Jahromy
,
F. Z.
and
Daliri
,
M. R.
,
2017
.
Semantic category-based classification using nonlinear features and wavelet coefficients of brain signals.
Cognitive Computation
,
9
(
5
),
702
711
.
Victor
,
J. D.
(
2000
).
How the brain uses time to represent and process visual information.
Brain Research
,
886
(
1–2
),
33
46
.
Vidal
,
J. R.
,
Ossandón
,
T.
,
Jerbi
,
K.
,
Dalal
,
S. S.
,
Minotti
,
L.
,
Ryvlin
,
P.
, …
Lachaux
,
J. P.
(
2010
).
Category-specific visual responses: An intracranial study comparing gamma, beta, alpha, and ERP response selectivity.
Frontiers in Human Neuroscience
,
4
, 195.
[PubMed]
Vidaurre
,
D.
,
Myers
,
N. E.
,
Stokes
,
M.
,
Nobre
,
A. C.
, &
Woolrich
,
M. W.
(
2019
).
Temporally unconstrained decoding reveals consistent but time-varying stages of stimulus processing.
Cerebral Cortex
,
29
(
2
),
863
874
.
[PubMed]
Voloh
,
B.
,
Oemisch
,
M.
, &
Womelsdorf
,
T.
(
2020
).
Phase of firing coding of learning variables across the fronto-striatal network during feature-based learning.
Nature Communications
,
11
(
1
),
1
16
.
[PubMed]
Wairagkar
,
M.
,
Zoulias
,
I.
,
Oguntosin
,
V.
,
Hayashi
,
Y.
, &
Nasuto
,
S.
(
2016, June
).
Movement intention based brain computer interface for virtual reality and soft robotics rehabilitation using novel autocorrelation analysis of EEG.
In
Proceedings of the 6th IEEE International Conference on Biomedical Robotics and Biomechatronics
(pp.
685
685
).
Piscataway, NJ
:
IEEE
.
Wang
,
C.
,
Xiong
,
S.
,
Hu
,
X.
,
Yao
,
L.
, &
Zhang
,
J.
(
2012
).
Combining features from ERP components in single-trial EEG for discriminating four-category visual objects.
Journal of Neural Engineering
,
9
(
5
), 056013.
[PubMed]
Wang
,
Y.
,
Wang
,
P.
and
Yu
,
Y.
,
2018
.
Decoding English alphabet letters using EEG phase information.
Frontiers in Neuroscience
,
12
, 62.
[PubMed]
Wark
,
B.
,
Fairhall
,
A.
, &
Rieke
,
F.
(
2009
).
Timescales of inference in visual adaptation.
Neuron
,
61
(
5
),
750
761
.
[PubMed]
Waschke
,
L.
,
Kloosterman
,
N. A.
,
Obleser
,
J.
, &
Garrett
,
D. D.
(
2021
).
Behavior needs neural variability.
Neuron
,
109
,
751
766
.
[PubMed]
Waschke
,
L.
,
Tune
,
S.
, &
Obleser
,
J.
(
2019
).
Local cortical desynchronization and pupil-linked arousal differentially shape brain states for optimal sensory performance.
eLife
,
8
, e51501.
[PubMed]
Watrous
,
A. J.
,
Deuker
,
L.
,
Fell
,
J.
, &
Axmacher
,
N.
(
2015
).
Phase-amplitude coupling supports phase coding in human ECoG.
eLife
,
4
, e07886.
Williams
,
M. A.
,
Dang
,
S.
, &
Kanwisher
,
N. G.
(
2007
).
Only some spatial patterns of fMRI response are read out in task performance.
Nature Neuroscience
,
10
(
6
),
685
686
.
[PubMed]
Wong
,
K. F. K.
,
Galka
,
A.
,
Yamashita
,
O.
, &
Ozaki
,
T.
(
2006
).
Modelling non-stationary variance in EEG time series by state space GARCH model.
Computers in Biology and Medicine
,
36
(
12
),
1327
1335
.
[PubMed]
Woolgar
,
A.
,
Dermody
,
N.
,
Afshar
,
S.
,
Williams
,
M. A.
, &
Rich
,
A. N.
(
2019
).
Meaningful patterns of information in the brain revealed through analysis of errors
. bioRxiv:673681.
Wyart
,
V.
, &
Tallon-Baudry
,
C.
(
2009
).
How ongoing fluctuations in human visual cortex predict perceptual awareness: Baseline shift versus decision bias.
Journal of Neuroscience
,
29
(
27
),
8715
8725
.
[PubMed]
Zellner
,
A.
, &
Siow
,
A.
(
1980
).
Posterior odds ratios for selected regression hypotheses.
Trabajos de estadística y de investigación operativa
,
31
(
1
),
585
603
.
Zheng
,
C.
, &
Colgin
,
L. L.
(
2015
).
Beta and gamma rhythms go with the flow.
Neuron
,
85
(
2
),
236
237
.
[PubMed]
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.

Supplementary data