Visual object perception involves neural processes that unfold over time and recruit multiple regions of the brain. Here, we use high-density EEG to investigate the spatio-temporal representations of object categories across the dorsal and ventral pathways. In Experiment 1, human participants were presented with images from two animate object categories (birds and insects) and two inanimate categories (tools and graspable objects). In Experiment 2, participants viewed images of tools and graspable objects from a different stimulus set, one in which a shape confound that often exists between these categories (elongation) was controlled for. To explore the temporal dynamics of object representations, we employed time-resolved multivariate pattern analysis on the EEG time series data. This was performed at the electrode level as well as in source space of two regions of interest: one encompassing the ventral pathway and another encompassing the dorsal pathway. Our results demonstrate that shape, exemplar, and category information can be decoded from the EEG signal. Multivariate pattern analysis within source space revealed that both dorsal and ventral pathways contain information pertaining to shape, inanimate object categories, and animate object categories. Of particular interest, we note striking similarities between the results obtained in the ventral stream and dorsal stream regions of interest. These findings provide insight into the spatio-temporal dynamics of object representation and contribute to a growing literature that has begun to redefine the traditional role of the dorsal pathway.
In this article, we describe the results of two experiments in which we applied high-density EEG (HD-EEG) to explore the spatio-temporal representations of visual objects in the human brain. The successful perception of visual objects poses serious computational challenges given that the signals traversing the human retina are ambiguous, noisy, and in constant flux. Despite these obstacles, visual perception is a rapid and efficient process. For example, discernment of different object categories can happen in as little as 120 msec (Kirchner & Thorpe, 2006), whereas neural activity corresponding to some object categories can be detected after only 80 msec (Crouzet, Kirchner, & Thorpe, 2010). The neural activity associated with object perception is not static but rather evolves over time (Contini, Wardle, & Carlson, 2017) and has been studied in terms of various event-related components (Bentin et al., 1996) that can be measured with magnetoencephalography/electroencephalography (M/EEG) techniques.
The cascade of neural activity evoked by object perception is thought to progress across multiple visual regions throughout the brain. Proceeding from V1 to the inferotemporal cortex, neuronal populations display increasingly larger receptive fields and increasingly complex tuning properties, culminating in regions that appear to be category selective (Booth & Rolls, 1998; Desimone, Albright, Gross, & Bruce, 1984; Gross, Rocha-Miranda, & Bender, 1972). fMRI studies have also uncovered object-selective responses in the human ventral temporal cortex (Malach et al., 1995) as well as selectivity toward specific object categories including faces (Ishai, Ungerleider, Martin, Schouten, & Haxby, 1999; Kanwisher, McDermott, & Chun, 1997), places (Epstein, Harris, Stanley, & Kanwisher, 1999; Epstein & Kanwisher, 1998), bodies (Downing, Jiang, Shuman, & Kanwisher, 2001), and words (Cohen & Dehaene, 2004). Neural regions displaying selectivity toward broader, superordinate visual categories, such as animacy (Grill-Spector & Weiner, 2014; Martin, 2007), have also been reported. In addition, many of these areas exhibit invariance to low-level manipulations such as size, viewing perspective, and contour definition (Sawamura, Georgieva, Vogels, Vanduffel, & Orban, 2005; Vuilleumier, Henson, Driver, & Dolan, 2002; Kourtzi & Kanwisher, 2000; Grill-Spector et al., 1998).
Classically, studies of visual object perception have focused on the ventral pathway, a network of neural regions thought to generate invariant representations of the world for the purposes of recognition. However, a growing body of empirical results suggests that the dorsal pathway (thought to be involved in spatial knowledge and visuomotor interaction) may also contribute to this process. This conclusion is corroborated by findings demonstrating that the dorsal pathway can process shape as well as certain object categories (Collins, Freud, Kainerstorfer, Cao, & Behrmann, 2019; Chen, Snow, Culham, & Goodale, 2018; Erlikhman, Gurariy, Mruczek, & Caplovitz, 2016; Zachariou, Klatzky, & Behrmann, 2014; Mruczek, von Loga, & Kastner, 2013; Almeida, Mahon, & Caramazza, 2010; Almeida, Mahon, Nakayama, & Caramazza, 2008; Konen & Kastner, 2008; Valyear, Cavina-Pratesi, Stiglick, & Culham, 2007; Weisberg, Van Turennout, & Martin, 2007; Fang & He, 2005; Chao & Martin, 2000). Graspable objects constitute one category of stimuli that has been shown to drive activation in the dorsal pathway. Tools are a special subset of graspable objects characterized by a specific associated motor plan learned through experience (Frey, 2007) and have been shown to activate regions in the ventral as well as the dorsal pathway (Garcea & Mahon, 2014; Mruczek et al., 2013; Hermsdörfer, Terlinden, Mühlau, Goldenberg, & Wohlschläger, 2007; Chao & Martin, 2000; Chao et al., 1999).
Some have questioned whether the aforementioned findings truly demonstrate neural tuning toward abstract categories, arguing that studies on object perception are often confounded by features that covary within a particular object class. For example, category members often share similarities across a number of dimensions such as real-world size (Konkle & Oliva, 2012), manipulability (Mahon et al., 2007), the potential for self-initiated behavior (Martin & Weisberg, 2003), and numerous low-level image properties (Andrews, Watson, Rice, & Hartley, 2015; Rice, Watson, Hartley, & Andrews, 2014; Watson, Hartley, & Andrews, 2014; Baldassi et al., 2013; O'Toole, Jiang, Abdi, & Haxby, 2005). Multivariate analyses of neural data offer greater sensitivity to subtle differences in patterns of activation (Kriegeskorte, 2011; Kriegeskorte, Goebel, & Bandettini, 2006; Lange et al., 1999) and have further challenged the notion of clearly delineated, object-selective regions. For example, Haxby et al. (2001) showed that category representations may be overlapping and distributed throughout ventral temporal cortex, as opposed to being localized within functionally homogenous neural modules. Similarly, some have questioned studies purporting to show object selectivity in the dorsal pathway, suggesting that shape rather than categorical membership might explain these results. For example, tools are more likely to be elongated along the principal axis relative to members of other nontool categories. Furthermore, many studies that have found dorsal activation to tools have failed to control for elongation (Chen et al., 2018; Sakuraba, Sakai, Yamanaka, Yokosawa, & Hirayama, 2012), leading to the possibility that shape, rather than category membership, best describes tool-evoked activity in the dorsal pathway (Almeida et al., 2014; Sakuraba et al., 2012; Sakata et al., 1998). 
Together, such findings complicate theoretical interpretations of previous research and underscore the importance of adequate controls and consilience across different methodologies.
In this study, we explored the dynamics of object perception using HD-EEG with a particular focus on the neural representation of visual objects in the dorsal and ventral pathways as well as the contribution of shape and category to these representations. We define representations as informational neural states that may subserve numerous processes such as recognition, visuomotor interactions, and so forth. Importantly, we extended our analysis across both space and time by conducting multivariate pattern analysis (MVPA) on source-localized dipoles within broadly defined dorsal and ventral ROIs. In the first experiment, participants viewed images of objects from different categories. Two superordinate categories (animate and inanimate) were further subdivided into four categories (bird, insect, tool, and graspable object). These object categories were chosen based on their relationship to the known computational properties of the dorsal and ventral pathways. Specifically, regions along the ventral pathway have been shown to process object identity and category membership for both animate and inanimate object categories (Macdonald & Culham, 2015; Garcea & Mahon, 2014; Kravitz, Saleem, Baker, Ungerleider, & Mishkin, 2013; Mahon et al., 2007). Meanwhile, regions along the dorsal pathway exhibit sensitivity to particular object categories based on the presence of affordances and/or specific motor plans in the case of tools (Lewis, 2006; Fang & He, 2005; Chao & Martin, 2000). It is important to note that the inanimate stimuli (tools and graspable objects) used in Experiment 1 contained the shape confound (elongation along the principal axis for tools) commonly found in previous studies. Thus, a follow-up experiment (Experiment 2) was conducted in which participants viewed images of tools and graspable objects that were matched for global shape (elongated vs. stubby), allowing shape-related signals to be disambiguated from category-related responses.
Together, these experiments allowed us to explore the extent to which spatio-temporal responses from neural populations in dorsal and ventral cortex vary according to global shape characteristics versus higher-level object category. On the basis of the preponderance of previous research, we expected that activity within the ventral pathway should contain information pertaining to both shape and category membership. With respect to the dorsal pathway, we expected above-chance classification for tools versus graspable objects using stimuli from Experiment 1. In addition, if the computations along the dorsal pathway extract category information in a manner analogous to the canonical properties of the ventral pathway, then we would also expect above-chance classification for birds versus insects. Furthermore, the temporal resolution afforded by EEG allows for additional inferences regarding the nature of the computations along the two pathways. Both the onset and peak of time-resolved signals provide information regarding the nature of neural processing. For example, comparing these metrics across conditions has been used to make inferences regarding the hierarchy of object recognition (i.e., the order of visual processing stages; Contini et al., 2017; Carlson, Tovar, Alink, & Kriegeskorte, 2013) and to test hypotheses regarding feedback and feedforward modulation between brain regions (Martin et al., 2019). In the context of the current study, comparing the temporal dynamics between the dorsal and ventral pathways can yield insights regarding the relationship and computational dependencies between these networks. For example, category information within the dorsal pathway could either be computed independently or arise as a product of feedforward input from the ventral pathway (Takemura et al., 2016; Cloutman, 2013). 
Thus, observing a latency offset whereby significant classification of object categories in the ventral pathway precedes significant classification in the dorsal pathway would be consistent with a model in which dorsal object representations are partially dependent on feedforward ventral projections. Conversely, similar temporal dynamics across dorsal and ventral ROIs would be consistent with a model in which object representations are computed independently across the two pathways (Freud, Plaut, & Behrmann, 2016).
Here, we leverage HD-EEG, source localization, and MVPA to explore the spatio-temporal dynamics of object processing across the dorsal and ventral pathways. Specifically, this study addresses the following questions: (a) when does category-specific information arise within the EEG signal, (b) how are object categories represented across dorsal and ventral neural pathways, and (c) to what extent does category, as opposed to shape, contribute to the above representations?
The purpose of Experiment 1 was to explore the spatio-temporal dynamics of object processing using stimuli belonging to animate (bird and insect) and inanimate (tool and graspable object) categories. Participants viewed images from both superordinate categories while electrophysiological data were recorded from 256 electrodes. MVPA was performed on the EEG time-series data to ascertain when category-specific information emerges in the brain. Given the spatial limitations of EEG, source localization, in conjunction with MVPA, was used to examine the spatio-temporal dynamics of object perception in dorsal and ventral regions of the brain.
Twenty right-handed, neurotypical adults with normal or corrected-to-normal visual acuity participated in the study (12 men, ages 18–38 years). Each participant provided informed written consent. All protocols received approval by the institutional review board at the University of Nevada, Reno. We based our sample size on EEG studies that we have previously carried out successfully (Killebrew, Gurariy, Peacock, Berryhill, & Caplovitz, 2018; Gurariy, Killebrew, Berryhill, & Caplovitz, 2016; Peterson et al., 2014). Furthermore, other studies in the empirical literature that have adopted similar methodological approaches contain sample sizes that are similar or, in some cases, smaller (Grootswagers, Wardle, & Carlson, 2017; Carlson et al., 2016; Cichy, Pantazis, & Oliva, 2014; van de Nieuwenhuijzen et al., 2013). To further improve signal-to-noise ratio (SNR), our study contained large, centrally presented images in addition to a very large number of trials (1680 total; 420 per condition). Even after data cleaning and trial exclusion, the average number of total trials per participant was 1365. Given the precedent set by previously published studies and the measures taken to improve SNR, we believe that the sample size used in this study is sufficient to address the empirical questions posed here.
Stimuli were displayed on a Mitsubishi Diamond Pro270 CRT monitor (20 in., 1024 × 768) with a 120-Hz refresh rate, running via a 2.6-GHz Mac Mini (Apple, Inc.) and presented using the PsychToolbox (Kleiner et al., 2007; Brainard, 1997; Pelli, 1997) for MATLAB (MathWorks Inc., 2007). Viewing distance was 57 cm.
EEG Data Acquisition
The EEG signal was continuously recorded using a 256-channel HydroCel Geodesic Sensor Net via an EGI Net Amps Bio 300 amplifier (Electrical Geodesics Inc.) sampling at 1000 Hz. The digital data were recorded using Netstation 5.0(1) software. Impedance values were kept at or below 100Ω. A photodiode was used to validate frame-accurate timing of stimulus presentation.
The stimuli used in Experiment 1 (Figure 1A) were chosen from two superordinate categories of animate and inanimate objects. Each superordinate category was in turn composed of two basic categories, each consisting of five exemplars (resulting in 20 unique, monochrome images). In the case of animate objects, the two categories were bird and insect, whereas in the case of inanimate objects, the two categories were tools and graspable (nontool) objects. Although both tools and graspable objects evoke inferred action affordances (Gibson, 2014), pictures of tools are associated with a stereotypical motor plan (e.g., the stereotypical twisting motion associated with a screwdriver) more so than are graspable objects (Frey, 2007). All stimuli were processed using the SHINE toolbox (Willenbockel et al., 2010) to control for low-level differences in luminance and spatial frequency. In brief, the Fourier power spectra of the images were matched (without optimization of structural similarity), and the luminance histograms were then equated over the entire image.
For each trial, an image was presented at the center of the screen (15° × 15°) for 300 msec followed by an ISI lasting between 800 and 1200 msec. During each trial, participants were instructed to maintain central fixation upon a black fixation square in the center of the screen. Each of the 20 exemplars that made up the four categories was presented 84 times resulting in 420 trials per condition and 1680 trials in total. The order of presentation was randomized. To compel attentive viewing, participants were instructed to press the space bar if an image appeared at a reduced luminance (50%), which occurred on 5% of the trials. The data from these reduced luminance (i.e., target) trials were removed from further analysis. Although this task was orthogonal to the neural processes explored in our main analysis, object perception (especially in the context of well-known object categories) is thought to be an efficient, rapid, and fairly automatic process (Hung, Kreiman, Poggio, & DiCarlo, 2005; Dell'acqua & Job, 1998; Thorpe, Fize, & Marlot, 1996; Potter & Levy, 1969). In addition, the stimuli were large and visible for 300 msec, leaving ample time for participants to apprehend each image at the level of identity. Although slower presentation times are not required for successful object identification (Thorpe et al., 1996; Potter & Levy, 1969), they are conducive to activation of higher-level visual areas and more abstract levels of visual processing (Grootswagers, Robinson, & Carlson, 2019; Robinson, Grootswagers, & Carlson, 2019).
Multiple published studies examining the neural dynamics of visual object processing have successfully utilized behavioral tasks that were orthogonal to perceptual categorization (Cichy, Pantazis, & Oliva, 2016; Kaneshiro, Perreau Guimaraes, Kim, Norcia, & Suppes, 2015; Cichy et al., 2014; Carlson et al., 2013; van de Nieuwenhuijzen et al., 2013). Furthermore, object-related activity in the dorsal pathway has been observed in the absence of action planning (Kourtzi & Kanwisher, 2000; Faillenot, Decety, & Jeannerod, 1999; Grill-Spector et al., 1999; Sereno & Maunsell, 1998), and evidence exists that the elicitation of affordances and motor plans can occur automatically (Grèzes, Tucker, Armony, Ellis, & Passingham, 2003; Gentilucci, 2002) and in response to 2-D images (Craighero, Bello, Fadiga, & Rizzolatti, 2002; Craighero, Fadiga, Rizzolatti, & Umilta, 1998; Craighero, Fadiga, Umiltà, & Rizzolatti, 1996). Thus, although the behavioral task did not require identification of or attention to image category, our experimental design was sufficient to elicit neural activity corresponding to categorical identity, in addition to low-level features. Furthermore, using a task that involves category judgments would elicit additional cognitive operations that could introduce confounds into the results of our classification analysis.
Analysis of EEG data was carried out using the FieldTrip Toolbox (Oostenveld, Fries, Maris, & Schoffelen, 2011) along with custom scripts written in MATLAB. A bandpass filter (0.5–40 Hz) was applied to remove slow drift and electrical noise. Next, the data were rereferenced from Cz to an average reference. The filtered time series was then segmented into 550-msec epochs (a 50-msec baseline followed by 500 msec of electrophysiological data after stimulus onset). Segmentation was performed using trigger markers that were sent to the acquisition computer at the onset of each trial. The temporal offset that existed between the physical presentation of the stimulus and the registration of the stimulus marker in the acquisition computer was measured using a photodiode and corrected for during trial segmentation. Next, ocular artifacts (blinks and saccades) were identified using FieldTrip functions (ft_artifact_eog and ft_artifact_zvalue). In short, these functions detect ocular artifacts by thresholding the z-transformed value of the preprocessed raw data. Any trial contaminated by the presence of either artifact was removed from further analysis. Trials in which participants performed the contrast discrimination task were also discarded. After data cleaning and trial removal, all participants had a minimum of 1224 trials, whereas the average number of trials across all participants was 1365. Using a combination of custom MATLAB scripts and visual inspection, bad EEG channels were identified for each trial. Data from these faulty channels were replaced with a weighted average of all neighboring electrodes using spherical spline interpolation (Perrin, Pernier, Bertrand, & Echallier, 1989). For each participant, the EEG epochs described above were grouped into four conditions: bird, insect, tool, and graspable object. However, trial rejection resulted in an unequal number of trials across these conditions. 
To remedy this issue, the condition with the fewest usable trials was identified for each participant, and a random subset of trials equaling this number was selected from each of the remaining three conditions. This resulted in each condition having an equal number of trials for each participant, with no condition having fewer than 295 trials.
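The equalization step can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions rather than the MATLAB code used in the study: the function name equalize_trials, the fixed random seed, and the per-condition trial counts are all hypothetical.

```python
import random

def equalize_trials(trials_by_condition, seed=0):
    """Randomly subsample every condition down to the size of the smallest one."""
    rng = random.Random(seed)
    n_min = min(len(trials) for trials in trials_by_condition.values())
    return {
        condition: rng.sample(trials, n_min)  # sampling without replacement
        for condition, trials in trials_by_condition.items()
    }

# Hypothetical numbers of usable trials left after artifact rejection.
counts = {"bird": 340, "insect": 351, "tool": 295, "graspable": 322}
trials = {condition: list(range(n)) for condition, n in counts.items()}

balanced = equalize_trials(trials)
print({condition: len(kept) for condition, kept in balanced.items()})
```

Because the subset is drawn without replacement, no trial is duplicated, and every condition contributes the same number of samples to the classifier.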
To determine the degree of object information present in the electrophysiological data, we employed MVPA with the aid of the CoSMoMVPA MATLAB toolbox (Oosterhof, Connolly, & Haxby, 2016). Given the relative abundance of trials for each condition, we were able to perform subaverages to increase the SNR of the data (Grootswagers et al., 2017). To perform these averages, trials within each condition were grouped by exemplars, split into subsets of five trials (all of the same exemplar), and averaged together. For example, if a condition initially contained 420 trials (84 trials of each exemplar), then performing the subaverage reduced the total number of trials within that condition to 84 (each subaveraged trial being an average of five trials from that condition). Next, MVPA classification was conducted independently for each participant on the subaveraged trials discussed above. A naive Bayes classifier was trained to discriminate patterns of neural activity evoked by trials across the conditions of interest. Classifier performance was evaluated using the leave-one-out cross-validation method. For each millisecond, the EEG data were organized into a matrix consisting of samples by features, where samples refer to amplitude values for each subaveraged trial and features refer to EEG electrodes (256 in total). Next, the data were split into 10 chunks and organized into test and training sets such that nine of the chunks were randomly placed into the training set and one chunk was held out as the test set. This was done 10 times so that, on each iteration, a different chunk was held out as the test data. Classifier performance was averaged across the 10 folds at each time point, followed by a grand average across all participants. The contrasts of interest on which MVPA was performed were as follows: animate (bird vs. insect), inanimate (tool vs. graspable object), animacy (animate vs. inanimate), and all categories (bird vs. insect vs. tool vs. graspable objects).
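The logic of subaveraging followed by time-resolved classification can be sketched as follows. This is an illustrative Python/NumPy sketch, not the CoSMoMVPA pipeline used in the study: the Gaussian naive Bayes implementation, the function names, the random chunk assignment, and the synthetic data (8 features and 6 time points instead of 256 electrodes and 550 samples) are all assumptions made for the example. In the actual analysis, subaveraging was applied separately to the trials of each exemplar.

```python
import numpy as np

def subaverage(trials, group_size=5):
    """Average consecutive groups of trials; trials is (n_trials, n_features, n_times)."""
    n_groups = trials.shape[0] // group_size
    trimmed = trials[: n_groups * group_size]  # drop any leftover trials
    return trimmed.reshape(n_groups, group_size, *trials.shape[1:]).mean(axis=1)

def gnb_predict(X_train, y_train, X_test):
    """Gaussian naive Bayes: fit per-class feature means/variances, return predicted labels."""
    classes = np.unique(y_train)
    scores = []
    for c in classes:
        Xc = X_train[y_train == c]
        mu, var = Xc.mean(axis=0), Xc.var(axis=0) + 1e-9
        log_lik = -0.5 * (np.log(2 * np.pi * var) + (X_test - mu) ** 2 / var).sum(axis=1)
        scores.append(log_lik + np.log(len(Xc) / len(X_train)))
    return classes[np.argmax(np.stack(scores, axis=1), axis=1)]

def time_resolved_decoding(X, y, n_folds=10, seed=0):
    """X is (n_samples, n_features, n_times); returns mean fold accuracy per time point."""
    rng = np.random.default_rng(seed)
    chunks = rng.permutation(len(y)) % n_folds  # random, near-equal chunk assignment
    accuracy = np.zeros(X.shape[2])
    for t in range(X.shape[2]):
        Xt = X[:, :, t]  # samples x features at this time point
        fold_acc = []
        for k in range(n_folds):
            test = chunks == k  # hold out one chunk, train on the other nine
            pred = gnb_predict(Xt[~test], y[~test], Xt[test])
            fold_acc.append(np.mean(pred == y[test]))
        accuracy[t] = np.mean(fold_acc)
    return accuracy

# Synthetic two-class data: the classes differ only at the last three time points.
rng = np.random.default_rng(1)
raw_a = rng.normal(size=(200, 8, 6))
raw_b = rng.normal(size=(200, 8, 6))
raw_b[:, :, 3:] += 1.0

sub_a, sub_b = subaverage(raw_a), subaverage(raw_b)
X = np.concatenate([sub_a, sub_b])
y = np.array([0] * len(sub_a) + [1] * len(sub_b))
accuracy = time_resolved_decoding(X, y)
print(np.round(accuracy, 2))
```

In this toy example, accuracy should hover around chance (0.5) at time points without a class difference and rise well above chance where the difference was injected, which is the pattern a time-resolved decoding curve is meant to reveal.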
To identify statistically significant time points and deal with the problem of multiple comparisons, we employed a Monte Carlo simulation technique modeled on the methods of Bae and Luck (2018). First, for each time point, the classifier performance was submitted to a one-tailed, one-sample t test against chance performance (1/number of conditions). Next, we located “temporal clusters” (temporally contiguous time points that were statistically significant at an alpha level of .05) and summed the t scores across these data points to generate a single t score per cluster. Next, we reran the entire time-resolved multivariate analysis 1000 times using randomly reshuffled labels for the training data. The summed t score corresponding to the largest temporal cluster was stored on each iteration. This procedure allowed us to generate a Monte Carlo null distribution of summed t scores and derive a threshold corresponding to the 95th percentile (p = .05). Finally, only clusters from the main analysis whose summed t score exceeded this critical value were treated as statistically significant. Statistical significance for comparisons between conditions was also evaluated using the same Monte Carlo simulation method described above.
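A simplified sketch of the cluster-mass procedure is given below in Python/NumPy. It is not the code used in the study. In particular, the null accuracies are drawn here from synthetic noise, whereas in the analysis described above they were obtained by rerunning the full decoding pipeline with reshuffled training labels. The function names, the number of permutations (200 rather than 1000), and the simulated accuracies are hypothetical. The critical value 1.729 is the one-tailed t threshold at alpha = .05 for 19 degrees of freedom (20 participants).

```python
import numpy as np

def t_against_chance(acc, chance):
    """One-sample t statistic per time point; acc is (n_participants, n_times)."""
    n = acc.shape[0]
    return (acc.mean(axis=0) - chance) / (acc.std(axis=0, ddof=1) / np.sqrt(n))

def cluster_masses(t_vals, t_crit):
    """Return (start, stop, summed_t) for every run of contiguous time points with t > t_crit."""
    clusters, start = [], None
    for i, above in enumerate(t_vals > t_crit):
        if above and start is None:
            start = i
        elif not above and start is not None:
            clusters.append((start, i, float(t_vals[start:i].sum())))
            start = None
    if start is not None:
        clusters.append((start, len(t_vals), float(t_vals[start:].sum())))
    return clusters

def significant_clusters(acc, null_acc, chance, t_crit=1.729, alpha=0.05):
    """Keep observed clusters whose summed t exceeds the (1 - alpha) percentile of the null."""
    observed = cluster_masses(t_against_chance(acc, chance), t_crit)
    null_max = [
        max((m for _, _, m in cluster_masses(t_against_chance(a, chance), t_crit)), default=0.0)
        for a in null_acc  # one maximum cluster mass per permutation
    ]
    threshold = float(np.percentile(null_max, 100 * (1 - alpha)))
    return [c for c in observed if c[2] > threshold], threshold

# Synthetic example: 20 participants, 50 time points, effect injected at samples 20-34.
rng = np.random.default_rng(0)
acc = 0.5 + rng.normal(scale=0.05, size=(20, 50))
acc[:, 20:35] += 0.15
null_acc = 0.5 + rng.normal(scale=0.05, size=(200, 20, 50))

sig, threshold = significant_clusters(acc, null_acc, chance=0.5)
print(sig, threshold)
```

Summing t scores within a cluster and comparing against the distribution of maximum null cluster masses controls the family-wise error rate across time points without requiring a separate correction at each millisecond.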
To explore the nature of the underlying neural representation, the time series EEG data were subjected to a multivariate time generalization analysis (King & Dehaene, 2014). Unlike standard time-resolved MVPA, in a time generalization analysis, a classifier trained at a given time point is tested on all other time points of the experimental epoch. The results can be represented as a temporal cross-decoding matrix with each cell depicting classifier performance for a particular combination of training and testing times. Standard decoding of time-resolved M/EEG data allows for limited inferences regarding the nature of mental representations being decoded. However, the time generalization method offers further insight into these representations in the context of spatio-temporal dynamics. Specifically, examination of the temporal cross-decoding matrix can reveal information regarding the structure of underlying representations, when those representations are activated, and how they change over time. We performed this analysis using the same methodological approach and parameters that were used for regular time-resolved MVPA, but the classifier trained on each time point was subsequently tested on every other time point. These results were evaluated via comparison to chance using a one-tailed, one-sample t test and an adjusted p value corresponding to a false discovery rate corrected threshold of q = .05 (Benjamini & Hochberg, 1995).
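The construction of the temporal cross-decoding matrix can be sketched as follows. This Python/NumPy example is a minimal illustration rather than the analysis code used in the study: the Gaussian naive Bayes helper functions, the chunk assignment, and the synthetic data are assumptions. The data are built so that the discriminative pattern occupies one set of features early in the epoch and a different set later, mimicking a representation that changes over time.

```python
import numpy as np

def gnb_fit(X, y):
    """Fit Gaussian naive Bayes: per-class feature means, variances, and log priors."""
    classes = np.unique(y)
    mu = np.array([X[y == c].mean(axis=0) for c in classes])
    var = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    log_prior = np.log(np.array([np.mean(y == c) for c in classes]))
    return classes, mu, var, log_prior

def gnb_predict(model, X):
    """Predict labels from the class log posteriors, computed as (n_samples, n_classes)."""
    classes, mu, var, log_prior = model
    diff = X[:, None, :] - mu[None, :, :]
    log_lik = -0.5 * (np.log(2 * np.pi * var)[None] + diff ** 2 / var[None]).sum(axis=2)
    return classes[np.argmax(log_lik + log_prior, axis=1)]

def temporal_generalization(X, y, n_folds=10, seed=0):
    """Rows index the training time, columns the testing time; X is (samples, features, times)."""
    rng = np.random.default_rng(seed)
    chunks = rng.permutation(len(y)) % n_folds
    n_times = X.shape[2]
    gen = np.zeros((n_times, n_times))
    for k in range(n_folds):
        test = chunks == k
        for t_train in range(n_times):
            model = gnb_fit(X[:, :, t_train][~test], y[~test])  # train once per time point
            for t_test in range(n_times):
                pred = gnb_predict(model, X[:, :, t_test][test])  # test at every time point
                gen[t_train, t_test] += np.mean(pred == y[test])
    return gen / n_folds

# Synthetic data: the class pattern lives on features 0-3 at times 2-3
# and on features 4-7 at times 4-5.
rng = np.random.default_rng(2)
X = rng.normal(size=(80, 8, 6))
y = np.repeat([0, 1], 40)
X[y == 1, :4, 2:4] += 3.0
X[y == 1, 4:, 4:6] += 3.0

gen = temporal_generalization(X, y)
print(np.round(gen, 2))
```

In this toy case, a classifier trained at time 2 generalizes to time 3 but not to time 4, so above-chance cells stay close to the diagonal. A matrix of this kind is what one would expect from a rapidly evolving representation, whereas a sustained representation would produce a broad square of above-chance cells.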
Source localization was performed using Brainstorm software (Tadel, Baillet, Mosher, Pantazis, & Leahy, 2011), which is documented and freely available for download online under the GNU general public license (neuroimage.usc.edu/brainstorm). This was done for all 20 participants. However, in the case of 10 participants, the source localization was further constrained by anatomical T2 MRI scans (3-T Skyra MRI, 64-channel phased-array head coil; repetition time = 3 sec, echo time = 304 msec, flip angle = 7°, 640 × 640 matrix, resolution = 0.375 × 0.375 × 0.8 mm) collected from these individuals on a previous occasion. For the remaining participants, the default Colin27 MNI brain (Holmes et al., 1998) was used to constrain the results of the source analysis. These anatomical images were transformed into a unique cortical space for each participant consisting of 15,000 hypothetical sources (7500 per hemisphere) oriented orthogonal to the cortical sheet. Next, the boundary element method (Gramfort, Papadopoulo, Olivi, & Clerc, 2010) was used to model each participant's inner skull, outer skull, and head surface. Using the EGI GPS Solver software, physical locations of all electrodes during the experiment were triangulated, imported into Brainstorm, and coregistered with existing head surfaces. In the case of participants for whom only the default MRI anatomy was available, their surfaces (source space, inner skull, outer skull, and head surface) were warped to match the head shape and size generated by the electrode location data (Leahy, Mosher, Spencer, Huang, & Lewine, 1998). Next, a forward model was computed using the OpenMEEG software (Kybic et al., 2005) followed by a noise covariance matrix from all individual trials for a single participant. Finally, the inverse model was calculated using minimum norm estimation, with a current density map as the measure and constrained dipole orientation (normal to cortex).
To avoid some of the potential limitations and imprecisions associated with EEG source localization, MVPA was performed within two broadly defined ROIs, each covering a substantial swathe of the cortex. This rather conservative approach allowed us to dissociate computations between the dorsal and ventral pathways without making overly precise claims regarding the neural loci of this activity. The two ROIs included a ventral and dorsal region (Figure 2). Each ROI was collated using multiple regions extracted from the Desikan–Killiany Atlas (Desikan et al., 2006). The bilateral dorsal ROIs were composed of the superior parietal, inferior parietal, and supramarginal regions, whereas the bilateral ventral ROIs consisted of the fusiform, inferior temporal, parahippocampal, and entorhinal regions.
Accurate source localization of EEG data depends on several factors, including realistic modeling of the cortex, inner skull, outer skull, and head surface as well as accurate coregistration of electrode locations. Given the numerous challenges involved in source localization, dorsal and ventral ROIs were defined very broadly. However, further measures were taken to ensure that the activity extracted from the two ROIs can be reasonably expected to represent actual signal from those regions. Simulated EEG waveforms were generated by projecting activity from an ROI to the surface electrodes by multiplying the source time series with the forward model. This was done separately in dorsal and ventral ROIs while activity in all other regions was set to zero. Next, these simulated waveforms were localized back into dipole space using the source localization process described above. The localization results for these known sources were then inspected to evaluate the accuracy of the forward and inverse models.
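The logic of this simulation check can be illustrated with a toy example in Python/NumPy. The sketch below is not the Brainstorm/OpenMEEG pipeline used in the study. It assumes a hypothetical one-dimensional geometry, a made-up forward model in which sensor gain falls off smoothly with distance, and a plain Tikhonov-regularized minimum norm inverse without noise covariance or depth weighting. Its only purpose is to show the round trip from a known source, through the forward model, and back through the inverse model.

```python
import numpy as np

# Hypothetical 1-D geometry: 100 sources and 32 sensors placed along a unit line.
src_pos = np.linspace(0.0, 1.0, 100)
sen_pos = np.linspace(0.0, 1.0, 32)

# Toy forward model (assumption): gain decays as a Gaussian of source-sensor distance.
G = np.exp(-((sen_pos[:, None] - src_pos[None, :]) ** 2) / (2 * 0.05 ** 2))

roi_a = src_pos < 0.3  # stand-in for one ROI
roi_b = src_pos > 0.7  # stand-in for the other ROI

def simulate_and_localize(G, active, lam_ratio=0.1):
    """Activate one ROI, project to the sensors, and map back with a minimum norm inverse."""
    t = np.linspace(0.0, 0.5, 250)
    sources = np.zeros((G.shape[1], t.size))
    sources[active] = np.sin(2 * np.pi * 10 * t)  # known source; all other sources are zero
    sensors = G @ sources                          # forward projection to the electrodes
    gram = G @ G.T
    lam = lam_ratio * np.trace(gram) / gram.shape[0]  # regularization strength
    estimate = G.T @ np.linalg.solve(gram + lam * np.eye(gram.shape[0]), sensors)
    return np.abs(estimate).mean(axis=1)           # mean absolute amplitude per source

amp_a = simulate_and_localize(G, roi_a)
amp_b = simulate_and_localize(G, roi_b)
print("ROI A active: amplitude in A vs. B:", amp_a[roi_a].mean(), amp_a[roi_b].mean())
print("ROI B active: amplitude in B vs. A:", amp_b[roi_b].mean(), amp_b[roi_a].mean())
```

If the forward and inverse models are adequate, the recovered amplitude should be concentrated in the ROI that was activated, with little leakage into the other ROI. This is the same qualitative criterion applied to the simulated dorsal and ventral sources described above.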
To perform MVPA on source-localized data, individual preprocessed trials from each participant (described in the EEG Data Preprocessing section) were averaged together to create subaverages, each composed of five trials. The trials that composed each subaverage were of the same exemplar image, and the total number of newly created subaveraged trials was held constant across all conditions. Next, source localization (see previous section for details) was performed on each subaveraged trial, and individual sources from each ROI were extracted. This resulted in a distribution of cortical sources for each millisecond of the experimental epoch for each ROI. Given the substantial number of individual sources within each ROI and the fact that this number differed between the ROIs, principal component analysis was performed on the data. The first 40 components were retained to be used as feature inputs for the MVPA. The classification procedure used for these data recapitulated that used in the EEG analysis at the sensor level. At each millisecond, the data were divided into 10 chunks, nine of which would serve as the training set and one as the test set. Next, a leave-one-out method was used where, on each fold, a different chunk was held out as the test set. The performance of a naive Bayes classifier was averaged across the 10 folds for each time point, averaged across all participants, and statistically evaluated using the Monte Carlo simulation described in the Multivariate Analysis section.
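The dimensionality reduction step can be sketched in Python/NumPy as follows. This is a minimal illustration of principal component analysis via the singular value decomposition, not the implementation used in the study. The function name, the matrix sizes, and the synthetic low-rank data are hypothetical, and the sketch simply fits the components on whatever sample-by-source matrix it is given.

```python
import numpy as np

def pca_reduce(X, n_components=40):
    """Project X (n_samples, n_sources) onto its first n_components principal components."""
    Xc = X - X.mean(axis=0)                       # center each source
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                # orthonormal directions in source space
    scores = Xc @ components.T                    # reduced features for the classifier
    explained = S[:n_components] ** 2 / np.sum(S ** 2)
    return scores, components, explained

# Synthetic stand-in for one ROI at one time point: 168 samples x 1200 sources,
# generated from 5 latent signals plus noise.
rng = np.random.default_rng(3)
X = rng.normal(size=(168, 5)) @ rng.normal(size=(5, 1200)) + 0.1 * rng.normal(size=(168, 1200))

scores, components, explained = pca_reduce(X)
print(scores.shape, round(float(explained.sum()), 3))
```

Reducing each ROI to the same number of components gives the classifier an equal number of features for the dorsal and ventral ROIs, so that differences in classification performance are less likely to reflect differences in the number of sources.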
Electrode Time Course MVPA
Figure 3 depicts the results of the MVPA performed on the time course EEG data. Above-chance classification was observed for all examined comparisons (exemplars; all categories; animacy; bird vs. insect; tool vs. graspable object). The onset for significant classification occurred at around 60 msec after stimulus onset in all comparisons and slightly earlier (around 40 msec) for the individual exemplar comparison. In comparing the classification performance between the animate and inanimate conditions, the above-chance classification onset occurred at the same latency (∼70 msec). However, in the tool versus graspable object comparison, we observed a pronounced peak from approximately 180 to 230 msec that was not evident for the animate conditions. Peak latencies for the exemplar, basic, and superordinate contrasts (Figure 3A–C) occurred at 114, 207, and 249 msec, respectively.
Although classification accuracy suggests the presence of category-specific information in the neural data, this measure says relatively little regarding the nature of the underlying representation. This question was explored using the time generalization method in which a classifier was trained at each time point and subsequently tested on all other time points. The resulting temporal cross-decoding matrix illustrates whether the neural representation at any single time point generalizes to other latencies, hence providing a window into the neural dynamics of the underlying representation. The results (Figure 4) depict a similar pattern for the four MVPA conditions explored in this analysis (all categories; bird vs. insect; animate vs. inanimate; tool vs. graspable object). Specifically, cells with above-chance classification are confined within a relatively narrow region of the matrix. The temporal window within which the classifier could generalize was approximately 30–40 msec, suggesting a highly dynamic and evolving neural representation.
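The logic of the time generalization method can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: a nearest-centroid classifier stands in for the naive Bayes classifier, the data are synthetic, and the names `fit`, `predict`, `temporal_generalization`, and `simulate` are hypothetical. The simulated class signal moves to a different feature at every time point, so the resulting matrix shows the diagonal-only pattern that characterizes a dynamic, evolving representation.

```python
import random

def centroid(rows):
    """Feature-wise mean of a list of samples."""
    return [sum(col) / len(col) for col in zip(*rows)]

def fit(X, y):
    """Nearest-centroid model: one mean pattern per class."""
    return {label: centroid([x for x, lab in zip(X, y) if lab == label])
            for label in set(y)}

def predict(model, x):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: sqdist(model[label]))

def temporal_generalization(train, y_train, test, y_test):
    """Inputs are trials x time x features; returns accuracy[t_train][t_test]."""
    n_times = len(train[0])
    matrix = []
    for t_train in range(n_times):
        model = fit([trial[t_train] for trial in train], y_train)
        row = []
        for t_test in range(n_times):
            hits = sum(predict(model, trial[t_test]) == lab
                       for trial, lab in zip(test, y_test))
            row.append(hits / len(test))
        matrix.append(row)
    return matrix

def simulate(n_per_class, n_times=4, noise=0.5, seed=0):
    """Synthetic trials in which the class signal occupies feature t at time t."""
    rng = random.Random(seed)
    trials, labels = [], []
    for label in ("a", "b"):
        for _ in range(n_per_class):
            trial = []
            for t in range(n_times):
                features = [rng.gauss(0.0, noise) for _ in range(n_times)]
                if label == "b":
                    features[t] += 3.0  # informative feature changes over time
                trial.append(features)
            trials.append(trial)
            labels.append(label)
    return trials, labels

train, y_train = simulate(40, seed=1)
test, y_test = simulate(40, seed=2)
matrix = temporal_generalization(train, y_train, test, y_test)
for row in matrix:
    print(" ".join(f"{acc:.2f}" for acc in row))
```

High accuracy confined to the diagonal means a classifier trained at one latency fails at other latencies; a stable representation would instead produce a broad block of above-chance cells.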
The presence of a known source allows for the evaluation of the forward and inverse model in terms of spatial accuracy. We therefore conducted an analysis in which simulated EEG waveforms were generated by multiplying the source time series by the forward model and then localizing this activity back into dipole space. The results of the model evaluation are shown in Figure 5 for a representative participant. A qualitative inspection of the data suggests that activity originating from the dorsal or ventral pathway can be correctly localized. Importantly, activity localized to dorsal ROIs did not appear to contaminate activity in ventral ROIs, and vice versa.
Given previously reported differences in object selectivity in the dorsal and ventral pathways, we analyzed the time course classification performance based on source-localized data extracted from dorsal and ventral ROIs. For both the animate (bird vs. insect) and inanimate (tool vs. graspable object) contrasts, above-chance performance was observed in both ventral and dorsal ROIs (Figure 6). Onset latencies for above-chance classification were similar across both ROIs, starting at approximately 75 msec after stimulus onset. Regions shaded in red represent time windows during which classification differed significantly between the two pathways. Of note, significantly higher performance was observed in the dorsal ROI from 100 to 130 msec for the comparison of tools versus graspable objects. For the decoding of the animate object categories (birds vs. insects), the classification time courses were similar across both ROIs, with no time points at which statistically significant differences were observed.
Above-chance classification of object categories within the ventral pathway has been observed before and is consistent with the known properties of ventral neural populations. However, above-chance classification of birds versus insects in the dorsal pathway was an unexpected finding—one that may call for a reevaluation of dorsal pathway computations and its role in visual perception. Successful decoding was observed although the stimuli used in this experiment were processed to control for low-level differences (luminance and spatial frequency). However, global shape differences between birds and insects (although not as pronounced as those between tools and graspable objects) may still exist and account for some of the results reported in this section. Interestingly, the onset of above-chance classification for the birds–insects contrast did not differ across the two pathways. This suggests that dorsal object representations reported here are not likely to be a consequence of feedforward signals from ventral regions. Rather, these results suggest that the dorsal pathway is capable of forming independent object representations in the absence of visuomotor behaviors for animate categories lacking motor plans.
The stimuli used in Experiment 1 contain a potential confound that is especially evident for the inanimate conditions. Specifically, there are systematic differences in shape that can be seen between exemplars that make up the tools and graspable object categories. Tools tend to be elongated along the principal axis, whereas graspable objects are typically foreshortened, or “stubby,” along the central axis. Previous research suggests that the selectivity toward tools observed within the dorsal pathway may be, at least in part, explained by this systematic elongation that tends to co-occur in objects that comprise most tool stimulus sets (Chen et al., 2018; Sakuraba et al., 2012). Thus, it is not clear whether the classification performance observed for tools versus graspable objects in Experiment 1 was driven by shape rather than category membership. To address this issue, Experiment 2 replicates the analyses from Experiment 1 on tool and graspable object images chosen from a new stimulus set—one designed to control for shape differences. This new stimulus set contains novel exemplars of tools and graspable objects, with each category including both elongated and stubby exemplars. Furthermore, it allows for the disambiguation of shape from toolness and an examination of how these parameters contribute to the neural representation of objects.
Twenty right-handed, neurotypical adults with normal or corrected-to-normal visual acuity participated (11 men, ages 18–43 years) and provided informed written consent. Of these 20 individuals, 10 also participated in Experiment 1. Some participants were run in both studies because anatomical MRI scans (which help improve source localization accuracy) were available for them. The institutional review board at the University of Nevada, Reno approved all protocols. The corresponding section of Experiment 1 provides a discussion and justification of the sample size used in the current study.
Apparatus and EEG Data Acquisition
The stimulus display and EEG recording methods were identical to those used for Experiment 1.
The stimuli used in Experiment 2 (Figure 1B) consisted of 20 unique, monochrome images that varied along two dimensions: shape and toolness. Along the toolness dimension, each image could be classified as either a tool or a graspable object (see the Stimuli subsection of Experiment 1 Methods for the specific distinction between these categories). Along the shape dimension, the profile of each image was either stubby or elongated. Thus, the stimuli could be organized into the following nonoverlapping categories: stubby tool, stubby graspable object, elongated tool, and elongated graspable object. Unlike the stimulus set used for Experiment 1, these stimuli were not processed with the SHINE toolbox given the importance of texture and shading cues for the dorsal pathway (Freud et al., 2016).
The experimental design used in Experiment 2 mimicked that of Experiment 1. Please see the Experimental Procedure section of Experiment 1 for details.
The preprocessing pipeline was identical to the one used for Experiment 1. See the EEG Preprocessing section in Experiment 1 for specific details. In brief, data were filtered, referenced, and segmented into individual trials. Data from defective EEG channels were interpolated via averaging, and trials that were contaminated by the presence of ocular artifacts were removed from further analysis. After data cleaning and trial removal, all participants had a minimum of 1048 trials, whereas the average number of trials across all participants was 1294. Additional trials were randomly removed to ensure that all conditions within a participant had an equal number of trials. All participants had a minimum of 257 trials per condition.
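The trial-equalization step can be sketched as below: trials are randomly discarded from the larger conditions until every condition matches the smallest one. The function name `equalize_trials` and the per-condition counts in the example are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict

def equalize_trials(labels, seed=0):
    """Return sorted trial indices such that every condition keeps the same count."""
    by_condition = defaultdict(list)
    for index, label in enumerate(labels):
        by_condition[label].append(index)
    # Every condition is cut down to the size of the smallest one.
    n_keep = min(len(indices) for indices in by_condition.values())
    rng = random.Random(seed)
    kept = []
    for indices in by_condition.values():
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)

# Hypothetical trial counts per condition after artifact rejection.
counts = {"elongated tool": 300, "stubby tool": 280,
          "elongated graspable": 257, "stubby graspable": 290}
labels = [name for name, n in counts.items() for _ in range(n)]
random.Random(42).shuffle(labels)

kept = equalize_trials(labels)
balanced = Counter(labels[i] for i in kept)
print(balanced)
```

Fixing the seed makes the random removal reproducible, which is useful when the same balanced subset must be reused across analyses.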
The steps involved in the multivariate analysis were identical to those used in Experiment 1. See the Multivariate Analysis section of Experiment 1 for specific details. The contrasts of interest on which MVPA was performed were as follows: shape (all elongated vs. all stubby objects), toolness (all tools vs. all graspable objects), elongated tools versus elongated graspable objects, and stubby tools versus stubby graspable objects. A time generalization analysis using the methods described in Experiment 1 was also performed for the toolness and shape contrasts.
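The four contrasts can be expressed as labeling rules over the 2 × 2 (shape × toolness) design, as in the sketch below. The names `CONTRASTS` and `select` are hypothetical, and the example assumes five images per cell, which is an assumption consistent with, but not stated by, the 20-image stimulus set.

```python
# Each rule maps a trial's (shape, kind) to a class label, or None to exclude it.
CONTRASTS = {
    "shape": lambda shape, kind: shape,
    "toolness": lambda shape, kind: kind,
    "elongated_only": lambda shape, kind: kind if shape == "elongated" else None,
    "stubby_only": lambda shape, kind: kind if shape == "stubby" else None,
}

def select(trials, contrast):
    """Return (trial_index, class_label) pairs for trials included in a contrast."""
    rule = CONTRASTS[contrast]
    pairs = []
    for index, (shape, kind) in enumerate(trials):
        label = rule(shape, kind)
        if label is not None:
            pairs.append((index, label))
    return pairs

# One entry per image; five images per cell is an assumption for illustration.
trials = [(shape, kind)
          for shape in ("elongated", "stubby")
          for kind in ("tool", "graspable")
          for _ in range(5)]

for name in CONTRASTS:
    pairs = select(trials, name)
    classes = sorted({label for _, label in pairs})
    print(f"{name}: {len(pairs)} trials, classes = {classes}")
```

The shape and toolness contrasts use every trial but group them along different dimensions, whereas the two within-shape contrasts use half of the trials and hold shape constant.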
Source Localization and Source-Localized MVPA
The steps involved in source localization and the corresponding MVPA were identical to those used in Experiment 1. See the corresponding sections of Experiment 1 for details. At the completion of this process, source-localized activity was extracted from dorsal and ventral ROIs and used as input into the MVPA classifier. The classifier was trained to discriminate between the following conditions in both dorsal and ventral ROIs: shape (all elongated vs. all stubby objects), toolness (all tools vs. all graspable objects), elongated tools versus elongated graspable objects, and stubby tools versus stubby graspable objects.
Time Course MVPA
Figure 7 depicts the results of the MVPA performed on the time course EEG data. Above-chance classification performance was observed for all of the examined contrasts. Figure 7A compares classification of shape (all long vs. all stubby) and toolness (all tools vs. all graspable objects). Classification of object shape reached significance at ∼60 msec, whereas classification of toolness reached significance approximately 35 msec later (∼95 msec). At multiple time windows, performance was statistically higher for shape as compared to the toolness condition. Specifically, this occurred at 80–120, 158–180, 240–297, 320–400, and 470–550 msec.
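Onsets and time windows such as those reported above are typically read off a per-millisecond significance mask. The sketch below shows one way to extract contiguous windows from such a mask; the function name `significant_windows` is hypothetical, and the mask is constructed from the windows reported in the text purely to exercise the function, not derived from data.

```python
def significant_windows(mask, times):
    """Return (start, end) times for each contiguous run of True values in mask."""
    windows = []
    start = None
    previous = None
    for t, is_significant in zip(times, mask):
        if is_significant and start is None:
            start = t                          # a new window opens here
        elif not is_significant and start is not None:
            windows.append((start, previous))  # window closed at the previous sample
            start = None
        previous = t
    if start is not None:                      # window still open at the end of the epoch
        windows.append((start, previous))
    return windows

# Illustrative mask sampled at 1-msec resolution.
reported = [(80, 120), (158, 180), (240, 297), (320, 400), (470, 550)]
times = list(range(600))
mask = [any(lo <= t <= hi for lo, hi in reported) for t in times]

windows = significant_windows(mask, times)
onset = windows[0][0]
print(windows)
print("first significant time point:", onset)
```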
The stimuli used in classifying all tools versus all graspable objects included both stubby and elongated exemplars within each category. Figure 7B displays the classification between tools and graspable objects when shape was held constant. The results suggest that, even when shape is controlled for, the EEG signal contains information distinguishing tools from graspable objects. Overall, performance was better when the classifier was trained on elongated exemplars of tools and graspable objects as opposed to stubby exemplars from these categories. The onset of significant classification for elongated objects began at approximately 70 msec. Classification of stubby objects was weaker and less consistent, reaching significance starting at approximately 184 msec. Statistically significant differences between the two conditions were observed within the time window of 150–182 msec, with higher performance for the elongated condition. As in Experiment 1, the time generalization analysis shows that above-chance classification was constrained to a relatively narrow time window of approximately 30–40 msec, suggesting a dynamic and evolving neural representation (Figure 8).
Source-Localized Time Course MVPA
As in Experiment 1, we analyzed the time course classification performance based on source-localized data extracted from dorsal and ventral ROIs (Figure 9). As demonstrated in Figure 9A, shape could be successfully decoded in both ROIs beginning at 55 msec in the ventral pathway and 66 msec in the dorsal pathway. Classification accuracy for both ROIs remained fairly similar until approximately 337 msec, after which significantly better performance was observed in the dorsal ROI. Figure 9B displays classification performance between elongated tools and stubby graspable objects. This contrast was chosen as it most closely approximates the stimulus characteristics used in Experiment 1. As is evidenced by the data, failure to control for shape between object categories resulted in higher accuracy (relative to Figure 9C and D, in which shape was controlled for). Above-chance classification of long tools and stubby graspable objects occurred in the ventral pathway at 43 msec after stimulus onset, whereas dorsal classification reached statistical significance at 63 msec. Finally, Figure 9C and D suggests that category information regarding toolness was present in the data even when the shape confound between tools and graspable objects was controlled for. Specifically, for both dorsal and ventral ROIs, significant decoding of tool versus graspable objects was observed when limiting the comparison to only long or only stubby objects. Together, these results suggest that both shape and object identity are represented within neural activity across both the dorsal and ventral visual pathways. Furthermore, given similar onset times for successful decoding, these data are consistent with a model in which neural representations of objects are computed independently within the two pathways.
The purpose of this study was to explore the spatio-temporal dynamics of visual object processing in the human cortex. To achieve this goal, we collected evoked responses to different object categories using HD-EEG. MVPA was used to detect the presence of category-specific information within the EEG signal and to explore the temporal dynamics of category classification. Source localization combined with MVPA was employed to explore the neural origins of the EEG signal, focusing on the temporal dynamics within ventral and dorsal neural pathways. We report successful classification of object categories at both the electrode level and the source-localized dipoles extracted from these two neural pathways.
In Experiment 1, participants viewed images from two superordinate categories, each in turn consisting of two basic categories (animate: bird and insect; inanimate: tool and graspable object). Electrode-level MVPA results from Experiment 1 showed that objects could be successfully classified across different hierarchical levels of categorization. Above-chance classification was observed when testing and training sets were organized into superordinate level categories (animate vs. inanimate) and basic level categories (bird vs. insect vs. tool vs. graspable object) as well as individual exemplars. Onset latencies were similar across these conditions, whereas the peak latencies revealed a distinct temporal trend; exemplar decoding showed the earliest peak, followed by the basic and superordinate categories, respectively. These temporal patterns have been observed before (Contini et al., 2017; Cichy et al., 2014; Carlson et al., 2013) and have implications regarding the neural bases of object perception. In particular, these data suggest that object representations in the brain evolve from lower to increasingly higher levels of abstraction (exemplar → basic → superordinate). This is consistent with an object perception framework in which categorization can be understood as evidence accumulation over time (Mack & Palmeri, 2011; Philiastides & Sajda, 2006). This conclusion is further corroborated by a time generalization analysis showing that the neural representations underlying the object categories were not stable but rather evolved dynamically over the course of the epoch.
Whereas some low-level features (e.g., luminance) were controlled in Experiment 1, other potentially confounding variables were not. Specifically, the stimulus set contained shape confounds that were especially pronounced between tools and graspable objects (elongation along the principal axis). To remedy this issue, a follow-up Experiment 2 was performed in which tools and graspable objects were composed of both elongated and stubby exemplars. This change allowed us to examine the contribution of shape to the neural representation of object categories. Data from Experiment 2 suggest that information pertaining to both category and shape is present in the EEG signal. This is evidenced by successful classification for tools versus graspable objects even when restricting the analysis to only elongated or only stubby exemplars.
We used source localization to explore the spatio-temporal dynamics of visual object processing within two broadly defined neural regions: dorsal and ventral cortex. In Experiment 1, without controlling for elongation, we observed above-chance classification for tools versus graspable objects in both dorsal and ventral stream ROIs. The time course of classifier performance was similar across the two pathways, with the exception of a dorsal stream advantage (higher classification accuracy) from approximately 100 to 130 msec. A similar pattern was observed in Experiment 2 when the classifier was trained to discriminate between elongated tools and stubby graspable objects (a replication of the conditions in Experiment 1). Restricting the analysis to all elongated versus all stubby objects (irrespective of toolness) again produced a qualitatively similar time course in both dorsal and ventral ROIs. Furthermore, performance for shape (elongated vs. stubby) was overall more robust compared to category (tool vs. graspable object), partially validating concerns regarding shape as a potential confound. There has been a long-standing debate in the literature about whether the organizing principles of category-selective neural regions are best described by shape and low-level properties (Nasr, Echavarria, & Tootell, 2014; Rice et al., 2014; Watson et al., 2014; Yue, Pourladian, Tootell, & Ungerleider, 2014; Baldassi et al., 2013; Rajimehr, Devaney, Bilenko, Young, & Tootell, 2011) or category membership (Kriegeskorte et al., 2008; Kiani, Esteky, Mirpour, & Tanaka, 2007). Bracci and Op de Beeck (2016) addressed this issue by generating a two-factorial stimulus set in which the contribution of category and shape could be dissociated in humans. Their results suggest that shape and category are coded independently but nevertheless interact in important ways throughout the visual hierarchy. 
Our findings are in line with those of Bracci and Op de Beeck (2016), showing that successful decoding of tools versus graspable objects can be achieved at the electrode level, as well as in the source-localized data, even when shape has been controlled for. Grootswagers, Robinson, Shatek, and Carlson (2019) reached a similar conclusion upon measuring EEG responses to intact objects from different conceptual categories as well as “textform” versions of each image that were rendered unrecognizable while preserving numerous midlevel features. Their results demonstrate that the statistical regularities maintained in the scrambled images were indeed sufficient for above-chance decoding of animacy. However, classifier performance for intact images was significantly more robust, suggesting that featural confounds contribute to but cannot fully account for the classification of conceptual categories (see Long, Yu, & Konkle, 2018, for similar findings using fMRI).
We also observed that classification performance was generally better when the tool category consisted of elongated (rather than stubby) images. This may reflect the fact that tools tend to be elongated in real life, and hence these images may constitute better exemplars of the tool category. Previous studies have reported the existence of regions in the dorsal pathway that show selectivity for elongated objects, irrespective of their semantic category (Sakata et al., 1998). The existence of such regions may partially explain why classification performance was elevated for elongated versus stubby exemplars, especially in the dorsal pathway. Chen et al. (2018) also investigated the relationship between elongation and toolness using fMRI. They reported that both toolness and elongation are processed by both separate and common regions and that elongated tools (but not stubby tools) facilitate reciprocal connectivity between the ventral and dorsal regions.
One unexpected finding revealed by our analyses was the striking similarity between the dorsal and ventral pathways, both in classifier performance and in the set of object categories that could be successfully decoded. The results of Experiment 1 suggest that information in the dorsal pathway is not restricted to inanimate objects with affordances (tools and graspable objects) but also extends to animate object categories (birds and insects). This result supports and extends previous findings of object selectivity in the dorsal pathway (Freud et al., 2017; Jeong & Xu, 2016; Zachariou et al., 2014; Konen & Kastner, 2008; Valyear et al., 2007; Fang & He, 2005; Chao & Martin, 2000; Sereno & Maunsell, 1998). Furthermore, in addition to evidence of the dorsal pathway exhibiting properties similar to the ventral pathway (Konen & Kastner, 2008), it has been shown that the ventral pathway contains spatial information (Hong, Yamins, Majaj, & DiCarlo, 2016; Zoccolan, Kouh, Poggio, & DiCarlo, 2007) and can be modulated by motor attributes (Gallivan, Chapman, McLean, Flanagan, & Culham, 2013; Astafiev, Stanley, Shulman, & Corbetta, 2004). In light of such findings, the utility of the dorsal–ventral dichotomy has been questioned, with some suggesting the existence of additional pathways (Haak & Beckmann, 2018) and others arguing for a “patchwork” model defined by multiple interacting neural regions (de Haan & Cowey, 2011). At the very least, such findings urge a reassessment of the two-pathway hypothesis and, as it relates to this study, the role of the dorsal pathway in visual object perception (Freud, Behrmann, & Snow, 2020; Erlikhman, Caplovitz, Gurariy, Medina, & Snow, 2018; Freud et al., 2016).
Interesting possibilities regarding dorsal involvement in visual object perception are bolstered by the anatomical connectivity (Takemura et al., 2016; Cloutman, 2013) and cross-talk (Hutchison & Gallivan, 2018; Janssen, Verhoef, & Premereur, 2018; de Haan & Cowey, 2011; Schenk & McIntosh, 2010) between the two pathways as well as the finding that neural signals propagate faster through the dorsal relative to the ventral pathway (Sim, Helbig, Graf, & Kiefer, 2015; Srivastava, Orban, De Mazière, & Janssen, 2009; Norman, 2002). Together, these functional and anatomical properties give rise to the possibility that object representations computed first in the dorsal pathway may be capable of priming or otherwise modulating object-related computations in the ventral pathway via feedback. For example, Sim et al. (2015) found that priming the dorsal pathway with movie clips of tool use modulated ventral activity related to the recognition of tools. Additional evidence of dorsal pathway involvement in object perception comes from neuropsychology. Several studies have documented that the behavioral deficits associated with dorsal lesions are not always restricted to visuomotor interactions but can also perturb certain perceptual abilities such as global shape perception and 3-D processing (Van Dromme, Premereur, Verhoef, Vanduffel, & Janssen, 2016; Gillebert et al., 2015; Lestou, Lam, Humphreys, Kourtzi, & Humphreys, 2014). In fact, under some circumstances, agnosic patients with a spared dorsal pathway can outperform patients with the opposite pattern of damage. For example, agnosic patients with ventral damage (but intact dorsal pathways) were able to identify the presence of an object defined by disparity, whereas patients with dorsal damage were unable to detect such objects at all (Vaina, 1989).
The involvement of the dorsal pathway in 3-D shape perception has also been documented using neuroimaging in humans (Freud et al., 2017; Katsuyama, Usui, Nose, & Taira, 2011; Georgieva, Peeters, Kolster, Todd, & Orban, 2009; Vanduffel et al., 2002) in addition to monkey physiology (Alizadeh, van Dromme, Verhoef, & Janssen, 2018; Janssen, Srivastava, Ombelet, & Orban, 2008).
However, several studies (including this one) have found that computations in the dorsal pathway may extend beyond the processing of 3-D shape. Indeed, evidence exists that dorsal activity can be elicited from 2-D images, even during passive viewing (Zachariou et al., 2014; Konen & Kastner, 2008; Chao & Martin, 2000; Sereno & Maunsell, 1998). An influential article by Konen and Kastner (2008) showed that, in addition to selectivity for 2-D objects, the dorsal pathway shows evidence of size, retinal position, and viewpoint invariance—properties typically associated with ventral stream computations (Sawamura et al., 2005; Vuilleumier et al., 2002; Kourtzi & Kanwisher, 2000; Grill-Spector et al., 1998). Other findings suggest that the dorsal pathway might facilitate perceptual tasks that require configural processing. Zachariou, Nikas, Safiullah, Gotts, and Ungerleider (2017) used fMRI to show that configural versus featural face differences elicited greater activity in dorsal regions and correlated with behavioral performance on configural tasks whereas the deactivation of the posterior parietal cortex using TMS adversely affected behavior on such tasks. Other findings also suggest that the dorsal pathway may be involved in aspects of perceptual organization occurring over space (Xu & Chun, 2007) as well as time (Erlikhman & Caplovitz, 2017; Erlikhman et al., 2016; McCarthy, Kohler, Tse, & Caplovitz, 2015).
An important caveat to the above discussion is that the dorsal pathway is not a single, functionally homogeneous neural region but rather is composed of multiple neural regions, each defined by its own functional profile. Research suggests that the functional–anatomical properties of the dorsal pathway are best captured by a posterior-to-anterior gradient. Along this gradient, neural signals appear to be transformed from perceptual representations to those in service of motoric interactions. In the posterior region of the dorsal pathway, neural populations show selectivity toward object shape, 3-D cues, and global form (Freud et al., 2017; Erlikhman et al., 2016; Tsutsui, Jiang, Yara, Sakata, & Taira, 2001) as well as ventral-like properties, such as invariance to low-level transformations (Konen & Kastner, 2008). Conversely, anterior regions feed into motor cortex and exhibit corresponding sensitivity to hand position, motor plans, and other computations geared toward visuomotor interactions (Stark & Zohary, 2008; Shmuelof & Zohary, 2005). This gradient can help explain the range of disparate findings in the literature as they relate to the functional role of the dorsal pathway.
A number of limitations that may affect the interpretation of our data should be discussed. First, source localization methods applied to M/EEG data constitute an indirect estimation of the underlying neural signal. These methods, although generally considered reliable, lack the spatial precision of fMRI and should be interpreted with some caution. To mitigate some of the uncertainties pertaining to source localization, we opted for a conservative approach in choosing ROIs. Our dorsal and ventral ROIs covered a substantial swathe of cortex. Although this approach minimizes the possibility of false claims regarding activity in specific neural loci, it also precludes a more refined analysis regarding the functional properties of different regions that make up the dorsal and ventral pathways.
Another issue relates to the stimulus sets used for both experiments. The stimuli used in Experiment 1 were manipulated to control luminance and spatial frequency differences; however, shape confounds still existed across object categories. This was especially true for the inanimate object categories where systematic differences in elongation can be seen between tools and graspable objects; thus, interpretation of results should be undertaken with these confounds in mind. Experiment 2 addressed this issue by subdividing each basic category into an equal number of elongated and stubby exemplars, thereby controlling for global shape differences; however, this experiment used regular gray-scale images and did not control for other low-level confounds. This decision was made in light of evidence suggesting that the dorsal pathway may be sensitive to texture and shading cues (Freud et al., 2016). However, this makes it harder to rule out other low-level confounds in the data. Finally, objects were not matched for familiarity. This control may be especially germane to the selection of tool exemplars given that membership into this category is defined by a stereotypic motor plan that is instantiated through experience with tool-like objects. Future experiments may do a better job controlling for the myriad confounds such as low-level features, real-world size, and familiarity that lurk between object categories. However, controlling for every confound in a single study is often unrealistic, and all studies must face trade-offs between controls and image integrity. Still, some promising approaches exist that can help researchers mitigate the impact of low-level confounds. For example, Weisberg et al. (2007) presented participants with novel, artificially created objects. Over multiple sessions, participants used these objects to perform specific functions, thereby facilitating a transition from object to tool via the acquisition of a specific motor plan. Comparing neural activity before and after training serves as a control for low-level confounds and ensures that the demarcation between tool and graspable object maps on to the participant’s experience.
In summary, our study used MVPA on HD-EEG time series data in conjunction with source localization to explore the spatio-temporal dynamics of object processing in the human brain. Our results suggest the following: (1) Successful classification of object categories was observed across different levels of abstraction, including exemplar, basic (bird, insect, tool, graspable object), and superordinate (animacy); (2) shape and category information are both represented throughout the ventral and dorsal pathways; (3) successful classification of object categories in the dorsal pathway was not restricted to inanimate categories (tools and graspable nontools) but also included animate categories (birds and insects); and (4) overall, prominent similarities were observed between the two pathways, both with regard to temporal dynamics as well as the neural representations decoded by the MVPA classifier.
Understanding the neural basis of object processing remains an important challenge. The methodological approach used in this study offers a promising avenue for better understanding the neural dynamics of object perception via the combination of spatial and temporal analyses. Furthermore, the successful decoding of animate object categories in the dorsal pathway as well as the general similarities observed between the two pathways raises intriguing questions regarding the role that the dorsal stream plays in visual perception. Although these findings do not undermine the substantial body of work that shows that, broadly speaking, each pathway is specialized toward differing behavioral goals, the notion of functional independence must be reevaluated. Future work is needed to further elucidate the computational processes of the dorsal pathway and how these processes modulate visual object perception.
Reprint requests should be sent to Gennadiy Gurariy, Biomedical Engineering, Medical College of Wisconsin, or via e-mail: firstname.lastname@example.org.
Gennadiy Gurariy: Conceptualization; Data curation; Formal analysis; Methodology; Visualization; Writing—Original draft; Writing—Review & editing. Ryan E. B. Mruczek: Conceptualization; Methodology; Writing—Review & editing. Jacqueline C. Snow: Conceptualization; Methodology; Writing—Review & editing. Gideon P. Caplovitz: Conceptualization; Funding acquisition; Methodology; Project administration; Resources; Supervision; Writing—Review & editing.
National Science Foundation (NSF) (https://dx.doi.org/10.13039/100000001), grant number: EPSCoR Research Infrastructure Awards 1632849 (G. P. C. & J. S.) and 1632738 (G. P. C.); National Institute of General Medical Sciences of the National Institutes of Health (NIH), grant number: P20 GM103650; National Eye Institute of Health, award number: R01EY026701 (J. S.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or the NSF.
Diversity in Citation Practices
Retrospective analysis of the citations in every article published in this journal from 2010 to 2021 reveals a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .407, W(oman)/M = .32, M/W = .115, and W/W = .159, the comparable proportions for the articles that these authorship teams cited were M/M = .549, W/M = .257, M/W = .109, and W/W = .085 (Postle and Fulvio, JoCN, 34:1, pp. 1–3). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance.