Abstract

A hallmark of human cognition is the ability to rapidly assign meaning to sensory stimuli. It has been suggested that this fast visual object categorization ability is accomplished by a feedforward processing hierarchy consisting of shape-selective neurons in occipito-temporal cortex that feed into task circuits in frontal cortex computing conceptual category membership. We performed an EEG rapid adaptation study to test this hypothesis. Participants were trained to categorize novel stimuli generated with a morphing system that precisely controlled both stimulus shape and category membership. We subsequently performed EEG recordings while participants performed a category matching task on pairs of successively presented stimuli. We used space–time cluster analysis to identify channels and latencies exhibiting selective neural responses. Neural signals before 200 msec on posterior channels demonstrated a release from adaptation for shape changes, irrespective of category membership, compatible with a shape- but not explicitly category-selective neural representation. A subsequent cluster with anterior topography appeared after 200 msec and exhibited release from adaptation consistent with explicit categorization. These signals were subsequently modulated by perceptual uncertainty starting around 300 msec. The degree of category selectivity of the anterior signals was strongly predictive of behavioral performance. We also observed a posterior category-selective signal after 300 msec exhibiting significant functional connectivity with the initial anterior category-selective signal. In summary, our study supports the proposition that perceptual categorization is accomplished by the brain within a quarter second through a largely feedforward process culminating in frontal areas, followed by later category-selective signals in posterior regions.

INTRODUCTION

The primate sensory system assigns meaning to visual images with swift proficiency. This ability is mediated by a “simple-to-complex” hierarchy of brain areas, the so-called ventral visual stream (Kravitz, Saleem, Baker, Ungerleider, & Mishkin, 2012), in which the complexity of neurons' preferred features gradually increases along a hierarchy. This processing stream begins with selectivity to oriented edges in primary visual cortex (V1), extends along intermediate visual areas V2 and V4, and culminates in lateral occipital and ventral temporal regions that exhibit selectivity for complex natural objects such as faces, cars, and words. The stimulus representations in visual cortex are then hypothesized to provide input to task-relevant circuits in pFC. Monkey electrophysiology (Freedman, Riesenhuber, Poggio, & Miller, 2003; Op de Beeck, Wagemans, & Vogels, 2001; Thomas, Van Hulle, & Vogels, 2001) as well as human fMRI studies (Jiang et al., 2007) have provided support for a two-stage model of perceptual category learning (Panis, Vangeneugden, Op de Beeck, & Wagemans, 2008; Thomas et al., 2001; Riesenhuber & Poggio, 2000; Ashby & Lee, 1991; Nosofsky, 1986) involving a perceptual learning stage in extrastriate visual cortex in which neurons come to acquire sharper tuning with a concomitant higher degree of selectivity for the training stimuli. These stimulus-selective neurons provide input to task modules located in higher cortical areas, such as pFC, mediating behavioral processes such as the identification, discrimination, or categorization of stimuli. A computationally appealing property of this hierarchical model is that the high-level perceptual representation in visual cortex can be used in support of other tasks involving the same stimuli (Riesenhuber & Poggio, 2002), permitting transfer of learning to novel tasks.

To probe the neural bases of perceptual categorization, it is crucial to distinguish the physical shape of a stimulus from its conceptual attributes. Note that “categorization” here refers to the assignment of a semantic label that is abstracted from the physical shapes of the category members. This capability requires that even physically similar stimuli (such as apple and tennis ball) can be assigned to different categories, whereas physically dissimilar stimuli (such as apple and banana) can belong to the same category (“fruit”). Populations of neurons computing the conceptual attributes of a stimulus are expected to respond similarly to physically dissimilar stimuli from the same category and to respond dissimilarly to physically similar stimuli from different categories, whereas populations of neurons selective for physical shape differences lack the sharp response transition at the category boundary and generalization within each category that are the hallmarks of perceptual categorization.

This dissociation between the shape and conceptual attributes of a stimulus can be difficult to achieve with natural categories (Gotts, Milleville, Bellgowan, & Martin, 2011), as stimuli in one category are commonly visually more similar to each other than to images from another category (e.g., faces vs. places). This difficulty has motivated the use of morphed stimulus spaces (Van der Linden, Van Turennout, & Indefrey, 2010; Gillebert, Op de Beeck, Panis, & Wagemans, 2009; Freedman et al., 2003; Freedman, Riesenhuber, Poggio, & Miller, 2001; Op de Beeck et al., 2001) that make it possible to precisely define categories and to control shape similarity within and across categories. Studies using such morphed spaces have revealed that, although categorization training sharpened the selectivity of neurons in temporal cortex for trained stimuli (Van der Linden et al., 2010; Gillebert et al., 2009; Jiang et al., 2007; Freedman, Riesenhuber, Poggio, & Miller, 2006; Freedman et al., 2003; Thomas et al., 2001), these neurons appeared to respond to physical shape similarity only, rather than to conceptual category membership. Instead, neurons selective for the conceptual categories were observed in pFC (Jiang et al., 2007; Freedman et al., 2001; see also Gotts et al., 2011).

However, the temporal dynamics of how the human brain translates physical stimuli into conceptual labels are less clear. In particular, recordings from monkey IT and pFC in the same animals showed that neurons in IT responded to stimulus shape with a mean latency of approximately 100 msec whereas neurons in pFC responded to more abstract parameters, including stimulus category membership, with a mean latency of nearly 200 msec (Freedman et al., 2003). In humans, surface intracranial field potentials indicated that object-selective information accumulated as early as 100 msec poststimulus onset in temporal cortex (Liu, Agam, Madsen, & Kreiman, 2009), but no recordings have probed the representation of conceptual category information and its latency in frontal cortex. Complicating the picture, some human fMRI studies (Van der Linden et al., 2010; Li, Ostwald, Giese, & Kourtzi, 2007) have reported posterior category selectivity, along with frontal selectivity. However, fMRI methods have poor temporal resolution because they only capture hemodynamic responses that integrate over thousands of milliseconds, an order of magnitude slower than the underlying cognitive processes. This low temporal resolution makes fMRI poorly suited to investigate the nuanced temporal dynamics of conceptual categorization, for instance, to address whether posterior categorical information manifests in the first feedforward processing pass following stimulus onset or is driven by top–down modulations that are invoked only after the first feedforward pass of information processing is complete.

In contrast, EEG combines whole-brain coverage with millisecond-level temporal resolution, making it possible to precisely study the time course of the neural signals underlying visual categorization in humans. Several potentially relevant signal components have been identified in prior EEG studies of human object recognition. The N1 (also referred to as the N170, particularly in the context of face stimuli), the first negative-going ERP observed on posterior electrodes, has been linked to both stimulus detection (Bentin, Allison, Puce, Perez, & McCarthy, 1996) and individual stimulus encoding (Jacques & Rossion, 2006) in the domain of faces as well as objects of expertise in general (Scott, Tanaka, Sheinberg, & Curran, 2006, 2008). A substantial literature using EEG has shown that real world expertise as well as training on specific object recognition tasks lead to a selectively enhanced N1 (e.g., Scott et al., 2006, 2008; Rossion, Gauthier, Goffaux, Tarr, & Crommelinck, 2002; Tanaka & Curran, 2001). Given the lateral occipital origins of the N1 (Schweinberger, Pickering, Jentzsch, Burton, & Kaufmann, 2002; Shibata et al., 2002), this ERP response component is a good candidate for the manifestation of shape-selective neural responses predicted by the two-stage model of visual object categorization.

It is less clear which EEG components might correspond to explicit categorization. Following the N1 response, candidate ERP response components include the posterior N250 and the more anterior P300. However, the posterior topography of the N250 (Scott et al., 2006, 2008) is inconsistent with the anterior neural source of concept-selective signals observed in recent monkey physiology and human neuroimaging studies, and the long latency of the P300 (Polich & Margala, 1997) is inconsistent with the fast behavioral RTs observed for explicit categorization of single stimuli (Fabre-Thorpe, 2011).

A major challenge in probing the temporal dynamics of human object categorization with EEG is that the magnitude of neural responses alone is insufficient to isolate whether a population of neurons responds to the shape of the stimulus or the concept it represents. To disentangle shape and concept-selective signals, we used EEG-rapid adaptation (RA) techniques. Although previous EEG studies have exploited adaptation effects to probe the selectivity of neuronal representations underlying particular signal components, for example, the N1 (Zimmer & Kovács, 2011; Caharel, d'Arripe, Ramon, Jacques, & Rossion, 2009), RA techniques have been mostly used in fMRI (but see, e.g., Vizioli, Rousselet, & Caldara, 2010; Heisz, Watter, & Shedden, 2006). The RA approach is motivated by findings from IT monkey electrophysiology experiments reporting that the second presentation of a stimulus (within a short time period) evokes a smaller neural response than the first (Miller, Gochin, & Gross, 1993). It has been shown that this adaptation can be measured using fMRI and that the degree of adaptation depends on stimulus similarity in line with recent monkey electrophysiology results (De Baene & Vogels, 2010), with repetitions of the same stimulus causing the greatest suppression and dissimilar stimuli causing progressively less adaptation. We (Jiang et al., 2006, 2007) as well as others (Panis et al., 2008; Fang, Murray, & He, 2007; Gilaie-Dotan & Malach, 2007; Murray & Wojciulik, 2004) have provided evidence that parametric variations in visual object parameters (shape, orientation, or viewpoint) are reflected in systematic modulations of the fMRI-RA signal and can thus be used as an indirect measure of neural population tuning (Grill-Spector, Henson, & Martin, 2006). In fMRI-RA, two stimuli are presented in rapid succession in each trial, with the similarity between the two stimuli in each trial varied to investigate neuronal tuning along the dimensions of interest. 
We applied the same technique to EEG with the goal of probing the selectivity of shape and conceptual category-selective representations in the human brain and how they are dynamically engaged in participants trained on a novel categorization task over morphed shapes.

METHODS

Participants

Twenty individuals took part in the study. One participant was excluded from the group analysis because of an inability to learn the task (behavioral performance more than 2 standard deviations below the mean), resulting in a sample of 19 participants (18 right-handed, 11 men, mean age = 24.0 years, range = 18–33 years). Georgetown University's Institutional Review Board approved the experimental procedures, and written informed consent was obtained from all participants before the experiment.

Stimuli

Participants were trained on a visual categorization task involving car stimuli generated by a morphing system that was capable of finely and parametrically manipulating stimulus shape (Shelton, 2000). By blending different amounts of four car prototypes, we could generate thousands of unique images, continuously vary shape, and precisely define category boundaries (Figure 1A and B). The category of each stimulus was defined by whichever category's prototypes contributed more (>50%) to a given morph (Jiang et al., 2007; Freedman et al., 2003). Thus, two sample stimuli could be similar yet situated on opposite sides of the category boundary, whereas stimuli that belonged to the same category could be dissimilar. This careful control of physical similarity within and across categories allowed us to disentangle the neural signals explicitly representing category membership from neural signals related to physical stimulus shape. The grayscale car images were presented on a white background for training and a neutral gray background for testing. Training images varied in size (between 200 and 320 pixels wide) and in resolution to discourage strategies based on individual local image cues. Images composed of blends of three or four prototypes were used for label training and spanned a four-dimensional morph space excluding a corridor of distances less than 5% from the category boundary. For category label testing, 21 images were positioned at 5% increments along each of the four distinct one-to-one prototype morph lines. The results of category testing were used to select four images from each morph line for the EEG testing (see Figure 1B). These four images were positioned at distances of 0%, 33%, 67%, and 100% relative to participants' subjective category boundaries, as determined by each individual's categorization test. 
The morph space extended from −20% to +120% for all morph lines, permitting the extraction of a balanced quadruplet for each morph line even when an individual category boundary deviated slightly from 50%. Thus, for a participant whose perceptual categorization results placed the category boundary for a given morph line at 45%, for instance, the EEG testing stimulus quadruplet for that particular morph line and participant would consist of images from positions −5%, 28%, 62%, and 95%.
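The selection of a boundary-centered quadruplet can be sketched as follows. This is a minimal illustration, assuming that the nominal positions are simply shifted by the offset of the subjective boundary from 50% and rounded to whole percentages; the function name `eeg_quadruplet` is hypothetical and does not describe the original stimulus-generation code.

```python
def eeg_quadruplet(boundary_pct, lo=-20.0, hi=120.0):
    """Return four morph positions (in %) centred on a subjective boundary.

    The nominal positions 0%, 33%, 67%, and 100% assume a boundary at 50%;
    they are shifted by the participant's boundary offset (assumed procedure).
    """
    shift = boundary_pct - 50.0
    nominal = [0.0, 100.0 / 3.0, 200.0 / 3.0, 100.0]
    positions = [round(p + shift) for p in nominal]
    # The morph space is stated to extend from -20% to +120%.
    if positions[0] < lo or positions[-1] > hi:
        raise ValueError("quadruplet falls outside the available morph space")
    return positions


print(eeg_quadruplet(45))  # [-5, 28, 62, 95], the example given in the text
```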

Figure 1. 

Stimuli and behavioral paradigms. (A) Visual stimuli were generated from blends of four prototypes, with blends composed of >50% of two prototypes belonging to the “Sovor” category and blends composed of >50% of the other two prototypes belonging to the “Zupud” category. Arrows indicate “cross-category” morph lines between two prototypes belonging to different categories. (B) Stimuli and conditions for the EEG-RA paradigm, illustrated using an example morph line. Each condition presented a pair of stimuli drawn from cross-category morph lines in increments of thirds. Dotted line reflects category boundary. “Same category” conditions (M0, M3w) consisted of two stimuli from the same category; “Different category” conditions (M3b, M6) consisted of pairs of stimuli from different categories. Shape change between stimuli for the different conditions was small (M0), intermediate (M3w, M3b), or large (M6). (C) Category label training paradigm. (D) EEG-RA paradigm. The participants' task was to judge whether the two cars presented in a trial belonged to the same or different categories.

Training

Participants completed category label training remotely with a Web-based implementation. Participants learned to categorize stimuli into two categories labeled “SOVOR” and “ZUPUD.” A single training trial consisted of a test stimulus presented for 400 msec, followed by a 300-msec mask, followed by both category labels randomly positioned on the left and right sides of the screen for each trial (illustrated in Figure 1C) to avoid spatial associations with the category labels. Participants had up to 3 sec to indicate the correct label of the test stimulus with the left or right arrow key. Incorrect responses elicited auditory feedback along with a display containing the correct label adjacent to the test stimulus that participants viewed as long as desired. The difficulty of the categorization task was increased by introducing morphs with increasingly greater contributions from the other category until participants could reliably (performance >80%) identify the category membership of randomly chosen images composed of up to 40% of the other category (similar to Jiang et al., 2007).

Categorization Testing

Following the completion of Web-based training, participants' categorization skills were tested in the lab. Trial timing and presentation arrangement were the same as during training but differed in that no feedback was provided and stimulus presentation was driven by the Psychtoolbox package in Matlab (Brainard, 1997). To test participants' ability to generalize, the stimuli used for testing (based on blends of two prototypes belonging to different categories, see above) were distinct from the stimuli used for training (based on blends of three to four prototypes), as in Jiang et al. (2007). Ten repetitions of each stimulus were presented during testing in randomized order. To identify the position of participants' category boundary along each morph line, categorization performance was fit on an individual participant basis for each morph line with a sigmoid function f(x), with x being the morph position variable, and parameters for the location of the category boundary (β), the rate of the change across the boundary (t), and the upper (a) and lower (c) limits of performance, as in Jiang et al. (2007),
formula
The parameter estimate for the category boundary was used to generate subject-specific stimuli used in subsequent EEG testing on a subject- and morph line-specific basis.
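The boundary estimation can be illustrated with a small sketch. The exact equation is not reproduced in this text, so the four-parameter logistic below is an assumption: one common form that is consistent with the parameter definitions given above (boundary β, rate parameter t, upper limit a, lower limit c). The coarse grid search and the names `sigmoid` and `fit_boundary` are likewise illustrative only, and the numerical scale of t depends on the parameterization chosen.

```python
import math


def sigmoid(x, beta, t, a=1.0, c=0.0):
    """Assumed logistic form: lower limit c, upper limit a, boundary beta, scale t."""
    return c + (a - c) / (1.0 + math.exp(-(x - beta) / t))


def fit_boundary(positions, proportions, a=1.0, c=0.0):
    """Least-squares grid search over beta (20-80%) and t (0.5-20), with a and c fixed."""
    best = (float("inf"), None, None)
    for beta in [b * 0.5 for b in range(40, 161)]:
        for t in [s * 0.5 for s in range(1, 41)]:
            sse = sum((sigmoid(x, beta, t, a, c) - p) ** 2
                      for x, p in zip(positions, proportions))
            if sse < best[0]:
                best = (sse, beta, t)
    return best[1], best[2]


# Hypothetical, noise-free responses at the 21 test positions (5% increments).
positions = list(range(0, 101, 5))
simulated = [sigmoid(x, 45.0, 4.0) for x in positions]
beta_hat, t_hat = fit_boundary(positions, simulated)
print(beta_hat, t_hat)
```

In practice, a gradient-based nonlinear least-squares routine that also estimates a and c would normally replace the grid search; the grid is used here only to keep the sketch dependency-free.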

Electrophysiological Recordings

Scalp voltages were measured using an Electrical Geodesics (EGI, Eugene, OR) 128-channel Hydrocel geodesic sensor net and Net Amps 300 amplifier. Incoming data were digitally low-pass filtered at 200 Hz and sampled at 500 Hz using common mode rejection with vertex reference. Impedances were set below 40 kΩ before recording began and maintained below this threshold throughout the recording session with an impedance check during each break between blocks. During the experiment, participants performed a categorization task on sequentially presented pairs of stimuli (Figure 1D). As in our previous fMRI-RA study (Jiang et al., 2007), stimulus pairs were drawn from individual morph lines according to one of four conditions, with each condition appearing with equal probability: M0, corresponding to the presentation of the same stimulus twice; M3-within (hereafter abbreviated as M3w), corresponding to the presentation of two stimuli differing by a 33% shape change along the morph line from the same category (i.e., 0% and 33%; or 67% and 100% positions along the morph line relative to participants' category boundary); M3-between (hereafter abbreviated as M3b), corresponding to the presentation of two stimuli from different categories differing by 33% (i.e., 33% and 67% positions along the morph line); and M6, corresponding to the presentation of two stimuli differing by a 67% shape change from different categories, as illustrated in Figure 1B. The M0, M3w, M3b, and M6 conditions occurred with equal probability and frequency in randomized order. Participants responded with their right index or middle finger to indicate whether the pair of images belonged to the same, or different, categories. In a fifth trial type, only one car was presented; participants did not make a response in those trials. The results of these trials are not considered further in this article. 
Trial timing was 500 msec fixation, blank screen for 500–1000 msec, stimulus 1 for 200 msec, blank for 200 msec, stimulus 2 for 200 msec, blank until participant response or 2300 msec had passed (see Figure 1D). Participants completed five blocks of 240 trials or eight blocks of 180 trials, for a total of 1200 or 1280 trials per participant with equal numbers of trials per condition in all cases. Breaks between blocks of trials were used to maintain impedances below threshold of 40 kΩ.
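The four conditions can be expressed compactly in terms of the positions of the two stimuli within a morph-line quadruplet. The sketch below assumes that stimuli are indexed 0 to 3 (corresponding to the 0%, 33%, 67%, and 100% positions, with the category boundary between indices 1 and 2) and that presentation order within a pair does not affect the label; the function name `pair_condition` is hypothetical.

```python
def pair_condition(i, j):
    """Label a stimulus pair given quadruplet indices 0..3 (0%, 33%, 67%, 100%)."""
    step = abs(i - j)                    # shape change in thirds of the morph line
    same_category = (i < 2) == (j < 2)   # boundary lies between indices 1 and 2
    if step == 0:
        return "M0"                      # identical stimulus shown twice
    if step == 1:
        return "M3w" if same_category else "M3b"
    if step == 2:
        return "M6"                      # a 67% change always crosses the boundary
    return None                          # 100% change: not among the four conditions


for pair in [(0, 0), (0, 1), (1, 2), (0, 2), (1, 3)]:
    print(pair, pair_condition(*pair))
```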

Data Analysis

Data processing and statistical analyses were performed using EEGLAB (Delorme & Makeig, 2004), FieldTrip (Oostenveld, Fries, Maris, & Schoffelen, 2011) versions 20120204 and 20121212, and custom scripts in Matlab 7.10.0 (R2010a). Data were high-pass filtered at 0.1 Hz and low-pass filtered at 30 Hz using two-way least-squares FIR filtering in EEGLAB, epoched on the interval [−600 800] msec relative to Stimulus 2 onset for ERP visualization and truncated to the time interval [−200 400] msec relative to Stimulus 2 onset for statistical analyses, and baselined on the interval [−200 0] msec relative to stimulus onset to compare responses to the second stimulus across conditions. Bad channels were identified by visual inspection and replaced by the average of their neighbors through interpolation (Oostenveld et al., 2011). Trials containing artifacts or blinks were rejected if the recorded signal changed by more than 75 μV within a trial on four vertical EOG channels, as in Scott et al. (2008). ERPs reflect trials for which participants responded correctly.
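Two of the preprocessing steps described above, baseline correction and amplitude-based trial rejection, can be sketched in a few lines. This is not the EEGLAB or FieldTrip implementation. It assumes a single-channel epoch stored as a list of voltages in μV, and it reads "changed more than 75 μV within a trial" as a peak-to-peak criterion, which is an interpretation rather than a detail stated in the text. Function names are hypothetical.

```python
def baseline_correct(epoch, times, window=(-200, 0)):
    """Subtract the mean voltage in the baseline window from every sample."""
    baseline = [v for v, t in zip(epoch, times) if window[0] <= t < window[1]]
    offset = sum(baseline) / len(baseline)
    return [v - offset for v in epoch]


def exceeds_threshold(eog_epoch, threshold_uv=75.0):
    """Flag a trial whose EOG signal spans more than threshold_uv (peak to peak)."""
    return max(eog_epoch) - min(eog_epoch) > threshold_uv


# At a 500-Hz sampling rate, samples are 2 msec apart.
times = list(range(-200, 400, 2))
epoch = [5.0] * len(times)               # hypothetical constant 5-uV offset
corrected = baseline_correct(epoch, times)
print(max(corrected), min(corrected))
```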

EEG signals were analyzed for stimulus selectivity using cluster-based permutation testing (Maris & Oostenveld, 2007). This approach permits the consideration of all the EEG data without imposing a priori constraints regarding which channels and time points reflect experimental manipulations while controlling for multiple comparisons. The space–time clusters were identified by subjecting every (channel, timepoint) pair between two conditions to a paired two-tailed t test to identify points where the conditions differed at p < αthresh before correcting for multiple comparisons. These points were then grouped into space–time clusters based on both temporal and spatial adjacency. Two points were considered temporally adjacent if they occurred at subsequent time points; spatial neighbors were constrained by triangulation, resulting in an average of 7.5 neighbors per channel (minimum 5, maximum 10 neighbors). For each cluster, a single statistic was extracted based on the sum of all t values in the cluster. The significance of each cluster was computed by recalculating each cluster statistic for 10⁴ random partitions of the trial conditions. The overall significance of each cluster was calculated using the proportion of permutations for which the resulting cluster statistic was greater than the statistic calculated with the correct labels, resulting in a probability measure controlled for Type I error (Maris & Oostenveld, 2007).
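The logic of the cluster-based permutation test can be illustrated with a deliberately simplified sketch. It is not the FieldTrip implementation used in the study: it operates on a single channel, so clusters are formed by temporal adjacency only; it permutes condition labels within participants by randomly flipping the sign of each participant's condition difference; and the t threshold is supplied directly (2.101 is the two-tailed critical value at α = .05 for 18 degrees of freedom). All function names and the simulated data are hypothetical.

```python
import math
import random


def paired_t(diffs):
    """t statistic for per-participant condition differences at one time point."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return 0.0 if var == 0 else mean / math.sqrt(var / n)  # degenerate case -> 0


def find_clusters(tvals, t_thresh):
    """Group adjacent supra-threshold samples of the same sign: (start, end, sum_t)."""
    found, start = [], None
    for i, t in enumerate(tvals + [0.0]):  # sentinel closes a trailing cluster
        inside = abs(t) > t_thresh
        if start is not None and (not inside or (t > 0) != (tvals[start] > 0)):
            found.append((start, i - 1, sum(tvals[start:i])))
            start = None
        if inside and start is None:
            start = i
    return found


def cluster_permutation_test(diffs, t_thresh, n_perm=1000, seed=0):
    """diffs[s][k]: condition difference for participant s at time sample k."""
    n_time = len(diffs[0])

    def t_map(rows):
        return [paired_t([r[k] for r in rows]) for k in range(n_time)]

    observed = find_clusters(t_map(diffs), t_thresh)
    rng = random.Random(seed)
    null_max = []
    for _ in range(n_perm):
        # Swapping condition labels within a participant flips the sign of the difference.
        flipped = []
        for row in diffs:
            sign = rng.choice((-1.0, 1.0))
            flipped.append([sign * v for v in row])
        stats = [abs(c[2]) for c in find_clusters(t_map(flipped), t_thresh)]
        null_max.append(max(stats, default=0.0))
    # p value: proportion of permutations whose largest cluster exceeds the observed one.
    return [(s, e, stat, sum(m > abs(stat) for m in null_max) / n_perm)
            for s, e, stat in observed]


# Simulated example: 19 participants, 30 samples, an effect in samples 10-19.
rng = random.Random(1)
data = [[rng.gauss(0.0, 1.0) + (2.0 if 10 <= k < 20 else 0.0) for k in range(30)]
        for _ in range(19)]
for start, end, stat, p in cluster_permutation_test(data, t_thresh=2.101, n_perm=500):
    print(start, end, round(stat, 2), p)
```

Extending this sketch to the full space–time case would require a channel-neighborhood structure, so that supra-threshold points are also merged across adjacent channels.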

Stimulus-selective space–time clusters over all 128 channels and selected time windows were identified by contrasting the M0 and M6 conditions to identify clusters that broadly showed adaptation to the stimuli used. For these primary contrasts, the two M3 conditions (M3w and M3b) were not involved in the cluster identification process, making it possible to independently assess the shape or conceptual category-tuning of the identified clusters by comparing the relative amplitudes of M3w and M3b in the stimulus-selective clusters identified by the M0 versus M6 contrast. Similar to our previous fMRI study (Jiang et al., 2007), we reasoned that signals elicited by populations of neurons responsive to a particular stimulus shape would have equivalent response levels to both M3 conditions, as the stimuli in each pair for the two conditions differed by an equivalent amount of physical shape change. The neuronal activation patterns caused by the second stimulus were therefore expected to have similar degrees of overlap and thus exhibit similar levels of adaptation to the M3w and M3b conditions. Unlike the shape-tuned neurons, populations of neurons explicitly selective for stimulus category (i.e., showing conceptual tuning) were predicted to exhibit different response levels in the two M3 conditions because M3w trials, which contained two stimuli that belonged to the same category, should repeatedly stimulate neurons tuned to the same category, causing adaptation. In contrast, because M3b trials contained two stimuli belonging to different categories, the two stimuli should activate different groups of category-selective neurons and therefore cause a stronger, unadapted response. Secondary, follow-up cluster searches were conducted by contrasting “same category” (M0 and M3w) versus “different category” (M3b and M6) conditions and M3w and M3b conditions. These contrasts served to identify the spatial topography and temporal extent of category selectivity.

We performed two cluster searches based on a priori hypotheses. The first was an unconstrained cluster search on the entire 400-msec time window following the onset of the second stimulus to identify signatures of stimulus-selective processing across the whole brain and processing time window (with the [0 400] msec latency window motivated by the fastest participant's median RT for the M0 condition, which was 429.4 msec). The second was a more temporally focused analysis that targeted the N1 component using a 50-msec time interval from 150 to 200 msec relative to Stimulus 2 onset. Given the lack of universal consensus regarding which channels to include in the analysis of the N1 (e.g., Eimer, Gosling, Nicholas, & Kiss, 2011; Caharel et al., 2009; Scott et al., 2008; Schweinberger et al., 2002), the N1 cluster search was initially conducted over all 128 channels using a cluster-identification threshold of αthresh = 0.05. Follow-up cluster searches were performed with lower αthresh values to probe the specificity of observed effects, as indicated in Results.

Functional connectivity was assessed by a coherence calculation in FieldTrip. Four groups of channels (left and right hemisphere N1, P2, and posterior category-selective clusters) were identified by cluster analysis, and coherence was calculated pairwise for each channel combination between the posterior category-selective cluster and the N1 cluster channels and between the posterior category-selective cluster channels and the P2 cluster channels. To limit the spatial extent of the P2 cluster channels to a size comparable to the extent of the posterior category-selective cluster (10 channels), only channels belonging to the P2 cluster at cluster onset were considered (see Figure 5). To preserve temporal resolution while adequately capturing power changes over time, coherence was computed at frequencies from 2 to 30 Hz in steps of 2 Hz, with the time window at each frequency f set to a duration of 1/f, such that the time window at 2 Hz was 500 msec, at 4 Hz was 250 msec, and so on down to a 33-msec window at 30 Hz. Coherence was calculated in 10-msec steps, averaged over frequency bands from 2 to 30 Hz, and normalized relative to a 200-msec prestimulus baseline to give percent change in coherence over time. Significance was assessed by contrasting each coherence time course with a null hypothesis of 0% coherence change relative to baseline. Significance is marked where two or more consecutive time windows reach p < .01.
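The frequency-dependent window lengths follow directly from the 1/f rule. The short sketch below reproduces the values quoted above; the variable names are illustrative.

```python
# Analysis frequencies from 2 to 30 Hz in steps of 2 Hz.
frequencies_hz = list(range(2, 31, 2))

# One cycle per window: duration in msec is 1000 / f.
window_msec = {f: 1000.0 / f for f in frequencies_hz}

for f in (2, 4, 30):
    print(f, "Hz ->", round(window_msec[f]), "msec")
```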

Correlations with behavior were calculated by extracting the mean signal differences between the M3b and M3w conditions within clusters identified by contrasting the M0 and M6 conditions and calculating Pearson's r between these signal differences and the participants' mean accuracy on the M3w and M3b conditions.
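The brain-behavior correlation amounts to computing Pearson's r across participants between a category-selectivity index (mean M3b amplitude minus mean M3w amplitude within a cluster) and mean accuracy over the two M3 conditions. The sketch below uses a plain implementation of Pearson's r; the function names and the numerical values are hypothetical and serve only to show the computation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def selectivity_index(m3b_amplitudes, m3w_amplitudes):
    """Per-participant M3b minus M3w mean cluster amplitude."""
    return [b - w for b, w in zip(m3b_amplitudes, m3w_amplitudes)]


# Hypothetical values for four participants (cluster amplitudes in uV, accuracy in %).
m3b = [1.2, 0.8, 1.5, 0.9]
m3w = [0.7, 0.7, 0.6, 0.8]
acc_m3w = [82.0, 70.0, 90.0, 74.0]
acc_m3b = [78.0, 66.0, 88.0, 70.0]
mean_accuracy = [(w + b) / 2.0 for w, b in zip(acc_m3w, acc_m3b)]
print(pearson_r(selectivity_index(m3b, m3w), mean_accuracy))
```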

RESULTS

Behavioral Data

Participants completed the Web-based category label training in 4.63 ± 0.84 hr (mean ± SEM). Following training, each participant's categorization ability was tested in the lab, and their performance was fit to a sigmoid function including parameters for the location of the category boundary, β, and slope of the change at the boundary, t (see Methods). The parameter t, which indicates the rate of change across the boundary, was 0.80 ± 0.22 across participants and morph lines, indicating that, on average, for a category boundary positioned at 50%, participants indicated 93% category A for stimuli composed of 60% category A and 7% category A for stimuli composed of 60% category B. Figure 2 shows accuracy and RTs on the category-matching task performed during EEG recordings. Accuracy across conditions showed a U-shaped profile, as would be expected, given that M3w and M6 conditions each involved one difficult category decision at the boundary, whereas M3b involved two difficult category decisions at the boundary. Correspondingly, RTs showed an inverted-U profile.

Figure 2. 

Behavioral results during the EEG categorization task, n = 19 participants, expressed as group mean ± SEM. (A) Accuracy (percent correct, dotted line shows chance performance) and (B) median RT (in milliseconds).

Electrophysiological Results

ERPs elicited by both stimuli for conditions M0, M3w, M3b, and M6 are shown in Figure 3 for all 128 channels, segregated into two broad topographical sections, anterior and posterior. An average of 11.3 ± 2.8% (mean ± SEM) of the trials were rejected because of artifacts for each participant. Classical ERP responses including the P1 and N1 can be seen in response to the first and second stimuli. Adaptation effects are also visible in these mean ERPs. For instance, although the N1 over posterior channels in response to the first stimulus did not differ across conditions, the N1 in response to the second stimulus showed evidence of adaptation, with an attenuation of M0, M3w, and M3b conditions relative to M6.

Figure 3. 

Overview of ERP waveforms. (A) Mean ERP elicited by both stimuli averaged over all correct trials for all participants, separated by conditions M0, M3w, M3b, and M6 over bilateral anterior channels highlighted in white on right. Time zero is designated relative to the onset of the second stimulus. Stimulus images at the top of the figure show trial schematic: The first and second stimuli appeared at −400 and 0 msec, respectively, and the offsets of the first and second images were at −200 and 200 msec, respectively. (B) ERP elicited on bilateral posterior channels, highlighted in white on right.

To probe for shape- and conceptual category-selective adaptation effects, we focused our analyses on the condition-dependent amplitude modulations emerging in response to the second stimulus. One space–time cluster was identified by contrasting M0 versus M6 trials on all 128 channels over the interval [150 200] msec using αthresh = 0.05. This lenient threshold revealed a cluster broadly distributed across bilateral posterior electrodes with temporal extent [172 200] msec and p = .036, as shown in Figure 4A. Figure 4B shows the ERP for each condition on the union of all channels in this cluster, with the temporal extent of the cluster highlighted in yellow. The nature of the stimulus selectivity exhibited by this space–time cluster was evaluated by extracting the response amplitude in the independent conditions, M3w and M3b, within the clusters identified by contrasting M0 and M6. Responses in the M0 and M6 conditions differed significantly as expected because of the cluster definition, but the key contrast of interest is how M3w and M3b differed relative to each other and also to M0 and M6. The mean response amplitude on this space–time cluster is shown in Figure 4C, with pairwise comparisons between conditions as noted. The M0, M3w, and M3b conditions were all significantly attenuated relative to the M6 condition, and M3w and M3b were equivalent. This response profile indicates that this cluster was broadly shape selective, showing release from adaptation for large shape changes (M6) and no explicit conceptual category selectivity.
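The logic of such a cluster search can be illustrated with a minimal sketch. It is not the analysis code used in the study: it clusters over time on a single channel only (the study clusters over channels and time), it uses sign-flipping of paired differences as the permutation scheme, and all function names, the synthetic data, and the hard-coded critical value (2.101, the two-tailed t threshold for α = 0.05 with 18 degrees of freedom) are assumptions made for illustration.

```python
import numpy as np


def paired_t(diff):
    """t statistic per time sample for subject-wise differences (n_subjects, n_times)."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))


def find_clusters(t_vals, t_crit):
    """Return (start, stop, mass) for each run of same-sign samples with |t| > t_crit."""
    sign = np.where(t_vals > t_crit, 1, np.where(t_vals < -t_crit, -1, 0))
    clusters, start = [], None
    for i, s in enumerate(sign):
        # Close the open cluster when the sign changes or the threshold is lost.
        if start is not None and s != sign[start]:
            clusters.append((start, i, float(t_vals[start:i].sum())))
            start = None
        if start is None and s != 0:
            start = i
    if start is not None:
        clusters.append((start, len(t_vals), float(t_vals[start:].sum())))
    return clusters


def cluster_permutation_test(diff, t_crit, n_perm=1000, seed=0):
    """Cluster-level p values from the permutation distribution of the maximum cluster mass."""
    rng = np.random.default_rng(seed)
    observed = find_clusters(paired_t(diff), t_crit)
    null_max = np.empty(n_perm)
    for k in range(n_perm):
        # Under the null, the sign of each participant's difference is exchangeable.
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        masses = [abs(m) for _, _, m in find_clusters(paired_t(diff * flips), t_crit)]
        null_max[k] = max(masses, default=0.0)
    return [(a, b, m, float(np.mean(null_max >= abs(m)))) for a, b, m in observed]


# Synthetic example (hypothetical data): 19 participants, 100 time samples,
# with a condition difference injected at samples 40-59.
rng = np.random.default_rng(1)
demo_diff = rng.normal(0.0, 1.0, size=(19, 100))
demo_diff[:, 40:60] += 2.0
T_CRIT = 2.101  # assumed sample-level threshold (alpha = 0.05, two-tailed, df = 18)
results = cluster_permutation_test(demo_diff, T_CRIT, n_perm=500)
for start, stop, mass, p in results:
    print(f"samples [{start}, {stop}) mass={mass:.1f} cluster p={p:.3f}")
```

In this scheme the sample-level threshold only determines which samples can join a cluster; significance is assessed at the cluster level against the permutation distribution, which is why a lenient sample-level threshold does not by itself inflate the false-positive rate.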

Figure 4. 

(A) A posterior bilateral space–time cluster corresponding to the N1 is identified when searching all 128 channels over the interval [150 200] msec at αthresh = 0.05. This cluster has temporal extent [172 200] msec with p = .036, shown here superimposed on the M6–M0 scalp topography over the interval of the cluster. (B) ERP by condition on the union of all channels belonging to the cluster highlighted in A with temporal extent of the cluster highlighted. (C) Mean response amplitude by condition for this broad bilateral cluster. (D, G) Performing a more constrained cluster search limited to posterior channels (65/128 channels) over the interval [150 200] msec at αthresh = 0.01 identified two clusters: one right-lateralized with temporal extent [188 200] msec, p = .027, scalp topography (M6–M0) and channels shown in D; and one left-lateralized with temporal extent [184 194] msec, p = .030, scalp topography (M6–M0) and channels shown in G. (E) ERP on the union of all channels shown in D with the cluster's significant temporal extent highlighted. (F) Mean response amplitude by condition for the right-lateralized N1. (H) ERP on the union of all channels shown in G. (I) Mean response amplitude by condition for the left-lateralized N1. In all cases, the M0, M3w, and M3b conditions are significantly attenuated relative to M6 and the M3w and M3b condition responses are equivalent. ns, p > .05, *p < .05, +p < .01, **p < .005, ++p < .001, ***p < .0005.

To address the possibility that the stimulus selectivity profile observed in this time window depended on the broad spatial extent of the cluster, we re-ran the cluster search over the interval [150 200] msec using a more stringent αthresh = 0.01 and searching only the posterior 65 channels instead of all 128 channels. This cluster search revealed two bilateral space–time clusters with topography comparable to the N1 in previous studies (Bentin et al., 1996; Bötzel, Schulze, & Stodieck, 1995). One cluster corresponded to the right hemisphere N1 response (Figure 4D) with temporal extent [188 200] msec and cluster level significance p = .027, and a second cluster corresponded to the left hemisphere N1 (Figure 4G) with temporal extent [184 194] msec and cluster significance p = .030. The ERPs on the union of all channels within the clusters in each hemisphere are illustrated in Figure 4E and H. The mean response amplitudes within each cluster are plotted in Figure 4F and I: For both hemispheres, M0, M3w, and M3b were all significantly attenuated relative to the M6 condition, and M3w and M3b were equivalent. This response profile indicates that the N1 was broadly shape selective, showing release from adaptation for large shape changes, with no explicit conceptual category selectivity, and this selectivity profile was robust as the cluster was constrained in space and time by decreasing the αthresh level for cluster identification.

We next probed selectivity over the full [0 400] msec interval. Space–time cluster analysis was performed by contrasting M0 versus M6 over all 128 channels using αthresh = 0.01 and revealed one significant cluster (M6 > M0, p < .001), with temporal extent [210 400] msec. The topographical evolution of this cluster over time is illustrated in Figure 5A. A comparison of responses in the M3w and M3b conditions (which were independent of the conditions used to define the cluster) revealed a significant difference between these conditions, t(18) = −2.361, p = .030 (see Figure 5D), but not between M3b and M6, t(18) = 0.453, p = .656, indicating a conceptual category selectivity of the underlying neural source(s) generating this differential signal. Another contrast, comparing “same category” (M0 and M3w) versus “different category” (M3b and M6) conditions, identified a markedly similar cluster (Figure 5B) with temporal extent [216 400] msec and cluster significance p < .001. This cluster was also categorical because responses for M3w versus M3b differed significantly, t(18) = −3.098, p = .006 (Figure 5E). We finally tested selectivity using the M3w versus M3b contrast. No cluster was identified before 300 msec at αthresh = 0.01, but a search with the more lenient criterion of αthresh = 0.05 returned a cluster largely overlapping with the M0 versus M6 and same versus different search clusters (Figure 5C, F, p = .030), again with temporal extent [216 400] msec. Note that the N1 component did not elicit a significant space–time cluster when searching the entire [0 400] msec time range as its temporal extent was too limited to lead to significance when searching over the whole 400-msec interval. This is not surprising given that cluster-based nonparametric tests are optimized for the identification of widespread, sustained effects (Maris & Oostenveld, 2007).
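The follow-up comparisons reported above, such as t(18) = −2.361, are paired comparisons across participants on cluster-mean amplitudes. A minimal sketch of the statistic is given below; the function name is hypothetical, and the critical value 2.101 (two-tailed α = 0.05 with 18 degrees of freedom) is a standard table value included only to relate the reported t values to the .05 level.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(cond_a, cond_b):
    """Return (t, df) for a paired t test on per-participant cluster-mean amplitudes."""
    if len(cond_a) != len(cond_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


T_CRIT_DF18 = 2.101  # two-tailed critical value for alpha = 0.05 with 18 df

# Reported statistics from the text, checked against the critical value.
reported = {"M3w vs M3b": -2.361, "M3b vs M6": 0.453}
for contrast, t_value in reported.items():
    verdict = "exceeds" if abs(t_value) > T_CRIT_DF18 else "does not exceed"
    print(f"{contrast}: |t| = {abs(t_value):.3f} {verdict} {T_CRIT_DF18}")
```

Because M3w and M3b were not used to define the cluster, this comparison is not biased by the cluster selection step, unlike a comparison of M0 and M6 within the same cluster.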

Figure 5. 

(A) Cluster selective for the M0 versus M6 contrast, found by searching 128 channels, [0 400] msec, αthresh = 0.01. Topoplots show M6–M0 response differences. For illustration, space–time points not included in the cluster are set to zero and shown in white; color shows response differences for channels and time points included in the cluster (cf. color bar on right). The inclusion of a given channel at each time window indicates that the channel belongs to the space–time cluster at any time point within that window. The M6–M0 cluster is significant at p < .001 with temporal extent [210 400] msec. (B) Cluster selective for the “same category” versus “different category” contrast, found by searching 128 channels, [0 400] msec, αthresh = 0.01. Topoplots show “same category”–“different category” response differences. Cluster-level significance is p < .001 with temporal extent [216 400] msec. (C) Category-selective cluster identified via the contrast of M3b versus M3w found by searching 128 channels, [0 400] msec, αthresh = 0.05. Topoplots show M3b–M3w response differences, cluster level significance p = .030 with temporal extent also [216 400] msec. (D) Response amplitudes across conditions for entire space–time cluster shown in (A). (E) Response amplitudes across conditions for cluster shown in (B). (F) Response amplitudes across conditions for cluster shown in (C). ns, p > .05, *p < .05, +p < .01, **p < .005, ++p < .001, ***p < .0005, +++p < .0001, ****p < .00005.

The broad spatiotemporal extent of the categorical response cluster identified by searching the time window from 0 to 400 msec raises the question whether it might span more than one cognitive process. For instance, although monkey (Freedman et al., 2001, 2003) and human studies (Gotts et al., 2011; Jiang et al., 2007) have established the presence of perceptual categorization circuits in pFC, a number of studies have shown that other frontal areas, including OFC (Kepecs, Uchida, Zariwala, & Mainen, 2008; Hsu, Bhatt, Adolphs, Tranel, & Camerer, 2005; Critchley, Mathias, & Dolan, 2001) and insula (Sarinopoulos et al., 2010; Grinband, Hirsch, & Ferrera, 2006) as well as medial frontal gyrus and ACC (Stern, Gonzalez, Welsh, & Taylor, 2010), encode categorization uncertainty with enhanced responses to stimuli at the category boundary. To probe for a dissociation of perceptual category selectivity and signals related to categorization uncertainty, we analyzed the temporal evolution of signals in the anterior category-selective cluster. Specifically, we reasoned that signals reflecting adaptation of neurons in circuits performing the perceptual categorization task should show significant response differences between M3w and M3b as well as M3w and M6 responses (as both cases involve a “same category” vs. “different category” contrast). On the other hand, neural sources coding for the uncertainty of this categorization would be expected to show significant response differences between M3w and M3b conditions (as the former contains only one stimulus at the boundary, whereas the latter contains two stimuli, and is therefore associated with a higher level of uncertainty) but not between M3w and M6 (as both conditions include one stimulus at the boundary and one at the endpoint). 
We therefore conducted paired t tests comparing responses in these conditions at 2-msec increments over the course of the entire original cluster that was identified using M0 versus M6 with αthresh = 0.01 over [0 400] msec. This analysis revealed that early on ([232 256] msec; see Figure 6A, B, D), signals in the cluster showed a left-lateralized response profile compatible with perceptual categorization because M3w was significantly different from both M3b and M6. Later in the cluster ([328 384] msec; see Figure 6A, C, E), the response profile shifted to one compatible with reflecting a mixture of category selectivity and categorization uncertainty, with responses in the M3w and M3b conditions still significantly different, but responses for M3w and M6 not being significantly different anymore and a broader distribution of selective signals. This analysis suggests that perceptual categorization was accomplished first, within 250 msec, followed by a computation of perceptual uncertainty within the subsequent 100 msec.
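The reasoning behind this time-resolved analysis can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the data are synthetic, the function names and labels are hypothetical, the threshold is the uncorrected two-tailed α = 0.05 critical value for 18 degrees of freedom, and no correction for multiple comparisons across time samples is applied.

```python
import numpy as np

T_CRIT_DF18 = 2.101  # assumed threshold: two-tailed alpha = 0.05, df = 18


def paired_t_over_time(a, b):
    """Paired t statistic at every time sample; a and b are (n_subjects, n_times)."""
    d = a - b
    return d.mean(axis=0) / (d.std(axis=0, ddof=1) / np.sqrt(d.shape[0]))


def label_profiles(m3w, m3b, m6, t_crit=T_CRIT_DF18):
    """Label each sample by which of the two contrasts exceeds the threshold."""
    sig_w_vs_b = np.abs(paired_t_over_time(m3w, m3b)) > t_crit
    sig_w_vs_6 = np.abs(paired_t_over_time(m3w, m6)) > t_crit
    labels = np.full(m3w.shape[1], "neither", dtype="U20")
    # Both contrasts differ: profile expected for category-selective circuits.
    labels[sig_w_vs_b & sig_w_vs_6] = "category"
    # Only M3w vs M3b differs: profile expected when uncertainty contributes.
    labels[sig_w_vs_b & ~sig_w_vs_6] = "uncertainty-like"
    return labels


# Synthetic cluster-mean amplitudes sampled every 2 msec from 210 to 400 msec.
rng = np.random.default_rng(0)
times = np.arange(210, 401, 2)
n_sub = 19
m3w = rng.normal(0.0, 1.0, (n_sub, times.size))
m3b = rng.normal(3.0, 1.0, (n_sub, times.size))
m6 = rng.normal(0.0, 1.0, (n_sub, times.size))
m6[:, times < 300] += 3.0  # M6 differs from M3w only in the early part

labels = label_profiles(m3w, m3b, m6)
for name in ("category", "uncertainty-like", "neither"):
    print(name, int((labels == name).sum()), "samples")
```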

Figure 6. 

Time course of categorical responses within the space–time cluster identified by the M6–M0 contrast with temporal extent [210 400] msec (see Figure 5A). (A) Time course of signal differences between M3w and M3b (blue) and M3w and M6 (green), revealing that, initially, signals indicate category selectivity, whereas later signals show sensitivity to perceptual uncertainty. (B) M3b–M3w scalp topography for the interval [232 256] msec masked by the M6–M0 space–time cluster at this interval. (C) M3b–M3w scalp topography for the interval [328 384] msec masked by the M6–M0 space–time cluster at this interval. (D) Mean response profiles across the four conditions for the interval [232 256] msec, consistent with category selectivity. (E) Mean response profiles across the four conditions for the interval [328 384] msec. Significance level notation as in Figure 5.

To further probe the selectivity of the spatially and temporally extensive categorical cluster illustrated in Figure 5A during its initial lateralized stage, subsequent bilateral stage, as well as the later posterior component separately, we performed two additional cluster analyses over all 128 channels constrained to the time windows [200 300] msec and [300 400] msec using the contrast M0 versus M6. We reduced αthresh to 0.005 to separate the spatially expansive clusters into subclusters. The [200 300] msec interval search still revealed only one anterior cluster, but the [300 400] msec search revealed two distinct clusters: an anterior cluster extending over the entire interval and a second cluster exhibiting a strongly categorical response on channels situated directly over parietal cortex, extending from 330 to 360 msec. Figure 7 shows the M6–M0 (part A) and M3b–M3w (part B) topographies within both clusters with the posterior cluster circled in blue. Figure 7C shows the ERP on the union of all channels in this posterior cluster with the temporal extent of the cluster highlighted, and Figure 7D shows the mean signal by condition for this cluster. Figure 7E shows the topography of functional connectivity between this posterior cluster and all other channels.

Figure 7. 

Category selectivity on posterior channels. (A) Two separate clusters appear when contrasting M0 versus M6 at αthresh = 0.005 over the time interval [300 400] msec. The posterior cluster is circled. Scalp topography shows M6–M0 conditions over the temporal extent of the posterior cluster (p = .036, [330 360] msec). (B) The M3b–M3w topography for the same interval, masked by the cluster found by contrasting M0 versus M6. (C) The ERP over the union of all channels in the posterior cluster (circled on A and B). (D) The mean response amplitude over each condition in this space–time cluster. Significance level notation as in Figure 5.

Functional connectivity was assessed by calculating the coherence between the posterior seed cluster (circled in blue in Figure 7A and B, with its temporal extent highlighted in Figure 7C) and the N1 and P2 cluster channels (Figure 8A, see Methods). Following the onset of the first stimulus, functional connectivity was generally higher between the posterior category cluster channels and the P2 cluster channels than between the posterior category cluster channels and the N1 cluster channels. Significant increases in coherence between the N1 and posterior category cluster channels began earlier than those between the P2 and posterior category cluster channels, and coherence decreased for both regions following Stimulus 2 onset. However, coherence between the posterior category cluster and the P2 cluster subsequently rebounded at the latency at which the posterior categorical signal was observed in the amplitude-domain ERPs, supporting the idea that input from anterior category-selective circuits might underlie the subsequent category-selective response modulations of posterior signals. We additionally analyzed coherence between the left N1 and P2 clusters as well as between the right N1 and P2 clusters to assess the lateralization of stimulus processing (see Figure 8B). Both N1 clusters exhibited significant connectivity with the left-lateralized P2 cluster beginning around 140 msec after Stimulus 1 onset, gradually decreasing, and then starting to rebound around the latency of the N1 associated with the second stimulus, supporting functional connectivity between the N1 and P2 clusters as postulated by the feedforward model.
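The exact coherence estimator is specified in the Methods. As a rough illustration only, the sketch below computes magnitude-squared coherence between two signals by averaging Fourier estimates across trials. The function name, the Hann taper, the 500-Hz sampling rate, and the synthetic signals are all assumptions, and the time-resolved, baseline-normalized measure plotted in Figure 8 is not reproduced here.

```python
import numpy as np


def trial_coherence(x, y, sfreq):
    """Magnitude-squared coherence across trials.

    x, y: arrays of shape (n_trials, n_samples), e.g. the mean signal of two
    channel clusters. Returns (freqs, coh) with coh between 0 and 1.
    """
    window = np.hanning(x.shape[1])
    fx = np.fft.rfft(x * window, axis=1)
    fy = np.fft.rfft(y * window, axis=1)
    cross = (fx * np.conj(fy)).mean(axis=0)  # trial-averaged cross-spectrum
    pxx = (np.abs(fx) ** 2).mean(axis=0)     # trial-averaged power of x
    pyy = (np.abs(fy) ** 2).mean(axis=0)     # trial-averaged power of y
    coh = np.abs(cross) ** 2 / (pxx * pyy)
    freqs = np.fft.rfftfreq(x.shape[1], d=1.0 / sfreq)
    return freqs, coh


# Synthetic example: two noisy signals sharing a 10-Hz component whose phase
# varies from trial to trial but is identical in both signals.
rng = np.random.default_rng(0)
sfreq, n_trials, n_samples = 500.0, 200, 250
t = np.arange(n_samples) / sfreq
phase = rng.uniform(0, 2 * np.pi, size=(n_trials, 1))
shared = 2.0 * np.sin(2 * np.pi * 10.0 * t + phase)
x = shared + rng.normal(size=(n_trials, n_samples))
y = shared + rng.normal(size=(n_trials, n_samples))

freqs, coh = trial_coherence(x, y, sfreq)
i10 = int(np.argmin(np.abs(freqs - 10.0)))
i40 = int(np.argmin(np.abs(freqs - 40.0)))
print(f"coherence at {freqs[i10]:.0f} Hz: {coh[i10]:.2f}")
print(f"coherence at {freqs[i40]:.0f} Hz: {coh[i40]:.2f}")
```

Coherence defined this way is high when the two signals keep a consistent phase relationship and covarying power across trials, and it falls toward zero for independent signals as the number of trials grows.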

Figure 8. 

Functional connectivity assessed by coherence between functionally defined channel clusters. (A) Coherence between EEG response on posterior channels exhibiting categorical responses (circled in blue on Figure 7A and B) and left and right shape-selective N1 cluster channels (see Figure 4D and G, magenta and yellow time courses on this figure) and category-selective P2 onset cluster channels (Figure 5A, second panel, green time course on this figure) over the time interval of both stimulus presentations. Vertical dashed lines show Stimulus 1 and Stimulus 2 onset, respectively. Whereas coherence between the shape-selective clusters and the posterior category-selective cluster returns to baseline around 200 msec, coherence between the category-selective anterior channels and the posterior cluster is sustained and shows a sharp increase starting around 310 msec, peaking around the time of maximal category selectivity. Shaded area shows SEM. Significance of each time course assessed by contrast with 200 msec baseline preceding stimulus onset (0% coherence change), p < .01 as indicated.

Finally, we tested the predicted mechanistic relationship between the neural signals indicating category selectivity and participants' ability to perform the perceptual categorization task. We hypothesized that stronger response differences between the M3b and M3w conditions reflect more sharply tuned category circuits, which should in turn predict higher categorization performance. We therefore correlated the M3b–M3w response difference in these clusters with the average behavioral categorization accuracy in those two conditions, predicting a positive correlation. Indeed, the signal difference between the M3b and M3w conditions within the cluster found by contrasting M0 versus M6 over the interval [200 300] msec, with an extent of [222 300] msec (Figure 9A), correlated strongly with behavioral performance (r = 0.627, p = .004; Figure 9B). Likewise, the M3b–M3w signal difference on the anterior cluster for the subsequent 100 msec (temporal extent [300 400] msec, cluster significance p < .001; Figure 9C) was strongly correlated with behavior (r = 0.702, p < .001; Figure 9D). On the other hand, the M3b–M3w signal difference in the posterior cluster identified during the [300 400] msec interval was not significantly correlated with behavior (r = 0.317, p = .186).
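The significance of a correlation coefficient can be related to a t statistic with n − 2 degrees of freedom. The sketch below applies this conversion to the three reported correlations, assuming that all 19 participants (the n given in the Figure 2 caption) entered each correlation; the function names are hypothetical, and the critical values are standard two-tailed table values for 17 degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic with n - 2 degrees of freedom corresponding to a correlation r."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


N_PARTICIPANTS = 19  # assumption: all participants entered each correlation
# Two-tailed critical t values for df = 17 at alpha = .05, .01, and .001.
T_CRIT_DF17 = {0.05: 2.110, 0.01: 2.898, 0.001: 3.965}

reported_r = {
    "anterior [222 300] msec": 0.627,
    "anterior [300 400] msec": 0.702,
    "posterior [330 360] msec": 0.317,
}
for cluster, r in reported_r.items():
    t = r_to_t(r, N_PARTICIPANTS)
    passed = [alpha for alpha, crit in T_CRIT_DF17.items() if abs(t) > crit]
    level = f"p < {min(passed)}" if passed else "n.s. at .05"
    print(f"{cluster}: r = {r:.3f}, t(17) = {t:.2f}, {level}")
```

Under this assumption, the converted t values fall on the same side of each critical value as the reported p values: the first anterior correlation passes the .01 level, the second passes the .001 level, and the posterior correlation does not reach the .05 level.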

Figure 9. 

Correlations with behavior. (A) The M3b–M3w signal difference topography masked by the cluster identified when contrasting M0 versus M6 over the interval [200 300] msec (128 channels, αthresh = 0.005), cluster significance p < .001 with temporal extent [222 300] msec. (B) This M3b–M3w signal difference is correlated with mean accuracy on the M3w and M3b conditions, r = 0.627, p < .005. (C) The M3b–M3w signal difference topography masked by the anterior cluster identified when contrasting M0 versus M6 over the interval [300 400] msec (128 channels, αthresh = 0.005), cluster significance p < .001 with temporal extent [300 400] msec. (D) This M3b–M3w signal difference again shows high correlation with mean accuracy on the M3w and M3b conditions, r = 0.702, p < .001. (E) The M3b–M3w signal difference topography masked by the posterior cluster identified when contrasting M0 versus M6 over the interval [300 400] msec, p = .036, with temporal extent [330 360] msec. (F) This M3b–M3w signal difference is not significantly correlated with mean accuracy on the M3w and M3b conditions, r = 0.317, p = .186.

DISCUSSION

Taken together, our results, identifying a shape-selective N1 over posterior channels and a subsequent, anterior category-selective cluster of channels, support the two-stage feedforward accounts of perceptual categorization (Panis et al., 2008; Thomas et al., 2001; Riesenhuber & Poggio, 2000; Ashby & Lee, 1991; Nosofsky, 1986) and are in agreement with the model positing that primate object categorization is mediated by frontal category-selective circuits that receive input from object shape-specific representations in occipitotemporal cortex (Jiang et al., 2007; Freedman et al., 2003; Riesenhuber & Poggio, 2000).

The N1 response profile observed in our RA experiment is consistent with a neurophysiological signal elicited by populations of neurons broadly encoding stimulus shape, independent of semantic attributes or category labels, as the M0 and M3 conditions are attenuated while the M6 condition exhibits a release from adaptation. Whereas previous fMRI experiments using the same stimuli demonstrated sharper tuning for shape after comparable amounts of training (Jiang et al., 2007), the broader response profile suggested by the results in our EEG-RA study is not surprising because EEG records the neural responses from multiple neural sources simultaneously and, in our case, likely included populations of neurons that specifically encode the shape of stimuli used in this experiment (such as the car shape-selective neurons in LOC identified in Jiang et al., 2007) as well as neurons less selective for (but still responsive to) the stimuli of interest. This shape selectivity of the N1 was found bilaterally. Category effects were more lateralized, with a dominance of the left hemisphere. This left hemisphere lateralization of the initial conceptual category effect contrasts with the predominantly right hemispheric category selectivity found in the fMRI experiment (Jiang et al., 2007). This difference in lateralization could arise because this study, in contrast to Jiang et al. (2007), used verbal labels for the object categories, which might have led to a lateralization of category circuits in the language-dominant hemisphere. Interestingly, left lateralization of category circuits was also found in another study in which participants were trained on categories with verbal labels defined in a morphed stimulus space (Van der Linden et al., 2010). 
Left lateralization of category circuits is also consistent with evidence that left posterior dorsolateral pFC plays a key role in perceptual decision-making (Heekeren, Marrett, Ruff, Bandettini, & Ungerleider, 2006). This anterior category selectivity showed significant functional connectivity with shape-selective responses in both hemispheres, consistent with previous reports showing cross-hemispheric connectivity of right high level visual cortex and left pFC by way of the anterior corpus callosum (Tomita, Ohbayashi, & Nakahara, 1999; Gazzaniga, 1995), in addition to ipsilateral connections.

Our study reveals that information flow in perceptual categorization follows three distinct phases: posterior shape selectivity up to 200 msec following stimulus onset, anterior category selectivity after 200 msec following stimulus onset, and posterior category selectivity after 300 msec following stimulus onset. The spatially and temporally expansive conceptual category-selective cluster observed in our data was initially categorical on anterior channels and subsequently, around 330 msec poststimulus onset, exhibited response modulations consistent with perceptual uncertainty. The complex scalp topographies observed after 300 msec almost certainly reflect multiple neural processes taking place in parallel. An anterior signal is spatially dissociable from a posterior signal, but the anterior signal itself likely reflects the convolution of multiple processes. The equivalent response levels for the M3w and M6 conditions, both of which reflect the categorization of one stimulus near the category boundary and one prototype stimulus, may reflect general processes related to uncertainty. Indeed, a number of studies have provided evidence that activity in particular frontal areas (Kepecs et al., 2008; Grinband et al., 2006; Hsu et al., 2005; Critchley et al., 2001) reflects uncertainty, but it is less clear how this uncertainty is computed. Our data, with their higher temporal resolution compared with fMRI, are compatible with the simple model that categorization is performed first, and uncertainty estimates are then derived from the level of activation of category units. That is, strong activation of units coding for just one category would be associated with low uncertainty, whereas activation of units for both categories (e.g., for more ambiguous stimuli close to the boundary) would be associated with higher degrees of uncertainty.

Around the same time that the signal consistent with decision certainty was observed on frontal channels, a strongly categorical response was observed on posterior channels. We can rule out the possibility that this posterior categorical signal arises from the opposite side of an anterior dipole because the direction of attenuation (“same category” < “different category”) was identical across the entire scalp topography. These posterior category-selective responses demonstrated strong functional connectivity with the preceding anterior signals (see Figure 7E). The selective coupling observed between anterior channels at the initial onset of category selectivity and the later posterior category signal reflects the synchronization of phases as well as the covariation of power of the neural signals during these intervals (Fell & Axmacher, 2011). Such a “feedback” category signal may serve purposes proposed by other groups, including conscious awareness of the stimulus (Fahrenfort, Scholte, & Lamme, 2007), stimulus learning (Seitz & Dinse, 2007), and the directing of task-driven attention (Buschman & Miller, 2007). Interestingly, although the posterior signal showed strongly categorical responses (Figure 7C, D), its category selectivity did not correlate significantly with individual behavioral categorization ability (Figure 9F). Although this might be due to the smaller spatiotemporal extent of the posterior cluster compared with the anterior clusters (Figure 9A–D), the observed lack of a correlation is in agreement with the postulated postdecisional status of the posterior signal (which would be expected to produce strongly categorical signals on correct trials, as in our analysis).

The anterior-to-posterior sequence of conceptual category effects observed in this study is consistent with similar task effects reported elsewhere. For example, a recent monkey electrophysiology study (Goodwin, Blackman, Sakellaridi, & Chafee, 2012) found rule-dependent category signals in pFC with earlier latencies than similar signals in parietal cortex. In contrast, another recent monkey electrophysiology study (Swaminathan & Freedman, 2012), using a motion categorization task, found that parietal concept-selective signals preceded the onset of such signals in pFC. Although this might reflect a difference between shape and motion categorization (with motion categorization preferentially engaging the dorsal pathway, leading to earlier activation of parietal areas), another intriguing possibility is that categorization circuits in parietal cortex develop as a result of extensive task experience, leading to an anterior-to-posterior shift of task-selective activation, as observed in fMRI and EEG studies (Rivera, Reiss, Eckert, & Menon, 2005; Staines, Padilla, & Knight, 2002; Sakai et al., 1998). It will be interesting to probe in future studies which task elements influence the order of information flow between anterior and posterior brain regions and the engagement of these areas in categorization.

Acknowledgments

This work was supported by the National Science Foundation (0449743 and 1232530 to M. R. and a Graduate Research Fellowship to C. A. S.) and the Research Program in Applied Neuroscience. We would like to thank Brian Jucha for implementing the Web-based training and Tim Curran for helpful advice.

Reprint requests should be sent to Clara A. Scholl, Georgetown University, 3970 Reservoir Rd, Washington, DC 20007, or via e-mail: cas243@georgetown.edu, clara.scholl@gmail.com.

REFERENCES

Ashby, F. G., & Lee, W. W. (1991). Predicting similarity and categorization from identification. Journal of Experimental Psychology, 120, 150–172.

Bentin, S., Allison, T., Puce, A., Perez, E., & McCarthy, G. (1996). Electrophysiological studies of face perception in humans. Journal of Cognitive Neuroscience, 8, 551–565.

Bötzel, K., Schulze, S., & Stodieck, S. R. (1995). Scalp topography and analysis of intracranial sources of face-evoked potentials. Experimental Brain Research. Experimentelle Hirnforschung. Expérimentation Cérébrale, 104, 135–143.

Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436.

Buschman, T. J., & Miller, E. K. (2007). Top–down versus bottom–up control of attention in the prefrontal and posterior parietal cortices. Science, 315, 1860–1862.

Caharel, S., d'Arripe, O., Ramon, M., Jacques, C., & Rossion, B. (2009). Early adaptation to repeated unfamiliar faces across viewpoint changes in the right hemisphere: Evidence from the N170 ERP component. Neuropsychologia, 47, 639–643.

Critchley, H. D., Mathias, C. J., & Dolan, R. J. (2001). Neural activity in the human brain relating to uncertainty and arousal during anticipation. Neuron, 29, 537–545.

De Baene, W., & Vogels, R. (2010). Effects of adaptation on the stimulus selectivity of macaque inferior temporal spiking activity and local field potentials. Cerebral Cortex, 20, 2145–2165.

Delorme, A., & Makeig, S. (2004). EEGLAB: An open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of Neuroscience Methods, 134, 9–21.

Eimer, M., Gosling, A., Nicholas, S., & Kiss, M. (2011). The N170 component and its links to configural face processing: A rapid neural adaptation study. Brain Research, 1376, 76–87.

Fabre-Thorpe, M. (2011). The characteristics and limits of rapid visual categorization. Frontiers in Psychology, 2, 243.

Fahrenfort, J. J., Scholte, H. S., & Lamme, V. A. F. (2007). Masking disrupts reentrant processing in human visual cortex. Journal of Cognitive Neuroscience, 19, 1488–1497.

Fang, F., Murray, S. O., & He, S. (2007). Duration-dependent fMRI adaptation and distributed viewer-centered face representation in human visual cortex. Cerebral Cortex, 17, 1402–1411.

Fell, J., & Axmacher, N. (2011). The role of phase synchronization in memory processes. Nature Reviews Neuroscience, 12, 105–118.

Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2001). Categorical representation of visual stimuli in the primate prefrontal cortex. Science, 291, 312–316.

Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2003). A comparison of primate prefrontal and inferior temporal cortices during visual categorization. The Journal of Neuroscience, 23, 5235–5246.

Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2006). Experience-dependent sharpening of visual shape selectivity in inferior temporal cortex. Cerebral Cortex, 16, 1631–1644.

Gazzaniga, M. S. (1995). Principles of human brain organization derived from split-brain studies. Neuron, 14, 217–228.

Gilaie-Dotan, S., & Malach, R. (2007). Sub-exemplar shape tuning in human face-related areas. Cerebral Cortex, 17, 325–338.

Gillebert, C. R., Op de Beeck, H., Panis, S., & Wagemans, J. (2009). Subordinate categorization enhances the neural selectivity in human object-selective cortex for fine shape differences. Journal of Cognitive Neuroscience, 21, 1054–1064.

Goodwin, S. J., Blackman, R. K., Sakellaridi, S., & Chafee, M. V. (2012). Executive control over cognition: Stronger and earlier rule-based modulation of spatial category signals in prefrontal cortex relative to parietal cortex. The Journal of Neuroscience, 32, 3499–3515.

Gotts, S. J., Milleville, S. C., Bellgowan, P. S. F., & Martin, A. (2011). Broad and narrow conceptual tuning in the human frontal lobes. Cerebral Cortex, 21, 477–491.

Grill-Spector, K., Henson, R., & Martin, A. (2006). Repetition and the brain: Neural models of stimulus-specific effects. Trends in Cognitive Sciences, 10, 14–23.

Grinband, J., Hirsch, J., & Ferrera, V. P. (2006). A neural representation of categorization uncertainty in the human brain. Neuron, 49, 757–763.

Heekeren, H. R., Marrett, S., Ruff, D., Bandettini, P., & Ungerleider, L. G. (2006). Involvement of human left dorsolateral prefrontal cortex in perceptual decision making is independent of response modality. Proceedings of the National Academy of Sciences, 103, 10023–10028.

Heisz, J. J., Watter, S., & Shedden, J. M. (2006). Progressive N170 habituation to unattended repeated faces. Vision Research, 46, 47–56.

Hsu, M., Bhatt, M., Adolphs, R., Tranel, D., & Camerer, C. F. (2005). Neural systems responding to degrees of uncertainty in human decision-making. Science, 310, 1680–1683.

Jacques, C., & Rossion, B. (2006). The speed of individual face categorization. Psychological Science, 17, 485–492.

Jiang, X., Bradley, E., Rini, R. A., Zeffiro, T., Vanmeter, J., & Riesenhuber, M. (2007). Categorization training results in shape- and category-selective human neural plasticity. Neuron, 53, 891–903.

Jiang, X., Rosen, E., Zeffiro, T., Vanmeter, J., Blanz, V., & Riesenhuber, M. (2006). Evaluation of a shape-based model of human face discrimination using fMRI and behavioral techniques. Neuron, 50, 159–172.

Kepecs, A., Uchida, N., Zariwala, H. A., & Mainen, Z. F. (2008). Neural correlates, computation and behavioural impact of decision confidence. Nature, 455, 227–231.

Kravitz, D. J., Saleem, K. S., Baker, C. I., Ungerleider, L. G., & Mishkin, M. (2012). The ventral visual pathway: An expanded neural framework for the processing of object quality. Trends in Cognitive Sciences, 17, 26–49.

Li, S., Ostwald, D., Giese, M., & Kourtzi, Z. (2007). Flexible coding for categorical decisions in the human brain. The Journal of Neuroscience, 27, 12321–12330.

Liu, H., Agam, Y., Madsen, J. R., & Kreiman, G. (2009). Timing, timing, timing: Fast decoding of object information from intracranial field potentials in human visual cortex. Neuron, 62, 281–290.

Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods, 164, 177–190.

Miller, E. K., Gochin, P. M., & Gross, C. G. (1993). Suppression of visual responses of neurons in inferior temporal cortex of the awake macaque by addition of a second stimulus. Brain Research, 616, 25–29.

Murray, S. O., & Wojciulik, E. (2004). Attention increases neural selectivity in the human lateral occipital complex. Nature Neuroscience, 7, 70–74.

Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39–61.

Oostenveld, R., Fries, P., Maris, E., & Schoffelen, J.-M. (2011). FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Computational Intelligence and Neuroscience, 2011, 156869.

Op de Beeck, H., Wagemans, J., & Vogels, R. (2001). Inferotemporal neurons represent low-dimensional configurations of parameterized shapes. Nature Neuroscience, 4, 1244–1252.

Panis, S., Vangeneugden, J., Op de Beeck, H. P., & Wagemans, J. (2008). The representation of subordinate shape similarity in human occipitotemporal cortex. Journal of Vision, 8, 9.1–9.15.

Polich, J., & Margala, C. (1997). P300 and probability: Comparison of oddball and single-stimulus paradigms. International Journal of Psychophysiology, 25, 169–176.

Riesenhuber, M., & Poggio, T. (2000). Models of object recognition. Nature Neuroscience, 3(Suppl.), 1199–1204.

Riesenhuber, M., & Poggio, T. (2002). Neural mechanisms of object recognition. Current Opinion in Neurobiology, 12, 162–168.

Rivera, S. M., Reiss, A. L., Eckert, M. A., & Menon, V. (2005). Developmental changes in mental arithmetic: Evidence for increased functional specialization in the left inferior parietal cortex. Cerebral Cortex, 15, 1779–1790.

Rossion, B., Gauthier, I., Goffaux, V., Tarr, M. J., & Crommelinck, M. (2002). Expertise training with novel objects leads to left-lateralized facelike electrophysiological responses. Psychological Science, 13, 250–257.

Sakai, K., Hikosaka, O., Miyauchi, S., Takino, R., Sasaki, Y., & Pütz, B. (1998). Transition of brain activation from frontal to parietal areas in visuomotor sequence learning. The Journal of Neuroscience, 18, 1827–1840.

Sarinopoulos, I., Grupe, D. W., Mackiewicz, K. L., Herrington, J. D., Lor, M., Steege, E. E., et al. (2010). Uncertainty during anticipation modulates neural responses to aversion in human insula and amygdala. Cerebral Cortex, 20, 929–940.

Schweinberger, S. R., Pickering, E. C., Jentzsch, I., Burton, A. M., & Kaufmann, J. M. (2002). Event-related brain potential evidence for a response of inferior temporal cortex to familiar face repetitions. Brain Research. Cognitive Brain Research, 14, 398–409.

Scott, L. S., Tanaka, J. W., Sheinberg, D. L., & Curran, T. (2006). A reevaluation of the electrophysiological correlates of expert object processing. Journal of Cognitive Neuroscience, 18, 1453–1465.

Scott, L. S., Tanaka, J. W., Sheinberg, D. L., & Curran, T. (2008). The role of category learning in the acquisition and retention of perceptual expertise: A behavioral and neurophysiological study. Brain Research, 1210, 204–215.

Seitz, A. R., & Dinse, H. R. (2007). A common framework for perceptual learning. Current Opinion in Neurobiology, 17, 148–153.

Shelton, C. (2000). Morphable surface models. International Journal of Computer Vision, 38, 75–91.

Shibata, T., Nishijo, H., Tamura, R., Miyamoto, K., Eifuku, S., Endo, S., et al. (2002). Generators of visual evoked potentials for faces and eyes in the human brain as determined by dipole localization. Brain Topography, 15, 51–63.

Staines, W. R., Padilla, M., & Knight, R. T. (2002). Frontal-parietal event-related potential changes associated with practising a novel visuomotor task. Brain Research. Cognitive Brain Research, 13, 195–202.

Stern, E. R., Gonzalez, R., Welsh, R. C., & Taylor, S. F. (2010). Updating beliefs for a decision: Neural correlates of uncertainty and underconfidence. The Journal of Neuroscience, 30, 8032–8041.

Swaminathan, S. K., & Freedman, D. J. (2012). Preferential encoding of visual categories in parietal cortex compared with prefrontal cortex. Nature Neuroscience, 15, 315–320.

Tanaka, J. W., & Curran, T. (2001). A neural basis for expert object recognition. Psychological Science, 12, 43–47.

Thomas, E., Van Hulle, M. M., & Vogels, R. (2001). Encoding of categories by noncategory-specific neurons in the inferior temporal cortex. Journal of Cognitive Neuroscience, 13, 190–200.

Tomita, H., Ohbayashi, M., & Nakahara, K. (1999). Top–down signal from prefrontal cortex in executive control. Nature, 401, 699–703.

Van der Linden, M., Van Turennout, M., & Indefrey, P. (2010). Formation of category representations in superior temporal sulcus. Journal of Cognitive Neuroscience, 22, 1270–1282.

Vizioli, L., Rousselet, G. A., & Caldara, R. (2010). Neural repetition suppression to identity is abolished by other-race faces. Proceedings of the National Academy of Sciences, 107, 20081–20086.

Zimmer, M., & Kovács, G. (2011). Position specificity of adaptation-related face aftereffects. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 366, 586–595.