Objective: Visual expertise for particular categories of objects (e.g., mushrooms, birds, flowers, minerals, and so on) is known to enhance cortical responses in parts of the ventral occipitotemporal cortex. How is such additional expertise integrated into the prior cortical representation of life-long visual experience? To address this question, we presented synthetic visual objects rotating in three dimensions and recorded multivariate BOLD responses as initially unfamiliar objects gradually became familiar.

Main results: An analysis of pairwise distances between multivariate BOLD responses (“representational similarity analysis,” RSA) revealed that visual objects were linearly discriminable in large parts of the ventral occipital cortex, including the primary visual cortex, as well as in certain parts of the parietal and frontal cortex. These cortical representations were present from the start, when objects were still unfamiliar, and even though objects were shown from different sides. As shapes became familiar with repeated viewing, the distribution of responses expanded to fill more of the available space. In contrast, the distribution of responses to novel shapes (which appeared only once) contracted and shifted to the margins of the available space.

Conclusion: Our results revealed cortical representations of object shape and gradual changes in these representations with learning and consolidation. The cortical representations of once-viewed shapes that remained novel diverged dramatically from repeatedly viewed shapes that became familiar. This disparity was evident in both the similarity and the diversity of multivariate BOLD responses.

An essential aspect of visual object recognition is the processing of visual shapes. The neural substrate of shape processing includes the ventral visual pathway, which in humans extends over the ventral occipitotemporal cortex from the occipital pole to the lateral occipital cortex, fusiform gyrus, and beyond (reviewed by Bi et al., 2016; Grill-Spector & Weiner, 2014; Kravitz et al., 2013; Weiner & Zilles, 2016). Functional imaging studies of ventral occipitotemporal cortex reveal intriguing functional anatomy, with responsiveness to specific object categories (e.g., faces, scenes, body parts) changing systematically over the cortical surface along several large-scale anatomical gradients (e.g., animate-inanimate, large-small, feature-whole, or perception-action; Freud et al., 2017; Grill-Spector & Weiner, 2014; Grill-Spector et al., 2004; Konkle & Oliva, 2012; Wurm & Caramazza, 2022; Yildirim et al., 2019).

Experience and learning improve object recognition performance, and also modify shape processing in the ventral occipitotemporal cortex. Indeed, functional imaging evidence shows that particular visual expertise—being able to identify and categorize visually similar objects of a particular kind—often entails moderate but anatomically distributed changes in the pre-existing responsiveness to shape (reviewed by Bukach et al., 2006; de Beeck & Baker, 2010; Gauthier & Tarr, 2016; Harel et al., 2013). This has been established by comparing novices and experts for identifying particular categories of natural objects (e.g., birds, mushrooms, minerals, degraded images; Cetron et al., 2019; Connolly et al., 2012; Duyck et al., 2021; Freud et al., 2017; Martens et al., 2018; McGugin et al., 2012; Roth & Zohary, 2015), as well as by comparing observers before and after they have learned to categorize initially unfamiliar synthetic shapes (e.g., computer-generated “greebles,” “spikies,” or “ziggerins”; Brants et al., 2011; de Beeck et al., 2006; Gauthier et al., 1999; A. C.-N. Wong et al., 2009; Y. K. Wong et al., 2012; Yue et al., 2006).

Here, we map the cortical representation of synthetic visual objects and track gradual changes as initially unfamiliar objects become progressively familiar with learning. We wondered how pre-existing shape representations would accommodate and integrate novel synthetic objects. We further wondered whether representational changes would be specific to learned objects or extend also to other objects of the same kind. To explore these questions, we analyzed “representational similarity” of spatiotemporal BOLD patterns (Haxby, 2012; Kriegeskorte, Mur, Ruff, et al., 2008), which offers a potentially sensitive measure for the information encoded in neural activity and may also be related to similarity as perceived by human observers (Charest & Kriegeskorte, 2015; Collins & Behrmann, 2020; Nestor et al., 2016).

Most previous studies of visual expertise identified cortical sites associated with a particular object category by comparing BOLD activity either between novices and experts or before and after learning. We extend this work in three ways: firstly, by establishing representational distance at the level of object exemplars rather than object categories; secondly, by monitoring gradual changes as observers gain familiarity with object exemplars; and thirdly, by analyzing changes in the diversity of multivariate BOLD activity. Few previous studies have attempted to resolve shape representations in such detail (Brants et al., 2016; Duyck et al., 2021; Eger et al., 2008; Visconti di Oleggio Castello et al., 2021). To enable fine-grained analysis of representational geometry, we developed synthetic shapes for which visual expertise is acquired comparatively slowly (Kakaei et al., 2021) and took advantage of a numerically tractable method for linear discriminant analysis of $O(10^3)$-dimensional multivariate activity (DLDA; Yu & Yang, 2001).

Our results showed view-invariant representations of shape over surprisingly extensive regions of the ventral occipitotemporal cortex, including the fusiform gyrus, lateral occipital areas, and primary visual cortex. Representational distances were high from the start, even before learning, suggesting that new visual expertise was accommodated and encoded within pre-existing representations. However, shapes that appeared repeatedly (and were memorized by observers) and shapes that appeared just once (and were ignored) diverged dramatically, in terms of their cortical representations, while visual expertise was being acquired and consolidated.

2.1 Observers and behavior

Eight healthy observers (4 female and 4 male; aged 25 to 32 years) took part in behavioral training (“sham experiment,” one session per observer), the functional imaging experiment (“main experiment,” six scanning sessions per observer), and a final behavioral assessment (two sessions). All observers were paid and gave informed consent. Ethical approval was granted under Chiffre 30/21 by the ethics committee of the Faculty of Medicine of the Otto-von-Guericke University, Magdeburg.

In both sham and main experiments, observers viewed sequences of 200 recurring and non-recurring objects (see below and Fig. 1A) and attempted to classify each object as “familiar” or “novel” (by pressing the appropriate button). Over the course of multiple sessions, observers gradually became familiar with recurring objects and thus became able to distinguish them from non-recurring objects. Objects of the sham experiment were two-dimensional shapes, whereas objects of the main experiment were rotating, three-dimensional shapes (see below and Fig. 1A).

Fig. 1.

Experimental paradigm. (A) Complex objects were shown for 2.5 s each, separated by 0.5 s transition, in sequences of 200 presentations, with a total duration of 600 s. Over 1 week, observers participated in 3 sessions, viewing 6 sequences during each session (18 sequences in total). Fifteen objects appeared many times each (“recurring objects”), while other objects appeared exactly once (“non-recurring objects”). Observers were required to categorize each object as either “familiar” or “unfamiliar” (by button press). (B) Objects appeared randomly rotated and revolved for one full turn (clockwise or counter-clockwise about variable axes in the frontal plane, inclination of 0°, +45°, or −45°). (C) Over the course of the week, as observers became familiar with recurring objects, classification performance improved. Here, performance (average and S.E.M.) is shown as a function of the number of presentations for 15 recurring objects, 8 observers, and 2 conditions. The relation between presentations and sessions was probabilistic (indicated by gray shading). (D) Reaction time (average and S.E.M.) as a function of presentation number. With increasing familiarity, reaction times decrease by 50% (from 1.7 s to 0.9 s) and become considerably shorter than the presentation time.


The main experiment extended over 3 successive weeks, with three sessions on separate days of both the 1st and 3rd week (no sessions took place in the 2nd week). The experiments of the 1st and 3rd week differed in four aspects: sequence type (structured or unstructured), the set of recurring objects, object color (red or blue), and responding hand (left or right). All aspects were counterbalanced across observers.

After the three scanning sessions of a week, observers participated in an additional behavioral session to confirm that they had in fact become familiar with every recurring object. Specifically, they performed a spatial search task in which they pointed out recurring target objects among non-recurring distractor objects (Kakaei et al., 2021). In addition, observers were offered the opportunity to voice anything they might have noticed about the experiment.

2.2 Experimental paradigm

Complex three-dimensional objects were computer-generated and presented as described previously (Kakaei et al., 2021). A movie can be viewed under this LINK. All objects were highly characteristic and dissimilar from each other, as confirmed computationally in terms of vector distances between depth maps (Kakaei et al., 2021). Objects were presented every 3 s, with 2.5 s viewing and 0.5 s transition time (Fig. 1A). Objects were shown from all sides and, after appearing at an arbitrary angle, revolved smoothly for one full turn (period 2.5 s, frequency 0.4 Hz, angular velocity 144°/s) about one of several axes in the frontal plane (−45°, 0°, or +45°; clockwise or counter-clockwise). Axes and directions were counterbalanced for each object, and initial viewing angles were chosen randomly (Fig. 1B). All stimuli were generated with MATLAB (The MathWorks, Inc.), presented with the Psychophysics Toolbox (Brainard, 1997), and viewed in a mirror mounted on the MR head coil (screen resolution 960×720 pixels, frame rate 60 Hz, subtending approximately 8°×6° of visual angle, average luminance 50 cd/m², background luminance 5 cd/m²). Observers responded with the right or left index finger on an MR-safe response box.

Fifteen objects recurred many times during three sessions (“recurring” objects), whereas other objects appeared exactly once (“non-recurring” or “singular” objects). As mentioned, observers classified every object as either “familiar” or “unfamiliar” by pressing a button during its presentation. Over the course of three sessions, all observers gradually became familiar with the “recurring objects” (see below). The average time-course of learning, as established by a simplified signal detection and reaction-time (RT) analysis, is shown in Figure 1C.

Every session comprised six sequences (“runs”), each lasting 600 s and presenting 180 “recurring” and 20 “non-recurring” objects (200 objects in total). As there were 15 different recurring objects, each such object was seen 12±1.9 times during every sequence. Over the three sessions (or 18 sequences), each recurring object appeared at least 190 times (mean ± S.D.: 216±9), whereas non-recurring objects appeared only once. Altogether, there were 3,240 presentations of recurring objects (3×6×180) and 360 presentations of non-recurring objects (3×6×20).

Presentation sequences started with a random recurring object and continued randomly to one of the possible next objects, with neither immediate repetitions (XX) nor direct returns (XYX) being allowed. Each sequence comprised 200 objects, of which 180 were recurring and 20 non-recurring, the latter interspersed at random positions. Object sequences were post-selected so as to counterbalance the number of appearances of every recurring object in every session.
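For illustration, the sketch below generates one candidate sequence under these constraints. It is a minimal sketch with hypothetical function and variable names; the actual generation and post-selection procedure, including the distinction between structured and unstructured sequences, is described in Kakaei et al. (2021).

% Sketch: one presentation sequence of 200 objects (180 recurring, 20
% non-recurring), avoiding immediate repetitions (XX) and direct returns
% (XYX). Hypothetical helper; the published procedure additionally
% post-selects sequences to counterbalance object counts.
function seq = make_sequence(nRecurring, nTotal, nNonRecurring)
    singularPos = randperm(nTotal, nNonRecurring);   % slots for non-recurring objects
    seq = zeros(1, nTotal);                          % 0 codes a non-recurring object
    prev1 = NaN; prev2 = NaN;                        % last two recurring objects
    for t = 1:nTotal
        if ismember(t, singularPos)
            seq(t) = 0;
        else
            candidates = setdiff(1:nRecurring, [prev1 prev2]);  % no XX, no XYX
            seq(t) = candidates(randi(numel(candidates)));
            prev2 = prev1;  prev1 = seq(t);
        end
    end
end

For example, seq = make_sequence(15, 200, 20) produces one candidate sequence, which could then be accepted or rejected during post-selection.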

All observers performed the experiment twice in the scanner, once during the 1st week and again during the 3rd week of the main experiment (so that 8 observers provided 16 data sets). As mentioned, the 2 weeks differed in terms of the recurring objects and the presentation sequence. “Structured” sequences exhibited predictive sequential dependencies (3 possible recurring next objects), whereas “unstructured” sequences did not (14 possible recurring next objects; see Kakaei et al., 2021 for details). As a result, the repetition latency (i.e., the latency between successive presentations of the same object) was 5.5±15 presentations (median and S.D.) for “structured” and 10.5±11 presentations for “unstructured” sequences. Further aspects and effects of sequence structure are reported and discussed in detail in a companion paper.

To verify that recurring objects had become familiar to observers, every observer performed 60 trials of a spatial search task with 3 recurring and 9 non-recurring objects. The 12 objects were positioned randomly in a 3×4 array and were presented for 30 s while rotating in three dimensions (as in the main experiment). After each presentation, observers indicated the recurring object positions with the computer mouse. Performance was consistently above 95% correct.

2.3 MRI acquisition

All magnetic-resonance images were acquired on a 3T Siemens Prisma scanner with a 64-channel head coil. Structural images were T1-weighted sequences (MPRAGE, TR = 2,500 ms, TE = 2.82 ms, TI = 1,100 ms, flip angle 7°, isotropic resolution 1×1×1 mm, matrix size 256×256×192). Functional images were T2*-weighted sequences (TR = 1,000 ms, TE = 30 ms, flip angle 65°, resolution 3×3×3.6 mm, matrix size 72×72×36). Field maps were obtained with gradient dual-echo sequences (TR = 720 ms, TE1 = 4.92 ms, TE2 = 7.38 ms, resolution 1.594×1.594×2 mm, matrix size 138×138×72).

2.4 fMRI pre-processing

Our approach to fMRI analysis was influenced by recent advances in comparing uni- and multivariate responses of corresponding voxels between different observers (e.g., Kumar et al., 2022; Nastase et al., 2019). The local correlation structure of voxel responses, which is similar in different observers, provided the basis for our functional parcellation (Dornas & Braun, 2018). This parcellation obviated “searchlight” strategies by defining, for all observers, corresponding brain “parcels” with corresponding episodes of high-dimensional ($O(10^3)$) multivariate activity.

The fMRI pre-processing procedure was similar to that published previously (Dornas & Braun, 2018). First, DICOM files were converted into NIFTI format using MRICRON (MRICRON Toolbox, Maryland, USA, NIH). Then, brain tissues were extracted and segmented using BET (Smith, 2002) and FAST (Zhang et al., 2001). Field map correction, head motion correction, spatial smoothing, high-pass temporal filtering, and registration to structural and standard images were performed with the MELODIC package of FSL (Beckmann & Smith, 2004).

Field map correction and registration to the structural image were carried out using Boundary-Based Registration (BBR; Greve & Fischl, 2009). MELODIC uses MCFLIRT (Jenkinson et al., 2002) to correct for head motion. Spatial smoothing was performed with SUSAN (Smith & Brady, 1997), with the full width at half maximum set to FWHM = 5 mm. To remove low-frequency artifacts, we applied a high-pass filter with a cut-off frequency of f = 0.01 Hz, that is, oscillations/events with periods of more than 100 s were removed. To register the structural image to MNI152 standard space with isotropic 2 mm voxel size, we used FLIRT (FMRIB’s Linear Image Registration Tool; Jenkinson & Smith, 2001; Jenkinson et al., 2002) with 12 degrees of freedom (DOF), followed by FNIRT (FMRIB’s Nonlinear Image Registration Tool) for non-linear registration. To further reduce artifacts arising from head motion, we applied despiking with a threshold of λ = 100 using the BrainWavelet toolbox (Patel et al., 2014). Subsequently, we regressed out the mean CSF activity as well as the 12 DOF translation and rotation factors estimated by the motion correction algorithm (MCFLIRT). Afterward, the time series of each voxel was detrended linearly and whitened (with the Matlab functions “detrend” and “zscore”).

Finally, the 160,099 voxels of MNI152 space were grouped into 758 functional parcels according to the MD758 atlas (Dornas & Braun, 2018). Each functional parcel is associated with an anatomically labeled region of the AAL atlas (Tzourio-Mazoyer et al., 2002) and comprises approximately 200 voxels, or approximately 1.7 cm³ of gray matter volume (212±70 voxels, range 45 to 462 voxels). Parcels were defined for a small population of observers so as to maximize signal covariance within parcels and to minimize covariance between parcels in the resting state. In contrast to other parcellation schemes, this approach was based exclusively on the (typically strong) functional correlations within each anatomical region and disregarded the (typically weak) correlations between different anatomical regions. The MD758 parcellation offers superior cluster quality, correlational structure, sparseness, and consistency with fiber tracking, compared with other parcellation schemes of similar resolution (Albers et al., 2021; Dornas & Braun, 2018).

2.5 fMRI data analysis

To study the neural representation of objects, we extracted the multivoxel activity pattern at $N_t = 9$ time points following object onset. In a functional parcel with $N_{vox}$ voxels, this response pattern constituted a point (or vector) in an $N_{dim}$-dimensional space, where $N_{dim} = N_t \cdot N_{vox}$ (Fig. 2A). To identify parcels with significant selectivity for individual recurring objects, we employed a representational similarity analysis (RSA; Kriegeskorte, Mur, & Bandettini, 2008) (Fig. 2B). This analysis uses the standardized Euclidean (Mahalanobis) distance between responses in a high-dimensional space to examine the separability of neural object representations as a function of learning, of object type (recurring or non-recurring), or of both. Over all 758 parcels, response dimensionality was $N_{dim} = 1,911 \pm 634$ (mean and standard deviation), with a range from 405 (Calcarine-L 329, with 45 voxels) to 4,113 (Postcentral-R 484, with 457 voxels).
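For illustration, a response pattern for one parcel and one trial can be assembled by concatenating the parcel’s voxel time courses over the chosen post-onset window. The sketch below assumes hypothetical variable names (boldParcel, onsets) and a window of 2–10 s after onset as one possible choice consistent with $N_t = 9$ and TR = 1 s.

% Sketch: assemble spatiotemporal activity patterns for one parcel.
% boldParcel : [Nvox x Nscans] preprocessed BOLD time series (TR = 1 s)
% onsets     : [1 x Ntrials] object onsets in units of scans
Nt   = 9;
lags = 2:10;                                % post-onset samples (assumed window)
Nvox = size(boldParcel, 1);
Ntrials = numel(onsets);
X = zeros(Nt * Nvox, Ntrials);              % Ndim x Ntrials pattern matrix
for k = 1:Ntrials
    snippet = boldParcel(:, onsets(k) + lags);   % Nvox x Nt
    X(:, k) = snippet(:);                        % vectorize to one column
end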

Fig. 2.

Analysis of fMRI activity with direct linear discriminant analysis (DLDA). For each functional parcel, DLDA identified the 14-dimensional space that optimally discriminated the 15 classes of activity patterns associated with 15 recurring objects. Other activity patterns, such as those associated with non-recurring objects, were also analyzed in this space. (A) For a given parcel with $N_{vox}$ voxels (e.g., yellow region Frontal-Inf-R 8), activity was recorded over 9 s during and following object presentation (2 to 11 s after onset). Each such activity pattern corresponds to a point (or vector) in a $9 \cdot N_{vox}$-dimensional space (right). Here, activity patterns associated with three object presentations are represented schematically (red, green, and blue spheres). (B) To cross-validate discriminability, recurring object presentations were divided randomly into a training set (90%) and a test set (10%). From the training set, the DLD subspace $S$ was established. Here, exemplars (solid spheres) and class centroids (crosses) are represented schematically. Next, the projections of test set patterns into this space were compared to class centroids. (C) Projection onto the line connecting class centroids $i$ and $j$ revealed the pairwise discriminability/dissimilarity $\delta_{i,j}$ of object classes $i$ and $j$ (top), and the distances to class centroids yielded the within-class and between-class variance of representations, $SS_W$ and $SS_B$, and the associated variance ratio $F = SS_B/SS_W$ (bottom). Additionally, a matrix of (mis-)classification probabilities $P(\mathrm{reported}\ i \,|\, \mathrm{true}\ j)$ (a.k.a. confusion matrix) could be obtained (not shown). (D) To assess object representation generally, test presentations were drawn randomly from the complete set of object presentations (left). To assess changes over the duration of the experiment, the set of presentations was divided into five successive “batches” and test presentations were drawn from one of these batches (bottom). In either case, the training set comprised all remaining presentations (i.e., the complement of the test set).


Our approach to RSA differed from previous work in some respects. Firstly, we analyzed high-dimensional spatiotemporal patterns of BOLD activity (200 voxels × 9 s, or $O(10^3)$ dimensions) in non-overlapping gray matter volumes (758 functional subdivisions of 90 anatomical regions, averaging 1.7 cm³; Dornas & Braun, 2018). Other studies have used lower-dimensional spatial activity patterns in overlapping searchlight volumes ($O(10^2)$ voxels or dimensions, covering 0.25 to 1.0 cm³; Kriegeskorte et al., 2006). Secondly, we employed multi-class linear discriminant analysis (“direct linear discriminant analysis,” DLDA; Yu & Yang, 2001), rather than pairwise discriminability or one-versus-all discriminability (e.g., Hung et al., 2005; Liu et al., 2009). With these modifications, RSA revealed representational geometry at the level of object exemplars, as well as gradual changes in this geometry over sessions and runs.

2.5.1 Linear discriminant analysis

To analyze the response variance that discriminates $\kappa = 15$ recurring objects, at most $(\kappa - 1)$ dimensions are required. Restricting the analysis to 14 principal components of the response could potentially have neglected smaller but more discriminating components. Accordingly, we performed a Linear Discriminant Analysis (LDA), which amounts to a “supervised” principal component analysis (PCA) and yields the $(\kappa - 1)$-dimensional orthonormal subspace $S$ that optimally discriminates the $\kappa$ response classes. Here, optimality is defined as simultaneously minimizing within-class variance and maximizing between-class variance of responses.

The results of LDA and PCA showed considerable commonality. Over the 758 parcels, the first 14 principal components captured 53±7% (mean and S.D.) of the total response variance, whereas the 14-dimensional subspaces S captured 33±7% of the total variance (or 61±6% of the principal component variance). Almost all of the subspace variance overlapped with the principal component variance (i.e., 88±5% of subspace variance projected into the space of the first 14 principal components, while the remaining 12±5% projected into the space of the remaining principal components).

Similar numbers were obtained for the 124 identity-selective parcels. The first 14 principal components captured 57±6% (mean and S.D.) of the total response variance, and subspaces S captured 38±6% of the total variance (or 67±4% of the principal component variance). Almost all of the subspace variance (91±3%) overlapped with the first 14 principal components. In summary, Linear Discriminant Analysis captured the useful (discriminating) part of correlated variance and distributed this variance more uniformly over its 14 orthonormal dimensions (6±3% per dimension) than principal component analysis could (4±6% per dimension).

A numerically tractable procedure for identifying the optimal subspace $S$ is available in terms of “direct LDA” (DLDA; Ye et al., 2006; Yu & Yang, 2001). Briefly, this method first diagonalizes the between-class variance to identify $\kappa - 1$ discriminative eigenvectors with non-zero eigenvalues, next diagonalizes the within-class variance, and finally yields a rectangular matrix for projecting activity patterns from the original activity space (dimensionality $N_{dim}$) to the maximally discriminative subspace $S$ and back. As this method is linear and relies on all available degrees of freedom, its results are deterministic. An important feature of this particular algorithm is that within-class variance is maintained near unity for all classes, by means of a suitable scaling of the subspace dimensions. A Matlab implementation of DLDA is available at github.com/cognitive-biology/DLDA.
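The sketch below illustrates the two diagonalization steps and the within-class scaling under the definitions above; it is a minimal sketch rather than the published implementation (which is available at the link given).

% Sketch of direct LDA (Yu & Yang, 2001): diagonalize between-class scatter
% first, then within-class scatter, and scale the subspace so that projected
% within-class variance is near unity.
% X : [Ndim x N] activity patterns (columns), labels : [1 x N] class labels
function W = dlda_sketch(X, labels)
    classes = unique(labels);  kappa = numel(classes);
    cAll = mean(X, 2);
    Hb = zeros(size(X, 1), kappa);           % "square root" of between-class scatter
    Xw = X;                                  % within-class deviations
    for i = 1:kappa
        idx = (labels == classes(i));
        ci  = mean(X(:, idx), 2);
        Hb(:, i) = sqrt(sum(idx)) * (ci - cAll);
        Xw(:, idx) = X(:, idx) - ci;
    end
    % Diagonalize Sb = Hb*Hb' via the small kappa-by-kappa problem
    [V, Db] = eig(Hb' * Hb);
    [db, order] = sort(diag(Db), 'descend');
    V = V(:, order(1:kappa-1));  db = db(1:kappa-1);   % kappa-1 non-zero eigenvalues
    Y = Hb * V * diag(1 ./ sqrt(db));        % orthonormal basis of the range of Sb
    Z = Y * diag(1 ./ sqrt(db));             % whitens the between-class scatter
    % Diagonalize the projected within-class scatter and rescale it to ~unity
    SwZ = (Z' * Xw) * (Xw' * Z) / size(X, 2);
    [U, Dw] = eig((SwZ + SwZ') / 2);
    W = Z * U * diag(1 ./ sqrt(max(diag(Dw), eps)));   % Ndim x (kappa-1) projection
end
% A pattern x is projected into the discriminative subspace as s = W' * x.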

2.5.2 Amplitudes, distances, and correlations

Activity patterns $x_k$ (with components $x_{jk}$, $j = 1, \ldots, \kappa-1$) associated with trials $k$ were analyzed in the maximally discriminative subspace $S$. The normalized amplitude $a_k = \sqrt{\tfrac{1}{\kappa-1} \sum_{j=1}^{\kappa-1} x_{jk}^2}$ of such patterns exhibited an average value of $\bar{a} = 0.99$. The normalized distance $d_{kl} = \sqrt{\tfrac{1}{\kappa-1} \sum_{j=1}^{\kappa-1} (x_{jk} - x_{jl})^2}$ between patterns associated with trials $k$ and $l$ measured on average $\bar{d} = 1.40$, consistent with the distance expected between random patterns of this amplitude ($\approx \sqrt{2}$). Averaging over trials $k$ produced normalized response amplitudes $A = \langle a_k \rangle_k$. Averaging over pairs of trials $k, l$ separated by a given latency $l - k$ produced normalized response distances $D(l-k) = \langle d_{kl} \rangle_{k,l}$.

The patterns from successive trials exhibited a weak temporal correlation, with approximately 5% smaller distances at delays below 4 trials and approximately 2% larger distances at delays ranging from 6 to 15 trials (see Supplementary Fig. S1A, B). Comparing pairs of trials with different types of objects, we observed approximately 3% larger response distances $D$ (at all latencies) for the same recurring objects than for either different recurring or non-recurring objects (Supplementary Fig. S1C). Differential response amplitudes $A$ increased marginally with latency, because response amplitudes tended to increase slightly over the course of each run (Supplementary Fig. S1D). This trend was evident for all types of objects and with both “structured” and “unstructured” sequences. In other words, the effect of object type on multivariate hemodynamic responses was limited to response distances and did not extend to response amplitudes. Thus, our data provided no evidence for “repetition suppression.”

For certain analyses (Sections 2.5.8 and 2.5.9), we established for each parcel $w$ the average delay-dependent distance $T_w(\Delta k) = \langle d_{w,u,r}(\Delta k) \rangle_{u,r}$ between patterns with a relative delay of $\Delta k$ trials, where the average was taken over subjects $u$ and runs $r$. The time-course $T_w$ allowed us to discount temporal correlations by computing $d^{corrected}_{w,u,r}(\Delta k) = d_{w,u,r}(\Delta k) - T_w(\Delta k) + \langle T_w(\Delta k) \rangle_{\Delta k}$, where $\langle T_w(\Delta k) \rangle_{\Delta k}$ is the average value over delays $\Delta k$.
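A minimal sketch of these quantities for one parcel, assuming the patterns have already been projected into the 14-dimensional subspace (matrix P, columns in trial order; pdist requires the Statistics and Machine Learning Toolbox):

% Sketch: normalized amplitudes, pairwise distances, and discounting of the
% delay-dependent component of pairwise distances.
% P : [(kappa-1) x Ntrials] projected patterns, columns in order of presentation
kappa = 15;
a = sqrt(sum(P.^2, 1) / (kappa - 1));            % normalized amplitudes a_k
D = squareform(pdist(P')) / sqrt(kappa - 1);     % normalized distances d_kl
Ntrials = size(P, 2);
T = zeros(1, Ntrials - 1);                       % delay-dependent profile T(dk)
for dk = 1:Ntrials - 1
    T(dk) = mean(diag(D, dk));                   % mean distance at trial lag dk
end
Dcorr = D;                                       % corrected distances (upper triangle)
for dk = 1:Ntrials - 1
    idx = logical(diag(ones(Ntrials - dk, 1), dk));
    Dcorr(idx) = D(idx) - T(dk) + mean(T);       % d - T(dk) + <T>
end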

2.5.3 Representation of shape “identity” for recurring objects

Our observations comprised approximately 200 activity patterns for each of the 15 recurring object classes (per observer and condition). To allow for cross-validation, we randomly divided these patterns into a larger “training set” (90%, or 190±7.7 per object class) and a smaller “test set” (10%, or 22±0.9 per object class) (Fig. 2B). Note that the “training set” comprised exclusively activity patterns associated with recurring objects. To reduce the variability introduced by random test sets, this selection was repeated $N_r = 20$ times and all statistical measures described below represent the average over repetitions. As illustrated in Figure 2C, in the discriminative subspace $S$, we compared the $n_i$ test set exemplars $x_k^i$ (where $k = 1, \ldots, n_i$) of class $i$ to the centroids $c_j^{train}$ established from the training exemplars of class $j$. To compute Mahalanobis distances and variance ratios (see below), we compared test set exemplars $x_k^i$ of class $i$ to the centroids $c_j^{test}$ of test set exemplars of class $j$.

We used three measures for this comparison, all with comparable results. Firstly, the nearest class centroid (among the $c_j^{train}$) to each pattern exemplar $x_k^i$ was identified to establish a matrix of classification probabilities $P(j|i)$ (the probability that an exemplar of class $i$ is nearest to the centroid of class $j$), also known as a “confusion matrix,” as well as the “classification accuracy” $\alpha = \sum_i P(i|i)\,P(i)$, which is the probability that the nearest centroid is the correct one.

Secondly, for each pair of object classes $(i,j)$, object exemplars $x_k^i$ and $x_k^j$ from the test set were projected onto the line connecting the two test set centroids, $c_i^{test}$ and $c_j^{test}$, and a pairwise discriminability/dissimilarity/Mahalanobis distance $\delta_{i,j}$ was computed from the means, $\mu_i$ and $\mu_j$, and variances, $\sigma_i^2$ and $\sigma_j^2$, of these projections, as $\delta_{i,j} = |\mu_i - \mu_j| \,/\, \sqrt{\tfrac{1}{2}(\sigma_i^2 + \sigma_j^2)}$. The average over all pairs of object classes was computed as $\delta = \tfrac{2}{\kappa(\kappa-1)} \sum_{i<j} \delta_{i,j}$.

Thirdly, given class centroids $c_i^{test}$ and the overall centroid $c^{test}$, we computed the Euclidean distances $d_k^i = \| x_k^i - c_i^{test} \|$ between exemplars $x_k^i$ and class centroid $c_i^{test}$ and, for each object class $i$, the “sum of squares” $SS_W^i = \sum_{k=1}^{n_i} (d_k^i)^2$. The “within-class” variance of all classes was computed as $SS_W = \tfrac{1}{N} \sum_{i=1}^{\kappa} SS_W^i$, where $N = \sum_{i=1}^{\kappa} n_i$. Similarly, from the Euclidean distances $d_i = \| c_i^{test} - c^{test} \|$ between individual and overall centroids, we computed the “between-class” variance $SS_B = \tfrac{1}{N} \sum_{i=1}^{\kappa} n_i\, d_i^2$. From the Euclidean distances $\| x_k^i - c^{test} \|$ between exemplars and the overall centroid, we computed the “total” variance $SS_T = \tfrac{1}{N} \sum_{i=1}^{\kappa} \sum_{k=1}^{n_i} \| x_k^i - c^{test} \|^2$. Variances $SS_W$, $SS_B$, and $SS_T$ are also denoted, respectively, $SS_{same}$, $SS_{diff}$, and $SS_{fam}$ further below. To quantify the discriminability of classes, the variance ratio $F^{identity} = SS_B\,(N-\kappa) \,/\, [SS_W\,(\kappa-1)]$ provided a non-parametric multivariate statistic (PERMANOVA; Anderson, 2001). The average within-class and between-class dispersion per dimension could be estimated as $\sigma_W = SS_W/(N-\kappa)$ and $\sigma_B = SS_B/(\kappa-1)$, respectively.
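A sketch of the three measures, given projected test patterns Ptest with class labels lab and the training-set centroids Ctrain (all hypothetical variable names; pdist2 requires the Statistics and Machine Learning Toolbox):

% Sketch: classification accuracy, pairwise discriminability, and variance ratio.
% Ptest  : [(kappa-1) x Ntest] projected test patterns, lab : [1 x Ntest] labels
% Ctrain : [(kappa-1) x kappa] class centroids from the training set
kappa = 15;
Ctest = zeros(kappa - 1, kappa);
for i = 1:kappa, Ctest(:, i) = mean(Ptest(:, lab == i), 2); end
% (1) classification accuracy: nearest training centroid
[~, pred] = min(pdist2(Ptest', Ctrain'), [], 2);
alpha = mean(pred(:) == lab(:));
% (2) pairwise discriminability from projections onto the centroid-connecting line
delta = zeros(kappa);
for i = 1:kappa
    for j = i+1:kappa
        v = Ctest(:, i) - Ctest(:, j);  v = v / norm(v);
        pi_ = v' * Ptest(:, lab == i);  pj_ = v' * Ptest(:, lab == j);
        delta(i, j) = abs(mean(pi_) - mean(pj_)) / sqrt((var(pi_) + var(pj_)) / 2);
    end
end
deltaMean = 2 * sum(delta(:)) / (kappa * (kappa - 1));
% (3) variance ratio F = [SSB / (kappa-1)] / [SSW / (N-kappa)]
N = numel(lab);  c0 = mean(Ptest, 2);  SSW = 0;  SSB = 0;
for i = 1:kappa
    Pi = Ptest(:, lab == i);
    SSW = SSW + sum(sum((Pi - Ctest(:, i)).^2));
    SSB = SSB + size(Pi, 2) * sum((Ctest(:, i) - c0).^2);
end
F = (SSB / (kappa - 1)) / (SSW / (N - kappa));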

2.5.4 Minimum statistic

To test for statistical significance, we computed average classification performance (in terms of both classification accuracy $\alpha_{obs}$ and F-ratio $F_{obs}$) over the $N_r$ test sets, as well as over $10^3$ first-level permutations of object identities (in each of the $N_r$ test sets). In principle, we could have tested an “individual null” hypothesis for every parcel and every data set, namely, the probability of obtaining the observed performance $\alpha_{obs}$ (or $F_{obs}$) purely by chance. Instead, we computed the “minimum statistic” $m = \min_k \alpha_k$ (or $m = \min_k F_k$) over data sets $k$, as well as over $10^5$ second-level permutations (drawn randomly from the first-level permutations), and tested the “global null” hypothesis, namely, the probability $p_n(m)$ of obtaining the observed minimum performance over $n$ data sets purely by chance. This computation was performed separately for each of the 2 conditions (8 data sets from 8 observers per condition) as well as for the union of conditions (16 data sets from 8 observers). When the “global null” hypothesis could be rejected, we inferred statistically significant classification performance in at least some data sets. Our threshold for significance was $p_n(m) < 0.05$ after correction for multiple comparisons (758 parcels and 2 conditions) (Allefeld et al., 2016).
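A sketch of the second-level minimum-statistic test for one parcel, assuming the observed accuracies and the first-level permutation accuracies have already been computed (hypothetical variable names):

% Sketch: "global null" test via the minimum statistic (Allefeld et al., 2016).
% accObs  : [1 x n]       observed accuracies for the n data sets
% accPerm : [Nperm1 x n]  first-level permutation accuracies per data set
n = numel(accObs);
Nperm1 = size(accPerm, 1);
Nperm2 = 1e5;                                  % second-level permutations
mObs = min(accObs);
mPerm = zeros(Nperm2, 1);
for p = 1:Nperm2
    draw = randi(Nperm1, 1, n);                % one first-level draw per data set
    vals = accPerm(sub2ind(size(accPerm), draw, 1:n));
    mPerm(p) = min(vals);
end
pGlobal = (sum(mPerm >= mObs) + 1) / (Nperm2 + 1);   % p-value of the global null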

2.5.5 Prevalence analysis

To summarize the results from all observers and conditions, we used a “prevalence analysis” (Allefeld et al., 2016). Prevalence $\gamma_{true}$ is the fraction of the $n = 16$ data sets with significant performance. To test the “prevalence null” hypothesis that $\gamma_{true}$ is below a threshold $\gamma_0 = 0.5$, an upper bound for $P(\gamma_{true} < \gamma_0)$ was obtained from the probability $p_n(m)$ of the minimum statistic over the $n = 16$ data sets, after correction for multiple comparisons:

$$P(\gamma_{true} < \gamma_0) \;\leq\; \left[\, \gamma_0 + (1 - \gamma_0)\, p_n(m)^{1/n} \right]^{n}.$$

This was the criterion used to label parcels as “identity selective.” The threshold prevalence $\gamma_0 = 0.5$ corresponded to a corrected probability $p_n(m) \leq 0.0012$ and a minimal accuracy of approximately 6.67% (i.e., near chance).

Additionally, we computed $\gamma_{est}$ as the largest value of $\gamma_0$ for which the “prevalence null” hypothesis could still be rejected:

$$\gamma_{est} = \frac{\alpha^{1/n} - p_n(m)^{1/n}}{1 - p_n(m)^{1/n}},$$

where $p_n(m)$ is the corrected minimum probability, $n = 16$ the number of data sets, and $\alpha = 0.05$ the significance threshold.
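In this form, the prevalence computation reduces to a few lines; the sketch below assumes the bound and the expression for $\gamma_{est}$ exactly as given above.

% Sketch: prevalence inference from the corrected minimum-statistic p-value.
pMin  = 0.0012;                      % example corrected probability p_n(m)
n     = 16;                          % number of data sets
alpha = 0.05;                        % significance threshold
pPrev = @(gamma0) ((1 - gamma0) .* pMin.^(1/n) + gamma0).^n;   % prevalence-null p-value
rejectMajority = pPrev(0.5) < alpha;                           % test gamma_0 = 0.5
gammaEst = (alpha^(1/n) - pMin^(1/n)) / (1 - pMin^(1/n));      % largest rejectable gamma_0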

2.5.6 Representation of shape “novelty” for non-recurring objects

Although recurring and non-recurring objects were comparable and generated in the same way, it seemed possible that neural representations might discriminate the class of 15 recurring objects from the class of 360 non-recurring objects. Indeed, the two classes became discriminable after observers had learned to classify recurring objects as “familiar” and non-recurring objects as “novel.” Accordingly, we considered this discriminability a representation of “novelty.”

To assess the neural representation of “novelty,” we divided non-recurring and recurring objects into two sets of unequal size (approximately $N = 216 \times 15$ recurring or “familiar” exemplars vs. $M = 360$ non-recurring or “novel” exemplars). From the Euclidean distances $d_k = \| x_k - c \|$ between test set exemplars $x_k$ and the centroids $c_{fam} = \tfrac{1}{N} \sum_{k=1}^{N} x_k$ or $c_{nov} = \tfrac{1}{M} \sum_{k=1}^{M} x_k$, we obtained the “within-class” variance $SS_W = SS_{fam} + SS_{nov}$, where $SS_{fam} = \tfrac{1}{N+M} \sum_{k=1}^{N} d_{k,fam}^2$ and $SS_{nov} = \tfrac{1}{N+M} \sum_{k=1}^{M} d_{k,nov}^2$. From the distances $d_{fam} = \| c_{fam} - c_{tot} \|$ and $d_{nov} = \| c_{nov} - c_{tot} \|$ between class centroids and the overall centroid $c_{tot} = \tfrac{N}{N+M}\, c_{fam} + \tfrac{M}{N+M}\, c_{nov}$, we obtained the “between-class” variance $SS_B = SS_{nov-fam} = \tfrac{N}{N+M}\, d_{fam}^2 + \tfrac{M}{N+M}\, d_{nov}^2 = \tfrac{NM}{(N+M)^2} \| c_{fam} - c_{nov} \|^2$. Finally, from the distances $d_k = \| x_k - c_{tot} \|$ between exemplars and the overall centroid, we obtained the total variance $SS_T = \tfrac{1}{N+M} \sum_{k=1}^{N+M} d_k^2$. To quantify the discriminability of non-recurring and recurring objects, we formed the variance ratio $F^{novelty} = SS_B\, (N + M - 2) / SS_W$ (Anderson, 2001). The average within-class and between-class dispersion per dimension was obtained from $\sigma_W = SS_W / (N + M - 2)$ and $\sigma_B = SS_B$, respectively.
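A sketch of this two-class decomposition, given the projected patterns of recurring and non-recurring objects (hypothetical variable names Pfam and Pnov):

% Sketch: two-class ("novelty") variance decomposition and F-ratio.
% Pfam : [(kappa-1) x N] patterns of recurring ("familiar") objects
% Pnov : [(kappa-1) x M] patterns of non-recurring ("novel") objects
N = size(Pfam, 2);  M = size(Pnov, 2);
cFam = mean(Pfam, 2);  cNov = mean(Pnov, 2);
cTot = (N * cFam + M * cNov) / (N + M);
SSfam = sum(sum((Pfam - cFam).^2)) / (N + M);
SSnov = sum(sum((Pnov - cNov).^2)) / (N + M);
SSW = SSfam + SSnov;                                 % within-class variance
SSB = N * M / (N + M)^2 * sum((cFam - cNov).^2);     % between-class variance
SST = (sum(sum((Pfam - cTot).^2)) + sum(sum((Pnov - cTot).^2))) / (N + M);
Fnovelty = SSB * (N + M - 2) / SSW;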

2.5.7 Changes of representation analyzed in “batches”

To assess changes in neural representations over the course of the experiment, while also allowing for cross-validation, we divided all recurring object presentations into five successive “batches” $B_1, B_2, \ldots, B_5$, each with 20% of the presentations (Fig. 2D). In this way, we could select “test sets” for cross-validated DLDA from one particular batch, while retaining all other presentations as a “training set.” As every recurring object was presented 210±9 times over all sessions, a batch comprised 42±1.8 presentations, a test set 21±0.9 presentations, and a training set 189±8.1 presentations. To reduce the variance deriving from test set selection, we repeated the random selection $N_r = 20$ times and averaged over repetitions.

To quantify representational changes over the course of learning, we computed the variance ratios $F^{identity}_{m,w,u}$ for each temporal window or batch $m$, identity-selective parcel $w$, and data set $u \in \{1, \ldots, 16\}$. We formed the average ratio over the 16 data sets, $F^{identity}_{m,w} = \langle F^{identity}_{m,w,u} \rangle_u$, and assessed statistical significance by shuffling ($10^3$ permutations) the identity of recurring objects to obtain the distribution of variance ratios expected from chance or data structure. The mean $\mu_{m,w}$ and variance $\sigma_{m,w}^2$ of this distribution could also be used to convert $F^{identity}_{m,w}$ into z-score values $Z^{identity}_{m,w} = (F^{identity}_{m,w} - \mu_{m,w}) / \sigma_{m,w}$.

Additionally, we performed a regression analysis and quantified representational changes in terms of linear trends. Specifically, we determined a “rate” parameter $\beta_w^{identity}$ by fitting a linear mixed-model $F^{identity}_{m,w,u} = \beta_{0,w} + \beta_w^{identity}\, m + \xi_{0,w,u} + \xi_{1,w,u}\, m + \epsilon_{m,w,u}$, with data sets $u$ as the grouping variable, where $\beta_{0,w}$ was a fixed-effect coefficient, $\xi_{0,w,u}$ and $\xi_{1,w,u}$ were random-effect coefficients, and $\epsilon_{m,w,u}$ was the residual error.
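This model corresponds to a random-intercept, random-slope fit with data sets as the grouping variable, which could be obtained, for example, with MATLAB's fitlme (hypothetical table and variable names):

% Sketch: linear mixed-model for the "rate" of representational change in one
% parcel, with data sets as grouping variable (random intercept and slope).
% Fvals : variance ratios, m : batch number, dataset : data-set label (1..16)
tbl = table(Fvals(:), m(:), categorical(dataset(:)), ...
            'VariableNames', {'F', 'm', 'dataset'});
lme = fitlme(tbl, 'F ~ 1 + m + (1 + m | dataset)');
betaRate = lme.Coefficients.Estimate(2);     % fixed-effect slope ("rate")
pRate    = lme.Coefficients.pValue(2);       % significance of the linear trend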

Similarly, to assess whether neural representations of non-recurring objects change with learning, we divided all object presentations (recurring and non-recurring) into five successive “batches” $B_1, B_2, \ldots, B_5$, each with 20% of the presentations (Fig. 2D), to obtain variance ratios $F^{novelty}_{m,w,u}$ for each temporal window or batch $m$, identity-selective parcel $w$, and data set $u \in \{1, \ldots, 16\}$. After averaging over the 16 data sets, $F^{novelty}_{m,w} = \langle F^{novelty}_{m,w,u} \rangle_u$, we assessed statistical significance by shuffling ($10^3$ permutations) the identity of recurring and non-recurring objects to obtain the distribution of variance ratios expected from chance or data structure. The mean $\mu_{m,w}$ and variance $\sigma_{m,w}^2$ of this distribution were used to convert $F^{novelty}_{m,w}$ into z-score values $Z^{novelty}_{m,w} = (F^{novelty}_{m,w} - \mu_{m,w}) / \sigma_{m,w}$.

Additionally, we performed a regression analysis to establish linear trends. Changes in the representation of object “novelty” were assessed by fitting the “rate” parameter $\beta_w^{novelty}$ in a linear mixed-model $F^{novelty}_{m,w,u} = \beta_{0,w} + \beta_w^{novelty}\, m + \xi_{0,w,u} + \xi_{1,w,u}\, m + \epsilon_{m,w,u}$, with data sets $u$ as the grouping variable, where $\beta_{0,w}$ was a fixed-effect coefficient, $\xi_{0,w,u}$ and $\xi_{1,w,u}$ were random-effect coefficients, and $\epsilon_{m,w,u}$ was the residual error.

To establish linear trends $F_m = \langle F_{m,w,u} \rangle_{w,u}$ (of either identity or novelty) that average over both parcels $w$ and data sets $u$, we obtained a rate parameter $\beta_1$ by fitting the linear mixed-model $F_{m,w,u} = \beta_0 + \beta_1\, m + \xi_{0,w,u} + \xi_{1,w,u}\, m + \epsilon_{m,w,u}$, with both parcels and data sets as grouping variables.

2.5.8 Geometry of representations

In the cross-validated analyses described above, the subspaces $S$ differed slightly between batches (and training sets). To analyze the geometry of neural representations in a stable framework, we repeated some analyses in fixed subspaces $S$ that reflected all observations (i.e., all activity patterns $x_k$ of recurring objects). In the fixed subspace, we calculated the normalized amplitude $a_k = \| x_k \| / \sqrt{\kappa - 1} = \sqrt{ \sum_{j=1}^{\kappa-1} x_{jk}^2 / (\kappa - 1) }$ of individual patterns $k$ and the normalized pairwise distance $d_{kl} = \| x_k - x_l \| / \sqrt{\kappa - 1} = \sqrt{ \sum_{j=1}^{\kappa-1} (x_{jk} - x_{jl})^2 / (\kappa - 1) }$ between two patterns $k$ and $l$.

For each parcel $w$, data set $u$, and run $r$, we obtained the average amplitude $A^{tot}_{w,u,r} = \tfrac{1}{N+M} \sum_{k=1}^{N+M} a_k$ of all patterns, the average amplitude $A^{fam}_{w,u,r} = \tfrac{1}{N} \sum_{k=1}^{N} a_k$ of recurring patterns, and the average amplitude $A^{nov}_{w,u,r} = \tfrac{1}{M} \sum_{k=1}^{M} a_k$ of non-recurring patterns. Similarly, we obtained the average pairwise distance $D^{tot}_{w,u,r} = \tfrac{2}{(N+M)(N+M-1)} \sum_{k<l} d_{kl}$ between all patterns, the average distance $D^{nov}_{w,u,r} = \tfrac{2}{M(M-1)} \sum_{k<l} d_{kl}$ between non-recurring patterns, the average distance $D^{fam}_{w,u,r} = \tfrac{2}{N(N-1)} \sum_{k<l} d_{kl}$ between recurring patterns, and the average distance $D^{nov-fam}_{w,u,r} = \tfrac{1}{MN} \sum_{k=1}^{M} \sum_{l=1}^{N} d_{kl}$ between pairs comprising one recurring and one non-recurring pattern. For recurring patterns, we further obtained the average distance $D^{same}_{w,u,r} = \tfrac{2}{N(N/\kappa - 1)} \sum_{i=1}^{\kappa} \sum_{k<l}^{n_i} d_{kl}$ between pairs of recurring patterns in the same class and the average distance $D^{diff}_{w,u,r} = \tfrac{1}{N(N - N/\kappa)} \sum_{i=1}^{\kappa} \sum_{k=1}^{n_i} \sum_{l \notin i} d_{kl}$ between pairs in different classes. All distances were corrected for temporal auto-correlation by subtracting the time course $T_w(\Delta k)$, as described above.

As described further above, the distances between individual activity patterns and the different centroids—$c_{tot}$, $c_{nov}$, and $c_{fam}$—yielded the total variance $SS_T = SS_{tot}$, the within-class variance $SS_W = SS_{fam} + SS_{nov}$, and the between-class variance $SS_B = SS_{nov-fam}$. For recurring patterns, the distances to individual class centroids $c_i$ and to the overall centroid $c_{fam}$ yielded the total variance $SS_T = SS_{fam}$, the within-class variance $SS_W = SS_{same}$, and the between-class variance $SS_B = SS_{diff}$.

These values were computed for each parcel $w$, observer $u$, and run $r$, in order to obtain the variance fractions $F^{fam}_{w,u,r} = SS_{fam}/SS_{tot}$, $F^{nov}_{w,u,r} = SS_{nov}/SS_{tot}$, $F^{nov-fam}_{w,u,r} = SS_{nov-fam}/SS_{tot}$, $F^{same}_{w,u,r} = SS_{same}/SS_{fam}$, and $F^{diff}_{w,u,r} = SS_{diff}/SS_{fam}$, as well as the variance ratios $R^{identity}_{w,u,r} = SS_{diff}\,(N-\kappa) / [SS_{same}\,(\kappa-1)]$ and $R^{novelty}_{w,u,r} = SS_{nov-fam}\,(N+M-2) / (SS_{nov} + SS_{fam})$.

2.5.9 Changes with learning analyzed by “runs”

Fixed subspaces permitted us to assess representational changes between successive “runs.” To this end, we computed the average amplitudes $A_{w,u,r}$, distances $D_{w,u,r}$, variances $SS_{w,u,r}$, and variance ratios $F_{w,u,r}$, as described above, for each parcel $w$, data set $u \in \{1, \ldots, 16\}$, and run $r$. Within each session $s$, we assessed the changes of these parameters $Y \in \{A, D, SS, F\}$ over runs $r \in s$ by determining a “rate” parameter $\beta_s$, separately for identity-selective and non-selective parcels $w$. Each coefficient $\beta_s$ was obtained from a linear mixed-model $Y_{r,w,u} = \beta_{0,s} + \beta_s\, r + \xi_{0,w,u} + \xi_{1,w,u}\, r + \epsilon_{r,w,u}$, with observers and parcels as grouping variables, where $\beta_{0,s}$ was a fixed-effect coefficient, $\xi_{0,w,u}$ and $\xi_{1,w,u}$ were random-effect coefficients, and $\epsilon_{r,w,u}$ was the residual error. The same approach was used to assess gradual changes over runs in the centroid-to-centroid distances $D^{same}(r)$, $\Delta D^{same}(r)$, $D^{nov}(r)$, and $\Delta D^{nov}(r)$. This served to test the statistical significance of linear rates $\beta_s$ in each session. Sessions with significant rates are marked by stars in Figure 6.

2.5.10 Stability of shape identity and novelty representations

We also assessed the stability of the representation of the 16 response classes (15 recurring and 1 non-recurring) over the course of the experiment. To this end, we compared the average representation in individual runs $r$ (centroids $C_r$ of responses to exemplars) to the average representation over all runs (centroids $C_{ave}$). For observer $u$, identity-selective parcel $w$, and object class $i$, we calculated the Euclidean distance $D_{u,w,i,r}$ between the relevant $C_r$ and $C_{ave}$, and also the difference $\Delta D_{u,w,i,r}$ between the relevant centroids of successive runs, $C_r$ and $C_{r+1}$. After averaging over observers $u$, identity-selective parcels $w$, and object classes $i$, we obtained $D^{same}(r)$ and $\Delta D^{same}(r)$ for recurring objects and $D^{nov}(r)$ and $\Delta D^{nov}(r)$ for non-recurring objects.

As a baseline for comparison, we also computed the distances $D_{u,w,i,r}$ and differences $\Delta D_{u,w,i,r}$ that may be expected purely on the basis of response variance. To this end, we permuted the sequence of all 3,600 trials, separately within each of the 16 response classes (15 recurring and 1 non-recurring), so as to obtain 18 “pseudo-runs” with 200 trials each. Expectation values were obtained by repeating this procedure $N_r = 1,000$ times.

We note that, in an $n$-dimensional hypersphere of unit radius, the average Euclidean distance between two random points is

$$d_{ave}(n) = \frac{2^{n}\, \Gamma\!\left(\frac{n+1}{2}\right)^{2}}{\sqrt{\pi}\; \Gamma\!\left(n + \frac{1}{2}\right)},$$

with $d_{ave} \approx 1.4017$ for $n = 14$.
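This value can be checked by Monte Carlo simulation, treating the $n$-dimensional hypersphere as the unit sphere embedded in ($n$+1)-dimensional Euclidean space and drawing points uniformly from its surface (a sketch; sample size chosen arbitrarily):

% Sketch: Monte Carlo estimate of the mean distance between random points on
% the n-dimensional unit hypersphere, compared with the closed-form value.
n = 14;  Nsamples = 1e5;
X = randn(n + 1, Nsamples);  X = X ./ vecnorm(X);    % uniform points on the sphere
Y = randn(n + 1, Nsamples);  Y = Y ./ vecnorm(Y);
dMC = mean(vecnorm(X - Y));                          % approx. 1.4017
dFormula = 2^n * gamma((n + 1) / 2)^2 / (sqrt(pi) * gamma(n + 1/2));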

2.5.11 Dimensional reduction

To visualize representational geometry in two dimensions, we randomly sampled 50 response patterns for each of the recurring and non-recurring objects within the first and the last sessions and calculated a 1,600×1,600 pairwise distance matrix $D_{w,u}$ for each identity-selective parcel $w$ and subject $u$. We did not wish to average distance matrices over observers, as we did not expect the activity patterns of different observers to be comparable. To sidestep this difficulty, we permuted the order of recurring objects 100 times and, for each subject, obtained an average matrix $\bar{D}$ over permutations, which was then averaged over subjects. To visualize the representational geometry of identity in the first and the last session, we used multidimensional scaling (Matlab function mdscale, metric stress) to map the distance matrices for recurring objects (50 exemplars from the first session and 50 exemplars from the last session) into a two-dimensional space. To visualize the representational geometry of novelty, we restricted the distance matrix to non-recurring objects (50 exemplars from the first session and 50 exemplars from the last session) and just 3 of the 15 recurring objects (20 exemplars from either session).
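The embedding step itself is a single call to mdscale; a sketch, assuming the averaged and appropriately restricted distance matrix is held in a variable Dbar (hypothetical names):

% Sketch: two-dimensional embedding of an averaged distance matrix with
% metric-stress multidimensional scaling (Statistics and Machine Learning Toolbox).
Dsym = (Dbar + Dbar') / 2;                            % enforce exact symmetry
Y2 = mdscale(Dsym, 2, 'Criterion', 'metricstress');   % [Nexemplars x 2] coordinates
scatter(Y2(:, 1), Y2(:, 2), 20, groupLabels, 'filled');  % color by object/session (hypothetical labels)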

Observers viewed sequences of computer-generated objects, with each object shown for 2.5 s while rotating in three dimensions (Fig. 1A, B; a movie may be viewed HERE). Over three sessions, observers viewed 3,600 objects in total, of which 3,240 were presentations of recurring objects (15 different objects, each appearing approximately 216 times) and 360 were presentations of non-recurring objects (360 different objects, each appearing once). The display was intended to be sufficiently intriguing to remain interesting over 3 successive days. To this end, presentations never repeated exactly. Observers were required to classify each object as “familiar” (recurring) or “novel” (non-recurring). Task performance improved as observers became increasingly familiar with recurring objects, as illustrated in Figure 1C. Over the first 600 presentations, classification performance improved from approximately 50% correct (chance) to 85% correct, and reaction times decreased from approximately 1.65 s to 1.25 s. Over the remaining 3,000 presentations, performance improved further to approximately 90% correct and reaction times decreased further to approximately 0.95 s.

All sessions were performed in an MRI scanner while whole-brain functional imaging data were being collected. In the following, we report the results of three types of analyses. First, we describe the cortical areas in which multivariate BOLD activity encodes information about the identity of recurring objects (“object identity”), as determined by cross-validated analyses of entire data sets (3 sessions per observer). Second, we describe changes in cortical representations over coarse time intervals, by means of cross-validated analyses of successive parts of the data sets (3 sessions divided into 5 batches). These changes pertain to the encoding of both recurring objects and the distinction between recurring and non-recurring objects (“object novelty”). Third, we describe changes in representations over finer time intervals (3 sessions divided into 18 runs), by foregoing cross-validation and adopting a fixed reference frame. These finer intervals confirm the results from coarse intervals but reveal more details about the geometry of neural representations and their development over time.

3.1 Cross-validated representation of object identity

To assess the extent to which multivariate neural responses to recurring objects encoded object identity, we relied on optimal linear classifiers combined with cross-validation (“direct linear discriminant analysis,” DLDA; see Methods for details). Specifically, we quantified the “identity” information in the multivariate responses of every parcel $w \in \{1, \ldots, 758\}$ and data set $u \in \{1, \ldots, 16\}$ in terms of classification accuracy $\alpha_{w,u}$, average pairwise dissimilarity $\delta_{w,u}$, and the ratio of between-class and within-class variance $F_{w,u}$. All three measures proved highly correlated and supported similar conclusions. For example, Figure 3B illustrates the correlation of classification accuracy $\alpha_{w,u}$ and variance ratio $F_{w,u}$ ($\rho = 0.94$, $p < 0.001$). The correlations of $\alpha_{w,u}$ and $\delta_{w,u}$ ($\rho = 0.95$, $p < 0.001$) and of $\delta_{w,u}$ and $F_{w,u}$ ($\rho = 0.98$, $p < 0.001$) were comparably strong. The results of individual observers from the two experimental conditions (structured and unstructured object sequences) were highly similar as well, demonstrating test-retest consistency (Supplementary Fig. S2).

Fig. 3.

Neural representation of object identity. (A) Identity-selective parcels are shown in color (124 of 758 parcels) on an inflated standard brain and are found in the occipital (70 parcels), parietal (29), temporal/fusiform (18), and frontal cortex (7). Color indicates classification accuracy $\alpha_w^{ave}$ (average over 16 data sets) and ranges from chance to the largest observed value (6.67% to 17%). Parcels are identified by AAL region and number (in color), as detailed in Appendix Table A1. (B) Classification accuracy $\alpha_{w,u}$ and variance ratio $F_{w,u}$ for all 758 parcels $w$ and 16 data sets $u$. Both values differ highly significantly from the values obtained with shuffled object identities (red cross and ellipse, representing mean ± 3 S.D.). Two particular parcels are highlighted (Calcarine-L 331 in red, Parahippocampus-R 325 in blue, magnified in the inset) to illustrate the variability of data sets. (C) Minimum values $\alpha_w^{min}$ and $F_w^{min}$ for all parcels over 16 data sets. Identity-selective parcels are colored according to $\alpha_w^{ave}$, as in (A). A minimum above the chance level of 6.67% corresponds to a prevalence $\gamma$ above 0.5 (dotted vertical line). The distributions obtained with shuffled identities are indicated as well (red cross and ellipse).


For most parcels, the results from different observers showed considerable variability. Whereas a few parcels exhibited significant accuracy αw,u and variance ratio Fw,u in all data sets (e.g., Calcarine 331), in many parcels the representation of object identity was significant only in some data sets (e.g., Parahippocampus 325) (Fig. 3B). Global significance was assessed by comparing the minimal accuracy or variance ratio over the 8 data sets from one condition (structured or unstructured) to the minimal values obtained with shuffled data (red ellipse in Fig. 3C, see Methods for details).

Minimal classification accuracy $\alpha_w$ was significant in 17% of all parcels (128 of 758) in the structured sequence condition and in 19% of parcels (146 of 758) in the unstructured condition ($p < 0.05$, corrected for multiple comparisons), when compared with null distributions obtained from shuffled object identities. For minimal variance ratios $F_{w,u}$, the corresponding values were 18% and 17%, respectively (136 and 130 parcels). To combine the results from both conditions, we used a “prevalence” analysis to determine parcels in which “identity” was represented significantly in a majority of all 16 data sets (prevalence $\gamma \geq 0.5$), once again comparing the observed minimal values to the minimal values obtained with shuffled data (red ellipse in Fig. 3C, see Methods for details).

Figure 3A illustrates the 124 parcels identified as significantly “identity-selective” by the prevalence criterion $\gamma \geq 0.5$, and Supplementary Figure S3 shows the same information on a sliced brain. Among these were 70 parcels in the occipital cortex, 29 in the parietal cortex, 18 in the fusiform or temporal cortex, and 7 in the frontal cortex. The average prevalence of identity-selectivity in these parcels was 0.663±0.016 (mean and S.D.), and the minimal value was 0.58. As the prevalence criterion (based on 16 data sets) was marginally more conservative than the accuracy criterion (based on 8 data sets), 120 of the 124 parcels were significantly “identity-selective” in terms of both criteria. The four exceptions (identified only by prevalence, but not by accuracy) were Frontal-superior-R 56, Occipital-superior-R 393, Occipital-middle-L 403, and Parietal-superior-R 510. Appendix Table A1 lists the statistical significance of all three criteria for all “identity-selective” parcels.

Overall, there was a pronounced posterior-anterior gradient. Whereas many parcels at the posterior pole of the brain exhibited high classification accuracy, this tended to progressively decrease at more anterior locations (Fig. 3A; Supplementary Fig. S3; Appendix Table A1). To formalize this trend, we assigned 66 of the 124 identity-selective parcels to the 25 topographic visual areas defined by Wang et al. (2015) and, additionally, to the anterior inferior temporal cortex (AIT) and to the inferior frontal cortex (IFC). Supplementary Figure S6 provides an overview of all topographically assigned and non-assigned parcels selective for identity. As illustrated in Figure 8A, this assignment showed that accuracy was comparable in early visual areas (V1-hV4) and in the posterior-ventrolateral regions of the temporal lobe, whereas accuracy was lower in the anterior temporal cortex, the inferior frontal cortex, and in parietal cortical areas.

3.2 Cross-validated changes with learning

To assess changes with learning, we separately analyzed five successive and non-overlapping sets of trials (“batches”) with linear classifiers and cross-validation (see Methods for details). Specifically, we established ratios of between- and within-class variance for both object identity (15 classes formed by responses to 15 recurring objects) and for object novelty (2 classes formed by responses to recurring and non-recurring objects, respectively). These two variance ratios measured the neural representation of “identity” and “novelty.”

Variance ratios were converted to z-score values (with respect to the mean and variance of the corresponding shuffle distribution) before being averaged over data sets and/or over parcels. Figure 4A summarizes the results in terms of a grand average over all identity selective parcels. The average identity and novelty ratios were highly significant in all batches (p<0.001). Over successive batches, the average identity ratio weakened slightly but significantly (p<0.05), whereas the average novelty ratio strengthened considerably, especially between batches m=1 and m=2 (p<0.001).

Fig. 4.

Changes in the representation of “identity” and “novelty” over successive “batches” of trials. (A) Ratio of between- and within-class variance for object “identity” ($\kappa = 15$ classes, inset top left) and object “novelty” (2 classes, inset bottom left). Average variance ratios $F_m^{identity}$ (blue, mean ± S.E.M.) and $F_m^{novelty}$ (red, mean ± S.E.M.) as a function of batch number $m$. While $F_m^{identity}$ decreases slightly over time ($p < 0.05$), $F_m^{novelty}$ increases considerably ($p < 0.001$), especially initially. All values are averages over data sets in z-score units. (B) Average within- and between-class variances (mean ± S.E.M.) as a function of batch number $m$. Whereas between-class variances decrease ($SS_{B,m}^{identity}$, $p < 0.05$) or increase ($SS_{B,m}^{novelty}$, $p < 0.001$), within-class variances remain unchanged. All values are averages over data sets, relative to shuffled averages. (C) Results of the regression analysis for 124 identity-selective parcels $w$: linear “rate” parameters $\beta_w^{identity}$ and $\beta_w^{novelty}$ compared with each other and with classification accuracy $\alpha_w$. Novelty and identity rates correlate weakly over parcels (left, $\rho = 0.298$, $p < 0.001$), as do novelty rate and classification accuracy $\alpha_w^{identity}$ (middle, $\rho = -0.22$, $p < 0.05$). Identity rates $\beta_w^{identity}$ and accuracies $\alpha_w$ correlate strongly and negatively (right, $\rho = -0.74$, $p < 0.001$). Significance of linear trends is indicated by * for $p < 0.05$ and ** for $p < 0.001$.


As expected, it was the between-class variances SS_B^identity and SS_B^novelty that changed significantly over successive batches m (p<0.05 and p<0.001, respectively), whereas the within-class variances SS_W^identity and SS_W^novelty remained essentially unchanged (n.s.), as illustrated in Figure 4B. This was owing to the DLDA algorithm, which maintained within-class variance near unity. Nevertheless, over successive batches, the neural representations of recurring objects tended to become slightly more similar to each other, but more dissimilar to the representations of non-recurring objects.

To ascertain that these overall trends also hold for individual parcels, we carried out more conventional regression analyses of the variance ratios F_m,w,u^identity and F_m,w,u^novelty over batches m, parcels w, and data sets u. Specifically, we fitted linear mixed-models in order to estimate “rate” parameters β_w^identity and β_w^novelty for each identity-selective parcel w. The results revealed negative rates β_w^identity and positive rates β_w^novelty for almost all parcels, confirming the overall trends in Figure 4C. The variability over parcels was numerically larger for β_w^novelty (0.15±0.1, mean and S.D.) than for β_w^identity (0.022±0.015), and the two rates were weakly correlated (ρ = 0.30, p<0.001). Classification accuracy α_w^identity correlated negatively with β_w^novelty (ρ = −0.22, p<0.05) and with β_w^identity (ρ = −0.74, p<0.001).
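As an illustration of this step, a linear mixed model with a random intercept per data set could be fitted with statsmodels. The column names ('F', 'batch', 'dataset') and the helper fit_rate are hypothetical; the authors' actual model specification may differ.

```python
import pandas as pd
import statsmodels.formula.api as smf

def fit_rate(df_parcel):
    """Estimate the linear rate of change of a (z-scored) variance ratio over batches.

    df_parcel : one parcel's observations, with columns 'F' (variance ratio),
                'batch' (1..5), and 'dataset' (data-set identifier).
    Returns the fixed-effect slope of 'batch' (the rate beta).
    """
    # random intercept per data set; 'batch' enters as a fixed linear effect
    model = smf.mixedlm("F ~ batch", df_parcel, groups=df_parcel["dataset"])
    result = model.fit()
    return result.params["batch"]

# Hypothetical usage, e.g. beta_novelty for one parcel w:
# beta_novelty = fit_rate(df[(df.parcel == w) & (df.measure == "novelty")])
```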

To take a closer look at the interaction between “novelty” and “identity,” we divided the identity-selective parcels into “novelty terciles” (high, medium, and low, defined by β^novelty) and compared the representations of novelty (F^novelty) and identity (accuracy α) within each tercile (Fig. 5B). The results differed substantially between batches and terciles. In early batches, F^novelty and α correlated in all terciles, suggesting that the representations of non-recurring and recurring objects were initially linked. In later batches, however, this correlation waned in the upper tercile. This may suggest that pronounced representations of non-recurring objects progressively detached from representations of recurring objects.
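A tercile analysis of this kind might be sketched as follows; the DataFrame layout and column names are assumptions for illustration only, not the study's implementation.

```python
import pandas as pd
from scipy.stats import pearsonr

def tercile_correlations(parcels):
    """Correlate F_novelty with accuracy alpha, per batch and per novelty tercile.

    parcels : DataFrame with one row per parcel and batch, and columns
              'parcel', 'batch', 'beta_novelty', 'F_novelty', 'alpha'.
    """
    # assign each parcel to a tercile according to its novelty rate
    betas = parcels.groupby("parcel")["beta_novelty"].first()
    terciles = pd.qcut(betas, 3, labels=["low", "medium", "high"])
    parcels = parcels.assign(tercile=parcels["parcel"].map(terciles).astype(str))

    rows = []
    for (batch, terc), grp in parcels.groupby(["batch", "tercile"]):
        rho, p = pearsonr(grp["F_novelty"], grp["alpha"])
        rows.append({"batch": batch, "tercile": terc, "rho": rho, "p": p})
    return pd.DataFrame(rows)
```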

Fig. 5.

Neural representation of “novelty” in terms of the variance ratio F^novelty and its development over successive batches. (A) Identity-selective parcels and their individual rate parameters β^novelty (color scale), estimated by fitting linear mixed-models to the F^novelty values (from all batches and data sets). (B) Development of F^novelty (mean ± S.E.M.) for different “novelty terciles” (upper, middle, and lower tercile of parcels defined by β^novelty). (C) Correlation between F_w^novelty and accuracy α_w for different batches and novelty terciles. The parcels of each tercile are distinguished by color, with individual regression lines (dashed) and correlation coefficients ρ (* indicates p<0.05).


Figure 5A illustrates the degree to which individual identity-selective parcels express the overall novelty trend, as quantified by the fitted rate β_w^novelty, and Supplementary Figure S4 shows the same information on brain slices. An anterior-posterior gradient is evident, with a more pronounced representation of novelty at anterior than at posterior locations. This gradient is also apparent when parcels are assigned to topographic visual areas, as illustrated in Figure 8B. Appendix Table A1 lists the rates β_w^novelty for all identity-selective parcels.

3.3 Geometry of identity and novelty representations

Next, we present results from alternative analyses relying on fixed subspaces S for each data set (3,600 trials). Fixed subspaces reveal a more detailed geometry of neural representations and allow any changes in this geometry to be tracked over successive runs (200 trials each). The disadvantage of this approach is that it precludes cross-validation. Our aim was to establish not just between- and within-class variances, but also the distances underlying the variances, and the response amplitudes underlying the distances. For the representation of object “identity,” the within- and between-class geometry was defined by response pairs to same and to different recurring objects, respectively. For the representation of object “novelty,” the within-class geometry reflected responses either to pairs of familiar (recurring) or to pairs of novel (non-recurring) objects, whereas the between-class geometry concerned responses to mixed pairs of objects (novel-familiar).
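The distance categories described here can be illustrated with a short sketch that sorts pairwise Euclidean distances according to the identity and recurrence of the two trials involved. The names are ours and the details (e.g., the residualization of responses) are simplified.

```python
import numpy as np
from itertools import combinations

def pairwise_distance_categories(X, identity, recurring):
    """Sort pairwise response distances into the categories used in the text.

    X         : (n_trials, n_dims) responses in the fixed subspace S
    identity  : (n_trials,) object identity per trial
    recurring : (n_trials,) boolean, True for recurring (familiar) objects
    Returns a dict of distance lists: 'same', 'diff', 'nov', 'novfam'.
    """
    cats = {"same": [], "diff": [], "nov": [], "novfam": []}
    for i, j in combinations(range(len(X)), 2):
        d = np.linalg.norm(X[i] - X[j])
        if recurring[i] and recurring[j]:
            # within-class (same object) or between-class (different objects)
            cats["same" if identity[i] == identity[j] else "diff"].append(d)
        elif not recurring[i] and not recurring[j]:
            cats["nov"].append(d)       # pairs of non-recurring objects
        else:
            cats["novfam"].append(d)    # mixed pairs (recurring vs. non-recurring)
    return cats

# Average distances per run, e.g. D_diff = np.mean(cats["diff"])
```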

We analyzed multivariate responses in terms of variances, distances, and amplitudes and averaged the results over all data sets and all 124 identity-selective parcels, to obtain separate mean values (and standard errors) for each of the 18 successive runs. Additionally, we averaged the results over the remaining 634 (non-identity-selective) parcels of the brain. We hoped that this would help distinguish more general effects and trends (e.g., habituation, attention, alertness) from learning-related changes in shape representations. All distances in these analyses were residual distances, to minimize the influence of temporal auto-correlations (Supplementary Fig. S1; see Methods for details).

The analyzed quantities (response amplitudes A, response distances D, and variances SS) are illustrated schematically in Figure 6A, and the results are presented in Figure 6B–D as mean values and standard errors for every run. In identity-selective parcels, response amplitudes A_fam to recurring patterns decreased during the first session (runs 1 to 6, p<0.05), but not during the second and third sessions (runs 7 to 12 and runs 13 to 18, p>0.5). Response amplitudes A_nov to non-recurring patterns showed no significant change in any session (n.s.; Fig. 6B). In non-selective parcels, response amplitudes decreased in all sessions, consistent with general habituation. In identity-selective parcels, response distances D_diff between different recurring objects likewise declined during the first session (p<0.05), but not during subsequent sessions (p>0.6) (Fig. 6C). Response distances D_same between the same recurring objects did not change significantly during any session (n.s.). In contrast, response distances D_nov between non-recurring objects declined disproportionately during the first session (p<0.05) but increased during the third session (p<0.05). Response distances D_novfam between recurring and non-recurring objects, on the other hand, did not change significantly over sessions (n.s.).

Fig. 6.

Geometry of identity and novelty representation over successive sessions and runs. (A) For each run with N_trials = 200 trials, we collected all individual response amplitudes a and all pairwise response distances d (triangular area with color scale) in the maximally discriminating space and computed average amplitudes A_fam and A_nov (for recurring and non-recurring objects, respectively) and average distances D_same and D_diff (for same and different recurring objects, respectively), as well as average distances D_nov and D_novfam (for non-recurring objects and between recurring and non-recurring objects, respectively). (B) Response amplitudes A_nov (red, mean ± S.E.M.) and A_fam (blue, mean ± S.E.M.), over 18 runs grouped into three sessions, for identity-selective (left) and non-selective parcels (right). (C) Pairwise response distances D_same (solid blue), D_diff (dashed blue), D_nov (solid red), and D_novfam (dashed red), over runs and sessions, for both groups of parcels. (D) Variances of response distances SS_same (solid blue), SS_diff (dashed blue), SS_nov (solid red), and SS_novfam (dashed red), over runs and sessions, for identity-selective and non-selective parcels. Stars indicate a significant linear trend during a session (see text). All plots show mean (traces) and S.E.M. (shading).


A first conclusion is that response amplitudes and response distances are consistently larger for recurring objects (blue traces in Fig. 6B, C) than for non-recurring objects (red traces). Importantly, in the very first run, response distances between different recurring objects (D_diff) and between different non-recurring objects (D_nov) are comparable, demonstrating that both types of objects were initially represented comparably well. Over subsequent runs, response distances decrease far more between different non-recurring objects (D_nov) than between different recurring objects (D_diff), demonstrating that a comparative advantage for recurring objects develops gradually (i.e., a kind of repetition enhancement). A second conclusion is that the observed development differs between identity-selective and non-selective parcels: whereas amplitudes and distances stabilize in the former group, they habituate progressively in the latter (both within and between sessions). Thus, the responsiveness of identity-selective parcels remains stable over sessions. A third conclusion is that response distances D_nov between different non-recurring objects become comparatively small (already during the first session), not only smaller than the distances D_diff between different recurring objects but even smaller than the distances D_same between the same recurring objects.

The results for response variances confirmed the trends observed earlier in the batch analysis of cross-validated variance ratios (Fig. 4A, B). The between-class variance SS_diff for recurring objects declined over the course of the sessions (p<0.005), whereas the between-class variance SS_novfam between recurring and non-recurring objects increased over the first session (p<0.005), only to decline again during the third session (p<0.05). The within-class variances SS_same and SS_nov remained largely unchanged. The close correspondence between the trends observed over runs and over batches is also illustrated in Supplementary Figure S5. Surprisingly, non-identity-selective parcels mirrored the trends observed for identity-selective parcels in attenuated form. The fact that between- and within-class variances differ systematically suggests that even non-identity-selective parcels represent object identity to some degree.

It is natural to compare these results to the time course of behavioral performance (fraction correct and reaction time) in our observers (Fig. 1C, D). The changes in the representation of recurring objects (between-class distances D_diff and variances SS_diff) show a gradual decrease in the quality of representation and thus do not correspond to the improving performance in terms of fraction correct. However, the changes in the representation of non-recurring objects, including the decrease of within-class distances D_nov and variances SS_nov and the increase of between-class variances SS_novfam and of the variance ratio F^novelty, do correspond to the rapid improvement in fraction correct over the first few runs. Thus, the neural changes over the course of learning point to diverging representations of “novel” (non-recurring) and “familiar” (recurring) objects.

3.4 Stability of identity and novelty representations

Relying on fixed subspaces S to analyze each data set also permitted us to assess the stability of neural representations over successive runs. With this in mind, we established the centroids of response classes for each run and examined the displacement of centroids between successive runs. As this calculation concerned centroid-to-centroid distances (rather than exemplar-to-exemplar distances), we could not correct for temporal auto-correlations.

The computation of centroids for particular response classes is illustrated schematically in Figure 7A. Given the centroids C_(r-1) and C_r of successive runs r-1 and r, and the average centroid C_ave over all runs, we computed absolute centroid-to-centroid distances DC_r = |C_r - C_ave| as well as relative centroid-to-centroid distances ΔDC_r = |C_r - C_(r-1)|. The 16 response classes were formed by each recurring object (15 classes, DC_same and ΔDC_same) and by the non-recurring objects (1 class, DC_nov and ΔDC_nov). To estimate the displacements expected from sampling noise alone, we also computed the centroid-to-centroid distances after permuting the responses within each class and regrouping them into 18 “pseudo-runs” (see Methods for details).
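A minimal sketch of these centroid measures is given below, assuming responses are already projected into the fixed subspace S; the pseudo-run control is simplified here to a single permutation, and all names are our own.

```python
import numpy as np

def centroid_displacements(X, runs):
    """Absolute and relative centroid-to-centroid distances over runs for one class.

    X    : (n_trials, n_dims) responses of one class in the fixed subspace S
    runs : (n_trials,) run index (1..18) per trial
    Returns (DC, dDC): DC[r] = |C_r - C_ave|, dDC[r] = |C_r - C_(r-1)|.
    """
    run_ids = np.unique(runs)
    centroids = np.array([X[runs == r].mean(axis=0) for r in run_ids])
    C_ave = centroids.mean(axis=0)
    DC = np.linalg.norm(centroids - C_ave, axis=1)                  # absolute distances
    dDC = np.linalg.norm(np.diff(centroids, axis=0), axis=1)        # between successive runs
    return DC, dDC

def pseudo_run_control(X, runs, seed=0):
    """Same computation after permuting trials into 'pseudo-runs' (sampling-noise baseline)."""
    rng = np.random.default_rng(seed)
    return centroid_displacements(X, rng.permutation(runs))
```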

Fig. 7.

Stability of identity and novelty representations over successive sessions and runs. (A) For each recurring object and for all non-recurring objects, we calculated the response centroids C_r for each run r and the average centroid C_ave over all runs, and obtained both absolute distances DC_r = |C_r - C_ave| and relative distances ΔDC_r = |C_r - C_(r-1)|. (B) Centroid-to-centroid distances (mean ± S.E.M.) for all identity-selective parcels and all data sets. Distances DC_same and ΔDC_same for the same recurring objects (blue) and distances DC_nov and ΔDC_nov for non-recurring objects (red) are compared to the corresponding values obtained from shuffled data sets (thin, pale lines). Stars indicate a significant linear trend during a session (see text).


The results are shown in Figure 7B. For both recurring and non-recurring objects, the average absolute distances DC_same(r) and DC_nov(r) diminished during the first session (runs 1 to 6, p<0.005), but remained stable during the second and third sessions (runs 7 to 12 and 13 to 18, p>0.2). Notably, the absolute distances DC_nov(r) of novel objects decreased to a much lower average level. Relative distances ΔDC_same(r) and ΔDC_nov(r) between successive runs declined during the first session (runs 1 to 6, p<0.05), remained stable during the second session (runs 7 to 12, p>0.2), and declined once again during the last session (runs 13 to 18, p<0.005 for recurring and p<0.05 for non-recurring objects). Absolute distances were far larger for recurring than for non-recurring classes, corroborating the substantial “response enhancement” already noted above. Both absolute and relative distances were slightly smaller than predicted by sampling noise (thin, pale lines, p<0.001), demonstrating that responses of true runs were distributed slightly more compactly and consistently than those of pseudo-runs. Note also that relative distances approached the values expected for fully random displacements within a 14-dimensional hypersphere: relative distances ΔDC were approximately 1.4 times larger than absolute distances DC, again underlining the dominant influence of sampling noise.
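The factor of roughly 1.4 is what one expects for statistically independent displacements: two independent points at the same distance R from the center of a high-dimensional hypersphere are nearly orthogonal, so their separation is about sqrt(2)·R ≈ 1.41·R. A quick numerical check (our own, for illustration only):

```python
import numpy as np

# Numerical check: for independent random points at a fixed distance R from the
# center of a 14-dimensional hypersphere, the distance between successive points
# is about sqrt(2) * R, i.e. ~1.4 times the distance to the center.
rng = np.random.default_rng(0)
dim, n = 14, 10_000
pts = rng.normal(size=(n, dim))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)   # random directions, radius R = 1

abs_dist = np.linalg.norm(pts, axis=1).mean()                   # ~1.0  (analogue of DC)
rel_dist = np.linalg.norm(np.diff(pts, axis=0), axis=1).mean()  # ~1.41 (analogue of ΔDC)
print(rel_dist / abs_dist)   # approximately sqrt(2) ~ 1.41
```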

4 Discussion

We studied the cortical representation of synthetic visual objects over multiple days of repeated viewing, while observers learned to classify initially unfamiliar objects as “familiar.” Relying on “representational similarity analysis” (RSA), we established distances between spatiotemporal hemodynamic (BOLD) responses to exemplars of different recurring objects, as well as to exemplars of non-recurring objects. Response distances between the same and different recurring objects quantified the neural representation of object identity. Response distances between recurring and non-recurring objects measured the neural representation of object novelty. The results showed that object identity was neurally represented from the start, in the ventral occipitotemporal cortex and beyond. With growing familiarity, the quality of this neural representation remained high, but its geometry expanded to fill the available representational space. In contrast, the neural representation of non-recurring objects (which remained “novel” by definition) improved over time, but its geometry contracted and shifted to the margins of the representational space.

4.1 Cortical representation of object identity

To permit a fine-grained analysis of representational geometry, we generated complex, three-dimensional shapes that were highly characteristic and distinguishable and presented these shapes from various points of view and in various states of rotation (always for one complete turn) (Kakaei et al., 2021). Thus, observers had to recognize an object from all sides in order to classify it as “familiar.” Within the category of our synthetic shapes, every recurring object constituted, strictly speaking, an “exemplar,” with individual presentations providing different “instantiations.” However, we chose to term objects “classes” and individual presentations “exemplars,” as this terminology conforms better to RSA conventions.

The selectivity of cortical parcels for object identity was assessed in optimized 14-dimensional subspaces S of the much higher-dimensional space of multivariate responses (O(10^3) dimensions). Specifically, we computed a cross-validated “classification accuracy” (Kriegeskorte, Mur, & Bandettini, 2008) and used a prevalence analysis to combine results from different conditions and observers (Allefeld et al., 2016). Essentially identical results were obtained with alternative measures such as “linear discriminability” and the “variance ratio” of between- and within-class variance (Anderson, 2001). When spatiotemporal responses to different objects are linearly discriminable, they form a neural representation of object identity. As exemplars of each object were presented from various sides, any such neural representation was by definition view-invariant. The obvious caveats are (i) that object rotation may have exposed the same characteristic features in many or most presentations and (ii) that multivariate hemodynamic responses over 9 s can only distantly reflect the neuronal activity evoked during each 2.5 s presentation. Nevertheless, hemodynamic signals exhibited significant invariance to the various modes of presentation of a given object (e.g., the initial perspective, the axis, and the sense of rotation).
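As a rough illustration of such a cross-validated accuracy measure, the following sketch combines dimensionality reduction with a linear discriminant classifier in scikit-learn. Note that this PCA-plus-LDA pipeline merely stands in for the direct linear discriminant analysis (DLDA) used in the study (available at github.com/cognitive-biology/DLDA); the function name and parameters are assumptions.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def classification_accuracy(X, labels, n_dims=14, n_folds=8):
    """Cross-validated accuracy of a linear classifier on dimensionality-reduced responses.

    X      : (n_trials, n_features) multivariate responses of one parcel
    labels : (n_trials,) object identity (e.g., 15 recurring objects)
    """
    clf = make_pipeline(PCA(n_components=n_dims),        # reduce to a low-dimensional subspace
                        LinearDiscriminantAnalysis())    # linear classifier in that subspace
    scores = cross_val_score(clf, X, labels, cv=n_folds)
    return scores.mean()
```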

In contrast to many other studies, we did not observe suppressed responses when objects were repeated (i.e., no “repetition suppression”) but rather a small enhancement of responses both with longer delays and later trial numbers (Supplementary Fig. S1). This may simply reflect the fact that the object presentations were highly variable and never repeated exactly. Recall that we designed a highly variable display such as to retain the observers’ interest over 3 successive days.

The 124 of 758 parcels that were identified as “identity-selective” on this basis were situated mostly in the ventral occipitotemporal cortex, but some parcels were also located in the parietal or frontal cortex, as illustrated in Figure 3A. The degree of selectivity exhibited a clear gradient, being stronger at the posterior pole and becoming progressively weaker in more anterior and more dorsal regions, as summarized in Figure 8A. These results are consistent with previous findings that multivariate activity distinguishing different exemplars of a particular class of objects (e.g., faces) is present in the ventral and lateral occipital cortex, on the fusiform gyrus, and in the ventral temporal cortex (Brants et al., 2016; Eger et al., 2008; Visconti di Oleggio Castello et al., 2021).

Fig. 8.

Shape identity and shape novelty representations in 26 topographical regions. (A) Identity representation as indexed by classification accuracy α^identity (mean ± S.E.M.). Posterior regions (V1-hV4, VO, LO) exhibit higher accuracy than more anterior or more dorsal regions (IPS, AIT, IFC). (B) Novelty representation as indexed by the rate β^novelty of novelty gain (mean ± S.E.M.). More anterior or more dorsal regions (IPS, AIT, IFC) exhibit a higher slope parameter than the posterior visual cortex (V1-hV4). (C) Comparison of identity and novelty representations as indexed by the rate β^identity of identity loss (negative values) and the rate β^novelty of novelty gain (positive values). Groups of regions are distinguished by color, with ellipsoids indicating mean and standard error. Note the positive correlation between novelty gain and identity loss. List of abbreviations: visual cortex (V1, V2, V3, hV4), ventral occipital cortex (VO1, VO2), lateral occipital cortex (LO1, LO2), parahippocampal cortex and fusiform gyrus (PHC), medial temporal areas (hMT), intraparietal sulcus (IPS), anterior inferior temporal cortex (AIT), and inferior frontal cortex (IFC).


In general, it is thought that progressively “higher” levels of visual processing represent progressively “larger” visual sets, beginning with image features, and widening gradually to object features, object exemplars, object categories, and finally to supercategories such as animate or inanimate objects, or objects and landscapes (Grill-Spector & Weiner, 2014). Accordingly, the discriminability of exemplars within a category is expected to diminish at more anterior locations, which correspond to “higher” levels of visual processing (Eger et al., 2008; Grill-Spector & Weiner, 2014). Moreover, it has been hypothesized that the spatial scale of neural representations increases with the level of abstraction, in the sense that exemplars are represented at smaller scales than categories (Grill-Spector & Weiner, 2014). Thus, if this trend is exacerbated in the more anterior parts of the ventral pathway, exemplar representations may become progressively less discriminable at the spatial resolution of BOLD signals.

A previous study of visual expertise for synthetic shapes (Brants et al., 2016) reported a gradual enhancement of neural representations in object-selective areas, whereas we observed a moderate decline. This difference may have been due to task design. Brants and colleagues used barely discriminable shapes and emphasized perceptual load, whereas we used highly distinguishable shapes and emphasized memory load.

We also observed identity-selectivity in frontoparietal regions that are typically associated with the dorsal visual pathway and the right frontoparietal “attention network.” This is consistent with previous findings on the presence of object- and/or face-selective representations in dorsal areas (Freud et al., 2017; Jeong & Xu, 2016; Konen & Kastner, 2008; Poirier et al., 2006; Visconti di Oleggio Castello et al., 2021). However, the interpretation of this selectivity is not straightforward. Particularly the clusters associated with the “attention network” are often found to express functional correlations with ventral visual areas in both resting and task states (Dornas & Braun, 2018; Mutlu et al., 2022; Smith et al., 2013). Thus, it seems possible that multivariate functional correlations could have propagated identity-selectivity feedforward throughout the “attention network” and beyond.

Finally, we observed pronounced identity-selectivity in the primary visual cortex (calcarine sulcus, left and right), where neuronal activity encodes basic visual features (orientation, spatial frequency, direction of movement, and so on) (Grill-Spector & Weiner, 2014; Haxby et al., 2001). It is possible that multivariate hemodynamic responses in the primary visual cortex could have reflected this visually evoked neuronal activity sufficiently well to have encoded object identity, especially as the rotation may have exposed the same low-level features in many or most presentations. Additionally, hemodynamic responses could have been driven by spatiotemporal patterns of feedback from higher areas of the visual cortex. There is some evidence to suggest that feedback can dominate the hemodynamics of the early visual cortex under continuous viewing conditions (as used here) (Blake & Braun, 2009).

4.2 Cortical representation of novel object shapes

We also investigated the representation of “novel” object shapes that were encountered only once (and never recurred). Note that “novelty” is here not meant to imply “surprise” for the observer in the sense of a violation of expectations (e.g., Uddin, 2015). Rather, it simply denotes the more heterogeneous class of non-recurring objects (with 360 exemplars, each from a different object), as distinct from the 15 more homogeneous classes of recurring objects (with approximately 200 exemplars each, all from the same object). As mentioned, “novelty” was measured in terms of the linear discriminability of hemodynamic responses to non-recurring and recurring objects in 14-dimensional subspaces S, more specifically, by comparing pairwise response distances between classes (recurring and non-recurring) and within classes (either recurring or non-recurring).

All 124 “identity-selective” parcels were also “novelty-selective,” in the sense that hemodynamic responses discriminated non-recurring from recurring objects to some degree, as illustrated in Figure 5A. As the discriminative subspaces were optimized for recurring objects, which were generated in the same way as non-recurring objects, some degree of discriminability was to be expected. Moreover, as non-recurring objects were far more numerous (360 objects) than recurring objects (15 objects), some discriminability was expected purely by chance, particularly in a 14-dimensional space. However, as discussed further below, the linear discriminability of non-recurring and recurring objects increased over successive runs and sessions, mirroring observers’ improving ability to classify objects as “novel” or “familiar.” Because of this dynamic aspect, we quantified the novelty-selectivity of cortical parcels in terms of an “improvement rate” β^novelty (Fig. 4). Interestingly, there was an anterior-posterior gradient: novelty-selectivity was more pronounced in frontal, parietal, and anterior temporal areas than in more posterior temporal and occipital areas, as summarized in Figure 8B. In other words, the representational disparity between familiar object shapes and novel object shapes tended to be larger in the higher-level (more anterior) visual cortex than in the lower-level (more posterior) cortex, suggesting that learning effects were more pronounced at higher levels.

4.3 Representational changes with learning

As representational changes with learning were the main objective of our study, we addressed this issue with several complementary approaches. First, we divided our observations from 18 runs into five successive “batches” and established the neural representation of both “identity” and “novelty” separately for each batch with cross-validated statistics, while aggregating over all identity-selective parcels (Fig. 4B). Second, to assess changes in individual parcels, we performed a regression analysis of the same cross-validated data and obtained “rates” of representational change for every identity-selective parcel (Fig. 4C). Third, we adopted stable discriminative subspaces S and sacrificed cross-validation in order to analyze representational geometry over individual runs (Fig. 6). All three approaches yielded comparable results.

Already in the first run and the first batch, without time for plasticity or learning, the neural representations of identity were maximally differentiated (Figs. 4A and 6D; Supplementary Fig. S5). This initial identity representation was most pronounced in known object processing areas, including the ventral occipitotemporal cortex and early visual cortex. Apparently, pre-existing representations based on life-long experience were sufficient to immediately provide a view-independent representation of synthetic shapes, which we had designed to be highly characteristic and discriminable. In contrast, neural representations of novelty were minimally differentiated in the first run and the first batch. As there was no systematic difference between recurring and non-recurring objects (and without time for plasticity), any residual initial discriminability of novelty must be attributed to chance.

Over subsequent runs and batches, the neural representation of object identity remained pronounced, but its quality declined steadily over time (Figs. 4A and 6D; Supplementary Fig. S5). Some decline in BOLD activity is not untypical for learning studies over multiple days and is commonly ascribed to repetition suppression, sparsification of responses, and/or diminishing attention or effort (e.g., Poldrack, 2000). However, while our results are consistent with such a scenario in non-identity-selective parcels, they do not support a general decline of activity in identity-selective parcels, as the response amplitudes and distances in these parcels declined only initially and subsequently remained stable (Fig. 6B, C).

In contrast, the neural representation of object novelty improved substantially over subsequent runs and batches. The time course was similar in both analyses (batch-by-batch and run-by-run), with the steepest improvement occurring over the first few runs (Figs. 4A and 6D; Supplementary Fig. S5). However, the detailed results revealed that this “improvement” (in discriminating non-recurring and recurring objects) actually reflected a deterioration in the representation of non-recurring objects (i.e., diminishing response distances, Fig. 6C).

In absolute terms, response amplitudes and distances were already larger for recurring objects and smaller for non-recurring objects during the first run and the difference increased over the next few runs (Fig. 6B, C). Apparently, recurring objects benefited from a “repetition enhancement,” as the only immediate and systematic difference between recurring and non-recurring objects was the frequency of recurrence. Interestingly, this enhancement was comparable for “structured” and “unstructured” sequences, even though the repetition latencies were quite different (Supplementary Fig. S1B, C). Accordingly, we hypothesize that the enhancement was not merely a passive effect but rather a consequence of task relevance and cognitive engagement (Supplementary Fig. S1B, C).

As mentioned, the rates of change of identity and novelty representations differed systematically between cortical regions (Fig. 8C). Intriguingly, the rates of novelty gain and identity loss varied inversely over the cortical hierarchy: in early visual areas (V1, V2, V3, hV4), identity declined rapidly, whereas novelty grew slowly. At the opposite end, in the inferior frontal cortex (IFC) and anterior ventral temporal cortex (AIT), identity declined slowly, but novelty grew rapidly. In the higher visual cortex (VO, LO), both rates were intermediate.

It is informative to visualize the observed representational changes in two dimensions (Fig. 9), while approximately preserving the relative pairwise distances in the discriminative subspaces S. This visualization makes clear that the neural representation of recurring objects expands between the beginning and the end of the experiment, filling the available representational space (Fig. 9A). The expansion explains our observation that the linear discriminability of object classes degrades but remains high. In contrast, the neural representation of non-recurring objects contracts between the beginning and the end of the experiment while also shifting to the margins of representational space, which explains why the linear discriminability of non-recurring objects improved over time (Fig. 9B). These two opposite developments may reflect both cognitive engagement and repetition frequency: representations may expand for objects that observers attempt to memorize and/or that recur frequently, but contract for objects that observers learn to ignore and/or that are rare.
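A two-dimensional embedding of this kind can be produced with multidimensional scaling, for example with scikit-learn; the helper below is a generic sketch, not the visualization code used for Figure 9, and its name is our own.

```python
from sklearn.manifold import MDS

def mds_embedding(X, seed=0):
    """Embed responses (rows of X, e.g. in the 14-D subspace S) into 2-D,
    approximately preserving pairwise Euclidean distances."""
    mds = MDS(n_components=2, dissimilarity="euclidean", random_state=seed)
    return mds.fit_transform(X)

# e.g., embed 50 randomly selected responses per class and color the points by class label
```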

Fig. 9.

Changes in the geometry of shape identity and novelty representations, visualized with multi-dimensional scaling. Symbols (colored circles) represent neural response patterns in a 14-dimensional space S. Symbols are positioned such that pairwise distances reflect pairwise distances in S. Response classes are distinguished by color and are represented by 50 randomly selected responses each. (A) Fifteen response classes to recurring objects in the first session (left, run 1–6) and the third session (right, run 13–18). Note that recurring response classes expand with learning to fill the available space. (The regions occupied by classes depend on the selected responses. “Inside” and “outside” classes can exchange positions). (B) Three response classes to recurring objects and one response class to non-recurring objects (larger symbols), in the first session (left, run 1–6) and the third session (right, run 13–18). Note that the non-recurring response class contracts with learning and shifts to the margins of the available space. Class positions are similar for other triplets of recurring classes.


In addition to relative changes in representational geometry indexed by linear discriminability, we established absolute changes in representational geometry, indexed by distances between response centroids in successive runs (see Fig. 7). The results were dominated by sampling noise, and the displacement of centroids was comparable to random jumps in a hypersphere while maintaining a given distance from its center. However, both absolute and relative centroid distances were slightly (and significantly) smaller than predicted by sampling noise, indicating that the representations were slightly more consistent and compact. The most interesting result of this analysis was that centroid distances were approximately 30% smaller for non-recurring than for recurring objects, highlighting again the representational disparity noted above.

4.4 Behavioral and cognitive changes with learning

The behavioral changes over three sessions of viewing sequences of objects included both increased classification performance (“familiar” or “novel”) and decreased reaction times. Both behavioral measures changed rapidly during the first three runs of the first session and more slowly during the second and third sessions (Fig. 1). As described elsewhere (Kakaei et al., 2021), the classification of a particular object typically changed from (mostly) “novel” to (mostly) “familiar” at one identifiable point in time during the sessions, which we termed “onset of familiarity.” This objective observation was consistent with the subjective reports of observers that they memorized all three-dimensional shapes one by one, such that every object became recognizable from all sides. Some observers also mentioned having assigned linguistic labels to individual recurring objects. After the three sessions, all observers were “familiar” with all recurring objects and could pick them out from an array of distractor objects.

Only some of these behavioral changes have obvious counterparts in the neural changes discussed above. First, the decrease of reaction times from under 2 s to under 1 s implies that observers spent less time actively evaluating the stimulus and more time passively observing it. However, the neural responses of identity-selective parcels do not mirror this trend, as both response amplitudes and response distances stabilize after the first few runs (Fig. 6B, C). In the rest of the brain (non-identity-selective parcels), the neural responses do show a progressive decrease, but any attribution would be speculative.

Second, the increase in objective performance and in subjective “familiarity” was not mirrored directly in neural responses to recurring objects, as multivariate responses were sufficiently rich to identify such objects from the very start. However, multivariate responses dispersed over the three sessions so as to fill more of the available space (see above). This growing response diversity is a plausible correlate of memory consolidation, that is, of the formation of stable long-term memories in visually responsive cortical areas. When such memories are consolidated, one would expect increased connectivity to enhance pattern completion over additional levels of representation, rendering network activity more complex (e.g., Steinberg & Sompolinsky, 2022). It is worth noting that this development was observed for both types of presentation sequences (“structured” and “unstructured”), suggesting that neural consolidation was due to task relevance and not merely to repetition latency.

Third, the increase in objective performance was mirrored indirectly in neural responses to non-recurring objects. Whereas these responses were initially comparable to recurring responses, they contracted over three sessions into a smaller part of the available space, thus becoming more stereotypical. As this part was comparably distant from all recurring responses, it lay at the margins of the representational space. The time course of classification performance corresponded best to this particular development in neural representations. Accordingly, this development was a plausible indirect correlate of memory consolidation, in the sense that visually responsive areas grew less responsive to other objects that failed to match the newly formed long-term memories.

We analyzed the cortical representation of visual objects in the multivariate hemodynamic responses of 758 brain parcels. For each parcel, we used linear discriminant analysis to map the O(10^3)-dimensional responses into a lower-dimensional subspace that optimally discriminated the 15 stimulus classes (recurring objects). Optimal subspaces captured a large part of the correlated variance and overlapped substantially with the principal components of the responses. Typically, 2/3 of the principal-component variance discriminated between stimulus classes (and thus coincided with the optimal subspace), while the remaining 1/3 was shared between stimulus classes. Our analyses revealed where and how the cortical representations of visual objects changed as visual expertise was acquired and consolidated by the observers.
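The overlap between a discriminant subspace and the principal components can be quantified, for example, as the fraction of total response variance that falls within the subspace. The sketch below assumes an orthonormal basis W of the subspace and is our own illustration, not the study's implementation.

```python
import numpy as np

def variance_in_subspace(X, W):
    """Fraction of the total response variance that falls within a subspace.

    X : (n_trials, n_features) responses (e.g., one parcel's multivariate BOLD patterns)
    W : (n_features, k) orthonormal basis of the discriminative subspace
        (e.g., k = 14 discriminant directions)
    """
    Xc = X - X.mean(axis=0)          # center the responses
    total_var = np.sum(Xc ** 2)      # total variance (up to a constant factor)
    proj = Xc @ W                    # coordinates within the subspace
    return np.sum(proj ** 2) / total_var

# An orthonormal basis can be obtained from raw discriminant directions D via QR:
# W, _ = np.linalg.qr(D)
```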

Our results were broadly consistent with other recent studies of visual expertise, which have highlighted the roles of three pathways or networks (Kravitz et al., 2011, 2013): an occipitotemporal pathway (“ventral pathway”), an occipitoparietal pathway (“dorsal pathway”), and a right frontoparietal network (“attention system”). Several studies have linked behavioral performance to enhanced activity and/or representation in the frontoparietal network (Duyck et al., 2021; Poirier et al., 2006; Visconti di Oleggio Castello et al., 2021), as well as in the more anterior parts of the occipitotemporal pathway and the more dorsal parts of the occipitoparietal pathway (Christophel et al., 2017).

Due to our focus on object shape, our results do not speak directly to the modulation of cortical responses by expectation, such as “expectation suppression” or “surprise signalling” (Barron et al., 2016; Bell et al., 2016; Mayrhauser et al., 2014; Vinken et al., 2018). Moreover, in our paradigm, object presentations were never repeated exactly and every object presentation contained elements of surprise, as neither the object, nor the point of view, nor the direction of rotation could be anticipated by observers.

The most robust representations of object shape for both recurring objects (“identity”) and non-recurring objects (“novelty”) were observed in the ventral occipitotemporal cortex, at the intermediate levels of the shape processing hierarchy (Grill-Spector & Weiner, 2014; Perry & Fallah, 2014). Additionally, we found representations of object shape in “dorsal stream” cortical areas, consistent with the view that these areas encode goal- and task-related object features (Perry & Fallah, 2014).

The most novel aspect of our findings concerned the changes in the geometry of cortical representations as visual expertise for recurring objects was acquired and consolidated. In relative terms, distances between response classes decreased, and/or distances within classes increased, while observers repeatedly viewed and became familiar with the corresponding stimulus classes. This modest decline in stimulus encoding was, however, associated with an expansion (or diversification) of the distribution of responses within classes, so that the responses of all classes taken together scattered more uniformly over the available representational space. Changes in cortical representations were quite different for stimuli that appeared only once and that observers did not attempt to memorize (non-recurring objects). Here, again in relative terms, distances between classes (non-recurring and recurring) increased and/or distances within classes (non-recurring) decreased. This steep growth in class encoding was associated with a substantial contraction (or stereotyping) of the distribution of responses, in the sense that responses to non-recurring objects shifted to the margin of the available representational space.

We conclude that hemodynamic responses to novel object shapes immediately represent the differences between these shapes, even prior to learning, presumably reflecting life-long prior experience. When object shapes grow familiar with learning, hemodynamic responses to the same shapes become more diverse, whereas responses to different shapes remain comparably dissimilar from each other. Responses to control objects that are always novel develop quite differently in that they become less diverse relative to each other, but also more dissimilar from responses to familiar objects.

Code for direct linear discriminant analysis and prevalence inference is available at github.com/cognitive-biology/DLDA. MR data will be made available upon request.

Ehsan Kakaei: Conceptualization, data curation, formal analysis, visualization, and writing of the original draft. Jochen Braun: Conceptualization, linear algebra, formal analysis, supervision, and reviewing & editing.

We thank Claus Tempelmann, Martin Kanowski, and Denise Scheermann at the Magnetic Resonance Imaging Laboratory of the Department of Neurology of Otto-von-Guericke University, Magdeburg. We are grateful to Oliver Speck for providing essential support and a balanced perspective. We also thank Stepan Aleshin for helpful discussions and constructive comments. This study was funded by the federal state Saxony-Anhalt and the European Structural and Investment Funds (ESF, 2014–2020), project number ZS/2016/08/80645, as part of the doctoral program ABINEP (Analysis, Imaging and Modelling of Neuronal Processes).

The authors are not aware of any competing interests.

Supplementary material for this article is available with the online version here: https://doi.org/10.1162/imag_a_00255

Albers, K. J., Ambrosen, K. S., Liptrot, M. G., Dyrby, T. B., Schmidt, M. N., & Mørup, M. (2021). Using connectomics for predictive assessment of brain parcellations. NeuroImage, 238, 118170. https://doi.org/10.1016/j.neuroimage.2021.118170

Allefeld, C., Görgen, K., & Haynes, J.-D. (2016). Valid population inference for information-based imaging: From the second-level t-test to prevalence inference. NeuroImage, 141, 378–392. https://doi.org/10.1016/j.neuroimage.2016.07.040

Anderson, M. J. (2001). A new method for non-parametric multivariate analysis of variance. Austral Ecology, 26(1), 32–46. https://doi.org/10.1111/j.1442-9993.2001.01070.pp.x

Barron, H. C., Garvert, M. M., & Behrens, T. E. J. (2016). Repetition suppression: A means to index neural representations using BOLD? Philosophical Transactions of the Royal Society B: Biological Sciences, 371(1705), 20150355. https://doi.org/10.1098/rstb.2015.0355

Beckmann, C. F., & Smith, S. M. (2004). Probabilistic independent component analysis for functional magnetic resonance imaging. IEEE Transactions on Medical Imaging, 23(2), 137–152. https://doi.org/10.1109/TMI.2003.822821

Bell, A. H., Summerfield, C., Morin, E. L., Malecek, N. J., & Ungerleider, L. G. (2016). Encoding of stimulus probability in macaque inferior temporal cortex. Current Biology, 26(17), 2280–2290. https://doi.org/10.1016/j.cub.2016.07.007

Bi, Y., Wang, X., & Caramazza, A. (2016). Object domain and modality in the ventral visual pathway. Trends in Cognitive Sciences, 20(4), 282–290. https://doi.org/10.1016/j.tics.2016.02.002

Blake, R., & Braun, J. (2009). Visual perception: Tracking the elusive footprints of awareness. Current Biology, 19(1), R30–R32. https://doi.org/10.1016/j.cub.2008.11.009

Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10(4), 433–436. https://doi.org/10.1163/156856897X00357

Brants, M., Bulthé, J., Daniels, N., Wagemans, J., & de Beeck, H. P. O. (2016). How learning might strengthen existing visual object representations in human object-selective cortex. NeuroImage, 127, 74–85. https://doi.org/10.1016/j.neuroimage.2015.11.063

Brants, M., Wagemans, J., & Op de Beeck, H. P. (2011). Activation of fusiform face area by greebles is related to face similarity but not expertise. Journal of Cognitive Neuroscience, 23(12), 3949–3958. https://doi.org/10.1162/jocn_a_00072

Bukach, C. M., Gauthier, I., & Tarr, M. J. (2006). Beyond faces and modularity: The power of an expertise framework. Trends in Cognitive Sciences, 10(4), 159–166. https://doi.org/10.1016/j.tics.2006.02.004

Cetron, J. S., Connolly, A. C., Diamond, S. G., May, V. V., Haxby, J. V., & Kraemer, D. J. (2019). Decoding individual differences in STEM learning from functional MRI data. Nature Communications, 10(1), 1–10. https://doi.org/10.1038/s41467-019-10053-y

Charest, I., & Kriegeskorte, N. (2015). The brain of the beholder: Honouring individual representational idiosyncrasies. Language, Cognition and Neuroscience, 30(4), 367–379. https://doi.org/10.1080/23273798.2014.1002505

Christophel, T. B., Klink, P. C., Spitzer, B., Roelfsema, P. R., & Haynes, J.-D. (2017). The distributed nature of working memory. Trends in Cognitive Sciences, 21(2), 111–124. https://doi.org/10.1016/j.tics.2016.12.007

Collins, E., & Behrmann, M. (2020). Exemplar learning reveals the representational origins of expert category perception. Proceedings of the National Academy of Sciences of the United States of America, 117(20), 11167–11177. https://doi.org/10.1073/pnas.1912734117

Connolly, A. C., Guntupalli, J. S., Gors, J., Hanke, M., Halchenko, Y. O., Wu, Y.-C., Abdi, H., & Haxby, J. V. (2012). The representation of biological classes in the human brain. Journal of Neuroscience, 32(8), 2608–2618. https://doi.org/10.1523/JNEUROSCI.5547-11.2012

de Beeck, H. P. O., & Baker, C. I. (2010). The neural basis of visual object learning. Trends in Cognitive Sciences, 14(1), 22–30. https://doi.org/10.1016/j.tics.2009.11.002

de Beeck, H. P. O., Baker, C. I., DiCarlo, J. J., & Kanwisher, N. G. (2006). Discrimination training alters object representations in human extrastriate cortex. Journal of Neuroscience, 26(50), 13025–13036. https://doi.org/10.1523/JNEUROSCI.2481-06.2006

Dornas, J. V., & Braun, J. (2018). Finer parcellation reveals detailed correlational structure of resting-state fMRI signals. Journal of Neuroscience Methods, 294, 15–33. https://doi.org/10.1016/j.jneumeth.2017.10.020

Duyck, S., Martens, F., Chen, C.-Y., & Op de Beeck, H. (2021). How visual expertise changes representational geometry: A behavioral and neural perspective. Journal of Cognitive Neuroscience, 33(12), 2461–2476. https://doi.org/10.1162/jocn_a_01778

Eger, E., Ashburner, J., Haynes, J.-D., Dolan, R. J., & Rees, G. (2008). fMRI activity patterns in human LOC carry information about object exemplars within category. Journal of Cognitive Neuroscience, 20(2), 356–370. https://doi.org/10.1162/jocn.2008.20019

Freud, E., Culham, J. C., Plaut, D. C., & Behrmann, M. (2017). The large-scale organization of shape processing in the ventral and dorsal pathways. eLife, 6, e27576. https://doi.org/10.7554/eLife.34464

Gauthier, I., & Tarr, M. J. (2016). Visual object recognition: Do we (finally) know more now than we did? Annual Review of Vision Science, 2, 377–396. https://doi.org/10.1146/annurev-vision-111815-114621

Gauthier, I., Tarr, M. J., Anderson, A. W., Skudlarski, P., & Gore, J. C. (1999). Activation of the middle fusiform ‘face area’ increases with expertise in recognizing novel objects. Nature Neuroscience, 2(6), 568–573. https://doi.org/10.1038/9224

Greve, D. N., & Fischl, B. (2009). Accurate and robust brain image alignment using boundary-based registration. NeuroImage, 48(1), 63–72. https://doi.org/10.1016/j.neuroimage.2009.06.060

Grill-Spector, K., Knouf, N., & Kanwisher, N. (2004). The fusiform face area subserves face perception, not generic within-category identification. Nature Neuroscience, 7(5), 555–562. https://doi.org/10.1038/nn1224

Grill-Spector, K., & Weiner, K. S. (2014). The functional architecture of the ventral temporal cortex and its role in categorization. Nature Reviews Neuroscience, 15(8), 536–548. https://doi.org/10.1038/nrn3747

Harel, A., Kravitz, D., & Baker, C. I. (2013). Beyond perceptual expertise: Revisiting the neural substrates of expert object recognition. Frontiers in Human Neuroscience, 7, 885. https://doi.org/10.1167/14.10.820

Haxby, J. V. (2012). Multivariate pattern analysis of fMRI: The early beginnings. NeuroImage, 62(2), 852–855. https://doi.org/10.1016/j.neuroimage.2012.03.016

Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293(5539), 2425–2430. https://doi.org/10.1126/science.1063736

Hung, C. P., Kreiman, G., Poggio, T., & DiCarlo, J. J. (2005). Fast readout of object identity from macaque inferior temporal cortex. Science, 310(5749), 863–866. https://doi.org/10.1126/science.1117593

Jenkinson, M., Bannister, P., Brady, M., & Smith, S. (2002). Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage, 17(2), 825–841. https://doi.org/10.1006/nimg.2002.1132

Jenkinson, M., & Smith, S. (2001). A global optimisation method for robust affine registration of brain images. Medical Image Analysis, 5(2), 143–156. https://doi.org/10.1016/S1361-8415(01)00036-6

Jeong, S. K., & Xu, Y. (2016). Behaviorally relevant abstract object identity representation in the human parietal cortex. Journal of Neuroscience, 36(5), 1607–1619. https://doi.org/10.1523/JNEUROSCI.1016-15.2016

Kakaei, E., Aleshin, S., & Braun, J. (2021). Visual object recognition is facilitated by temporal community structure. Learning & Memory, 28(5), 148–152. https://doi.org/10.1101/lm.053306.120

Konen, C. S., & Kastner, S. (2008). Two hierarchically organized neural systems for object information in human visual cortex. Nature Neuroscience, 11(2), 224–231. https://doi.org/10.1038/nn2036

Konkle, T., & Oliva, A. (2012). A real-world size organization of object responses in occipitotemporal cortex. Neuron, 74(6), 1114–1124. https://doi.org/10.1016/j.neuron.2012.04.036

Kravitz, D. J., Saleem, K. S., Baker, C. I., & Mishkin, M. (2011). A new neural framework for visuospatial processing. Nature Reviews Neuroscience, 12(4), 217–230. https://doi.org/10.1038/nrn3008

Kravitz, D. J., Saleem, K. S., Baker, C. I., Ungerleider, L. G., & Mishkin, M. (2013). The ventral visual pathway: An expanded neural framework for the processing of object quality. Trends in Cognitive Sciences, 17(1), 26–49. https://doi.org/10.1016/j.tics.2012.10.011

Kriegeskorte, N., Goebel, R., & Bandettini, P. (2006). Information-based functional brain mapping. Proceedings of the National Academy of Sciences of the United States of America, 103(10), 3863–3868. https://doi.org/10.1073/pnas.0600244103

Kriegeskorte, N., Mur, M., & Bandettini, P. A. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2, 4. https://doi.org/10.3389/neuro.06.004.2008

Kriegeskorte, N., Mur, M., Ruff, D. A., Kiani, R., Bodurka, J., Esteky, H., Tanaka, K., & Bandettini, P. A. (2008). Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron, 60(6), 1126–1141. https://doi.org/10.1016/j.neuron.2008.10.043

Kumar, M., Anderson, M. J., Antony, J. W., Baldassano, C., Brooks, P. P., Cai, M. B., Chen, P.-H. C., Ellis, C. T., Henselman-Petrusek, G., Huberdeau, D., Hutchinson, J. B., Li, Y. P., Lu, Q., Manning, J. R., Mennen, A. C., Nastase, S. A., Richard, H., Schapiro, A. C., Schuck, N. W., … Norman, A. K. (2022). BrainIAK: The brain imaging analysis kit. Aperture Neuro, 2021(4), 1–19. https://doi.org/10.52294/31bb5b68-2184-411b-8c00-a1dacb61e1da

Liu, H., Agam, Y., Madsen, J. R., & Kreiman, G. (2009). Timing, timing, timing: Fast decoding of object information from intracranial field potentials in human visual cortex. Neuron, 62(2), 281–290. https://doi.org/10.1016/j.neuron.2009.02.025

Martens, F., Bulthé, J., van Vliet, C., & de Beeck, H. O. (2018). Domain-general and domain-specific neural changes underlying visual expertise. NeuroImage, 169, 80–93. https://doi.org/10.1016/j.neuroimage.2017.12.013

Mayrhauser, L., Bergmann, J., Crone, J., & Kronbichler, M. (2014). Neural repetition suppression: Evidence for perceptual expectation in object-selective regions. Frontiers in Human Neuroscience, 8, 225. https://doi.org/10.3389/fnhum.2014.00225

McGugin, R. W., Gatenby, J. C., Gore, J. C., & Gauthier, I. (2012). High-resolution imaging of expertise reveals reliable object selectivity in the fusiform face area related to perceptual performance. Proceedings of the National Academy of Sciences of the United States of America, 109(42), 17063–17068. https://doi.org/10.1073/pnas.1116333109

Mutlu, M. C., Kakaei, E., & Braun, J. (2022). Candidate areas for initiating spontaneous reversals of kinetic depth: Inferior frontal cortex and insula. In Bernstein Conference 2022 (PIII 64). Berlin, Germany. https://doi.org/10.12751/nncn.bc2022.208

Nastase,
S. A.
,
Gazzola
,
V.
,
Hasson
,
U.
, &
Keysers
,
C.
(
2019
).
Measuring shared responses across subjects using intersubject correlation
.
Social Cognitive and Affective Neuroscience
,
14
(
6
),
667
685
. https://doi.org/10.1093/scan/nsz037
Nestor
,
A.
,
Plaut
,
D. C.
, &
Behrmann
,
M.
(
2016
).
Feature-based face representations and image reconstruction from behavioral and neural data
.
Proceedings of the National Academy of Sciences of the United States of America
,
113
(
2
),
416
421
. https://doi.org/10.1073/pnas.1514551112
Patel
,
A. X.
,
Kundu
,
P.
,
Rubinov
,
M.
,
Jones
,
P. S.
,
Vértes
,
P. E.
,
Ersche
,
K. D.
,
Suckling
,
J.
, &
Bullmore
,
E. T.
(
2014
).
A wavelet method for modeling and despiking motion artifacts from resting-state fMRI time series
.
NeuroImage
,
95
,
287
304
. https://doi.org/10.1016/j.neuroimage.2014.03.012
Perry
,
C. J.
, &
Fallah
,
M.
(
2014
).
Feature integration and object representations along the dorsal stream visual hierarchy
.
Frontiers in Computational Neuroscience
,
8
,
84
. https://doi.org/10.3389/fncom.2014.00084
Poirier
,
C. C.
,
De Volder
,
A. G.
,
Tranduy
,
D.
, &
Scheiber
,
C.
(
2006
).
Neural changes in the ventral and dorsal visual streams during pattern recognition learning
.
Neurobiology of Learning and Memory
,
85
(
1
),
36
43
. https://doi.org/10.1016/j.nlm.2005.08.006
Poldrack
,
R. A.
(
2000
).
Imaging brain plasticity: Conceptual and methodological issues—A theoretical review
.
NeuroImage
,
12
(
1
),
1
13
. https://doi.org/10.1006/nimg.2000.0596
Roth
,
Z. N.
, &
Zohary
,
E.
(
2015
).
Fingerprints of learned object recognition seen in the fMRI activation patterns of lateral occipital complex
.
Cerebral Cortex
,
25
(
9
),
2427
2439
. https://doi.org/10.1093/cercor/bhu042
Smith
,
S. M.
(
2002
).
Fast robust automated brain extraction
.
Human Brain Mapping
,
17
(
3
),
143
155
. https://doi.org/10.1002/hbm.10062
Smith
,
S. M.
, &
Brady
,
J. M.
(
1997
).
Susan—A new approach to low level image processing
.
International Journal of Computer Vision
,
23
(
1
),
45
78
. https://doi.org/10.1023/A:1007963824710
Smith
,
S. M.
,
Vidaurre
,
D.
,
Beckmann
,
C. F.
,
Glasser
,
M. F.
,
Jenkinson
,
M.
,
Miller
,
K. L.
,
Nichols
,
T. E.
,
Robinson
,
E. C.
,
Salimi-Khorshidi
,
G.
,
Woolrich
,
M. W.
,
Barch
,
D. M.
,
Uğurbil
,
K.
, &
Van Essen
,
D. C.
(
2013
).
Functional connectomics from resting-state fMRI
.
Trends in Cognitive Sciences
,
17
(
12
),
666
682
. https://doi.org/10.1016/j.tics.2013.09.016
Steinberg
,
J.
, &
Sompolinsky
,
H.
(
2022
).
Associative memory of structured knowledge
.
Scientific Reports
,
12
(
1
),
21808
. https://doi.org/10.1038/s41598-022-25708-y
Tzourio-Mazoyer
,
N.
,
Landeau
,
B.
,
Papathanassiou
,
D.
,
Crivello
,
F.
,
Etard
,
O.
,
Delcroix
,
N.
,
Mazoyer
,
B.
, &
Joliot
,
M.
(
2002
).
Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain
.
NeuroImage
,
15
(
1
),
273
289
. https://doi.org/10.1006/nimg.2001.0978
Uddin
,
L. Q.
(
2015
).
Salience processing and insular cortical function and dysfunction
.
Nature Reviews Neuroscience
,
16
(
1
),
55
61
. https://doi.org/10.1038/nrn3857
Vinken
,
K.
,
Op de Beeck
,
H. P.
, &
Vogels
,
R.
(
2018
).
Face repetition probability does not affect repetition suppression in macaque inferotemporal cortex
.
The Journal of Neuroscience
,
38
(
34
),
7492
7504
. https://doi.org/10.1523/jneurosci.0462-18.2018
Visconti di Oleggio Castello
,
M.
,
Haxby
,
J. V.
, &
Gobbini
,
M. I.
(
2021
).
Shared neural codes for visual and semantic information about familiar faces in a common representational space
.
Proceedings of the National Academy of Sciences of the United States of America
,
118
(
45
),
e2110474118
. https://doi.org/10.1073/pnas.2110474118
Wang
,
L.
,
Mruczek
,
R. E.
,
Arcaro
,
M. J.
, &
Kastner
,
S.
(
2015
).
Probabilistic maps of visual topography in human cortex
.
Cerebral Cortex
,
25
(
10
),
3911
3931
. https://doi.org/10.1093/cercor/bhu277
Weiner
,
K. S.
, &
Zilles
,
K.
(
2016
).
The anatomical and functional specialization of the fusiform gyrus
.
Neuropsychologia
,
83
,
48
62
. https://doi.org/10.1016/j.neuropsychologia.2015.06.033
Wong
,
A. C.-N.
,
Palmeri
,
T. J.
, &
Gauthier
,
I.
(
2009
).
Conditions for facelike expertise with objects: Becoming a ziggerin expert—But which type
?
Psychological Science
,
20
(
9
),
1108
1117
. https://doi.org/10.1111/j.1467-9280.2009.02430.x
Wong
,
Y. K.
,
Folstein
,
J. R.
, &
Gauthier
,
I.
(
2012
).
The nature of experience determines object representations in the visual system
.
Journal of Experimental Psychology: General
,
141
(
4
),
682
. https://doi.org/10.1037/a0027822
Wurm
,
M. F.
, &
Caramazza
,
A.
(
2022
).
Two ‘what’ pathways for action and object recognition
.
Trends in Cognitive Sciences
,
26
(
2
),
103
116
. https://doi.org/10.1016/j.tics.2021.10.003
Ye
,
J.
,
Xiong
,
T.
, &
Madigan
,
D.
(
2006
).
Computational and theoretical analysis of null space and orthogonal linear discriminant analysis
.
Journal of Machine Learning Research
,
7
(
7
),
1183
1204
. http://jmlr.org/papers/v7/ye06a.html
Yildirim
,
I.
,
Wu
,
J.
,
Kanwisher
,
N.
, &
Tenenbaum
,
J.
(
2019
).
An integrative computational architecture for object-driven cortex
.
Current Opinion in Neurobiology
,
55
,
73
81
. https://doi.org/10.1016/j.conb.2019.01.010
Yu
,
H.
, &
Yang
,
J.
(
2001
).
A direct LDA algorithm for high-dimensional data—With application to face recognition
.
Pattern Recognition
,
34
(
10
),
2067
2070
. https://doi.org/10.1016/S0031-3203(00)00162-X
Yue
,
X.
,
Tjan
,
B. S.
, &
Biederman
,
I.
(
2006
).
What makes faces special
?
Vision Research
,
46
(
22
),
3802
3811
. https://doi.org/10.1016/j.visres.2006.06.017
Zhang
,
Y.
,
Brady
,
M.
, &
Smith
,
S.
(
2001
).
Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm
.
IEEE Transactions on Medical Imaging
,
20
(
1
),
45
57
. https://doi.org/10.1109/42.906424

Appendix

Appendix Table A1

List of identity-selective parcels and their anatomical regions.

AAL region  Parcel no.  x  y  z  α (%)  β  p (struct.)  p (unstruct.)  p (both)  Topogr. assignment
Precentral 14 -51 36 8.4 0.33 0.006 10⁻⁵ 10⁻⁵
Superior frontal 56 25 -9 64 7.9 0.18 n.s. n.s. 4×10⁻⁵
Inferior frontal (opercular) 143 38 11 31 7.5 0.27 5×10⁻⁵ 5×10⁻³ 0.02 IFC
146 52 10 22 0.44 0.05 n.s. 10⁻⁵ IFC
Inferior frontal (triangular) 163 51 23 20 7.9 0.34 10⁻⁵ 10⁻⁵ 10⁻⁵ IFC
Parahippocampal 325 26 -38 -9 7.5 0.05 0.007 2×10⁻⁴ 10⁻⁵
Calcarine 331 -2 -91 -1 16.9 0.01 10⁻⁵ 10⁻⁵ 10⁻⁵ V1v
333 -2 -94 -4 10.2 -0.01 10⁻⁵ 5×10⁻³ 10⁻⁵ V1v
335 -4 -86 12.6 0.11 10⁻⁵ 10⁻⁵ 10⁻⁵ V1d
336 -5 -99 -8 12.2 0.03 10⁻⁵ 10⁻⁵ 10⁻⁵ V1v
337 -12 -99 -5 13.8 0.10 10⁻⁵ 10⁻⁵ 10⁻⁵ V1d
338 -7 -75 10 8.7 0.08 10⁻⁵ 10⁻⁵ 10⁻⁵
342 17 -83 11 9.8 0.07 10⁻⁵ 10⁻⁵ 10⁻⁵
344 -85 13.8 0.10 10⁻⁵ 10⁻⁵ 10⁻⁵ V1v
345 15 -91 14.3 0.09 10⁻⁵ 10⁻⁵ 10⁻⁵ V1d
347 18 -99 -1 12.1 0.07 10⁻⁵ 10⁻⁵ 10⁻⁵ V1d
348 11 -72 11 9.1 0.06 n.s. 10⁻⁵ 2×10⁻⁴
Cuneus 350 -7 -85 26 10.1 0.07 10⁻⁵ 10⁻⁵ 10⁻⁵
352 -94 25 9.1 0.05 10⁻⁵ 10⁻⁵ 10⁻⁵ V2d
354 -2 -79 22 8.6 0.07 0.003 10⁻⁵ 10⁻⁵
355 -4 -92 25 11.9 0.09 n.s. 10⁻⁵ 0.02 V2d
356 12 -91 18 13 0.15 10⁻⁵ 10⁻⁵ 10⁻⁵ V2d
357 15 -97 10 13.8 0.08 10⁻⁵ 10⁻⁵ 10⁻⁵ V2d
Lingual 363 -12 -65 -5 9.5 0.10 10⁻⁵ 10⁻⁵ 10⁻⁵ V3v
364 -15 -95 -16 10.9 0.02 10⁻⁴ 10⁻⁵ 10⁻⁵ V2v
367 -29 -89 -16 11.5 0.07 2×10⁻⁵ 10⁻⁵ 10⁻⁵ hV4
368 -17 -85 -12 14.9 0.26 10⁻⁵ 10⁻⁵ 10⁻⁵ V2v
370 -22 -65 -5 9.7 0.16 10⁻⁵ 10⁻⁵ 10⁻⁵ VO2
371 -12 -79 -8 13.9 0.12 10⁻⁵ 10⁻⁵ 10⁻⁵ V2v
372 -6 -74 12 0.11 10⁻⁵ 10⁻⁵ 10⁻⁵ V1v
373 16 -81 -7 14.1 0.11 10⁻⁵ 10⁻⁵ 10⁻⁵ V3v
375 11 -72 -4 12.5 0.04 10⁻⁵ 10⁻⁵ 10⁻⁵ V2v
377 21 -58 -3 9.3 0.24 10⁻⁵ 10⁻⁵ 10⁻⁵
378 13 -52 7.6 0.05 n.s. 10⁻⁵ 2×10⁻⁴
379 16 -88 -10 13.4 0.14 10⁻⁵ 10⁻⁵ 10⁻⁵ V2v
380 27 -91 -16 9.2 0.06 n.s. 0.01 0.01
381 17 -98 -10 9.3 0.04 10⁻⁵ n.s. 0.005 V1v
383 14 -56 -6 7.9 0.05 2×10⁻⁵ 0.01 10⁻⁵
Occipital (superior) 384 -18 -84 25 10.6 0.08 10⁻⁵ 10⁻⁵ 10⁻⁵ V3a
385 -16 -85 41 9.8 0.12 10⁻⁵ 10⁻⁵ 10⁻⁵ IPS0
386 -18 -69 29 7.7 0.04 n.s. 10⁻⁵ 0.005
387 -16 -95 23 13.5 0.12 10⁻⁵ 10⁻⁵ 10⁻⁵ V3a
388 -22 -75 34 9.1 0.21 10⁻⁵ 10⁻⁵ 10⁻⁵ IPS1
389 -11 -95 12.9 0.05 10⁻⁵ 10⁻⁵ 10⁻⁵ V2d
390 22 -90 24 13.6 0.10 10⁻⁵ 10⁻⁵ 10⁻⁵ V3a
391 21 -98 14 13.8 0.11 10⁻⁵ 10⁻⁵ 10⁻⁵ V2d
392 27 -85 40 9.2 0.12 10⁻⁵ 10⁻⁵ 10⁻⁵ IPS0
393 25 -67 33 0.18 n.s. n.s. 0.005
394 24 -75 21 8.7 0.11 10⁻⁵ 10⁻⁵ 10⁻⁵
395 23 -79 33 10 0.12 10⁻⁵ 10⁻⁵ 10⁻⁵
396 29 -70 43 9.2 0.21 0.03 10⁻⁵ 10⁻⁵ IPS1
Occipital (middle) 397 -28 -77 27 10.1 0.23 10⁻⁵ 10⁻⁵ 10⁻⁵ IPS0
398 -28 -72 34 9.1 0.32 n.s. 10⁻⁵ 10⁻⁵ IPS1
400 -38 -86 12 0.15 10⁻⁵ 10⁻⁵ 10⁻⁵ LO2
402 -33 -87 19 12 0.24 10⁻⁵ 10⁻⁵ 10⁻⁵ V3b
403 -30 -78 7.7 0.05 n.s. n.s. 0.02
404 -27 -94 12.7 0.15 10⁻⁵ 10⁻⁵ 10⁻⁵ V3d
405 -16 -100 14 0.13 10⁻⁵ 10⁻⁵ 10⁻⁵ V2d
406 -40 -75 15 8.9 0.14 10⁻⁵ 10⁻⁵ 10⁻⁵
407 -27 -83 14 11.3 0.19 10⁻⁵ 10⁻⁵ 10⁻⁵ IPS0
408 -24 -93 13 13.6 0.19 10⁻⁵ 10⁻⁵ 10⁻⁵ V3d
410 -38 -83 23 9.9 0.21 10⁻⁵ 10⁻⁵ 10⁻⁵
412 -46 -77 10.1 0.14 10⁻⁵ 10⁻⁵ 10⁻⁵ hMT
413 33 -88 13.2 0.23 10⁻⁵ 10⁻⁵ 10⁻⁵ LO1
414 33 -96 10.7 0.17 10⁻⁵ 10⁻⁵ 10⁻⁵ V3d
415 39 -81 14 10.6 0.19 10⁻⁵ 10⁻⁵ 10⁻⁵ V3b
416 44 -78 10.9 0.22 10⁻⁵ 10⁻⁵ 10⁻⁵ LO2
418 32 -86 23 11.9 0.22 10⁻⁵ 10⁻⁵ 10⁻⁵ V3b
420 34 -69 32 8.5 0.34 n.s. 10⁻⁵ 0.02
421 32 -76 27 10 0.28 10⁻⁵ 10⁻⁴ 10⁻⁵ IPS0
Occipital (inferior) 423 -50 -68 -14 9.4 0.22 10⁻⁵ 10⁻⁵ 10⁻⁵
424 -31 -83 -8 12.6 0.22 10⁻⁵ 10⁻⁵ 10⁻⁵
425 -22 -95 -9 12.3 0.11 10⁻⁵ 10⁻⁵ 10⁻⁵
426 -42 -73 -8 11.6 0.21 10⁻⁵ 10⁻⁵ 10⁻⁵
428 36 -85 -7 13.2 0.21 10⁻⁵ 10⁻⁵ 10⁻⁵
430 42 -73 -9 11.4 0.25 10⁻⁵ 10⁻⁵ 10⁻⁵
Fusiform 432 -27 -71 -11 12.2 0.25 10⁻⁵ 10⁻⁵ 10⁻⁵ VO2
435 -33 -77 -17 12.4 0.15 10⁻⁵ 10⁻⁵ 10⁻⁵ hV4
436 -31 -53 -13 8.8 0.26 10⁻⁵ n.s. 10⁻⁵ PHC1
438 -41 -56 -17 8.9 0.26 n.s. 10⁻⁵ 0.01
440 -36 -63 -16 9.6 0.25 0.001 10⁻⁵ 10⁻⁵
442 28 -74 -11 12.8 0.31 10⁻⁵ 10⁻⁵ 10⁻⁵ hV4
443 36 -71 -16 10.8 0.25 0.02 10⁻⁵ 10⁻⁵ hV4
447 29 -47 -14 8.3 0.31 0.002 10⁻⁵ 10⁻⁵ PHC2
450 41 -48 -20 8.4 0.30 0.006 10⁻⁵ 0.02
452 28 -59 -12 9.7 0.35 10⁻⁵ 10⁻⁵ 10⁻⁵ VO2
Postcentral 476 60 -18 37 9.2 0.34 10⁻⁵ 10⁻⁵ 10⁻⁵
478 42 -31 49 8.5 0.25 10⁻⁵ 0.02 10⁻⁵
484 30 -40 61 8.5 0.20 10⁻⁵ 10⁻⁵ 10⁻⁵
Parietal (superior) 494 -26 -61 61 9.1 0.09 10⁻⁵ 10⁻⁵ 10⁻⁵
495 -27 -53 68 0.09 0.02 0.05 2×10⁻⁵
497 -21 -67 47 9.3 0.19 10⁻⁵ 10⁻⁵ 10⁻⁵ IPS1
498 -15 -69 50 8.5 0.12 0.01 10⁻⁵ 10⁻⁵ IPS2
499 -28 -69 50 8.9 0.29 0.001 3×10⁻⁵ 10⁻⁵
501 -17 -79 50 8.7 0.11 10⁻⁴ 3×10⁻⁵ 10⁻⁵ IPS1
502 30 -60 64 8.8 0.18 0.002 5×10⁻⁴ 10⁻⁵ IPS3
504 34 -62 57 9.3 0.30 5×10⁻⁵ 0.003 10⁻⁵
506 33 -50 58 9.8 0.34 10⁻⁵ 10⁻⁵ 10⁻⁵
507 21 -57 74 8.1 0.09 n.s. 10⁻⁵ 5×10⁻⁴
509 31 -73 53 8.6 0.13 n.s. 10⁻⁵ 0.001
510 21 -74 55 8.8 0.14 0.06 0.07 2×10⁻⁵ IPS1
511 20 -65 54 9.2 0.17 10⁻⁵ 10⁻⁵ 10⁻⁵ IPS2
Parietal (inferior) 514 -45 -30 42 8.6 0.24 0.002 10⁻⁵ 10⁻⁵
516 -32 -75 44 0.19 0.01 5×10⁻⁴ 10⁻⁵
521 -39 -47 42 8.6 0.17 n.s. 10⁻⁴ 0.005
522 -32 -46 50 8.5 0.12 10⁻⁵ 10⁻⁵ 10⁻⁵
523 -31 -53 46 8.4 0.18 10⁻⁵ n.s. 2×10⁻⁵
527 46 -39 50 8.6 0.35 n.s. 10⁻⁵ 0.001
529 37 -49 46 8.7 0.34 0.06 0.003 2×10⁻⁵
530 35 -44 51 9.2 0.29 5×10⁻⁴ 10⁻⁵ 10⁻⁵
Supramarginal 536 -61 -28 34 8.4 0.24 0.003 10⁻⁵ 10⁻⁵
539 44 -34 41 8.7 0.20 0.001 0.001 10⁻⁵
542 63 -24 37 8.5 0.22 10⁻⁵ n.s. 10⁻⁴
Angular 557 34 -60 44 8.7 0.38 n.s. 10⁻⁵ 0.005
Precuneus 561 -5 -77 53 7.9 0.06 10⁻⁵ 2×10⁻⁵ 10⁻⁵
573 -9 -71 54 8.1 0.08 10⁻⁵ 0.003 10⁻⁵
576 14 -71 45 7.9 0.12 10⁻⁵ 10⁻⁵ 10⁻⁵
Temporal (middle) 678 -45 -67 11 8.9 0.12 10⁻⁵ n.s. 10⁻⁴
685 -49 -62 9.3 0.19 10⁻⁵ 10⁻⁵ 10⁻⁵
701 52 -59 8.9 0.20 10⁻⁵ 10⁻⁵ 10⁻⁵
717 49 -69 10.5 0.18 10⁻⁵ 10⁻⁵ 10⁻⁵
Temporal (inferior) 728 -54 -58 -11 8.5 0.25 2×10⁻⁴ 10⁻⁵ 10⁻⁵ AIT
732 -45 -52 -13 9.3 0.30 10⁻⁵ 10⁻⁵ 10⁻⁵ AIT
755 46 -53 -11 9.7 0.42 10⁻⁵ 10⁻⁵ 10⁻⁵ AIT

Parcel ID, geometrical centroid x/y/z in MNI coordinates, average classification accuracy α, average novelty rate β, corrected significance p in structured or unstructured conditions (n=8), corrected significance p in both conditions (n=16), and topographical assignment, if any.
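To make the layout of Table A1 concrete, the following minimal Python sketch (illustrative only, not part of the original analysis pipeline) transcribes a few rows of the table into a data frame and selects parcels whose corrected significance reaches p < 0.05 in both the structured and the unstructured condition; the column names and the 0.05 threshold are our own choices for illustration.

```python
import pandas as pd

# A handful of rows transcribed from Appendix Table A1. The three p columns hold the
# corrected significances reported for the structured condition, the unstructured
# condition, and both conditions combined; None marks values reported as "n.s.".
parcels = pd.DataFrame(
    [
        # region, parcel, x, y, z, alpha_pct, beta, p_struct, p_unstruct, p_both, topography
        ("Calcarine", 331, -2, -91, -1, 16.9, 0.01, 1e-5, 1e-5, 1e-5, "V1v"),
        ("Fusiform", 442, 28, -74, -11, 12.8, 0.31, 1e-5, 1e-5, 1e-5, "hV4"),
        ("Parietal (superior)", 507, 21, -57, 74, 8.1, 0.09, None, 1e-5, 5e-4, None),
        ("Temporal (inferior)", 755, 46, -53, -11, 9.7, 0.42, 1e-5, 1e-5, 1e-5, "AIT"),
    ],
    columns=["region", "parcel", "x", "y", "z", "alpha_pct", "beta",
             "p_struct", "p_unstruct", "p_both", "topography"],
)

# Keep parcels that are significant in both condition types (NaN comparisons are False,
# so "n.s." entries drop out automatically).
significant_in_both = parcels[(parcels["p_struct"] < 0.05) & (parcels["p_unstruct"] < 0.05)]
print(significant_in_both[["region", "parcel", "alpha_pct", "topography"]])
```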

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.

Supplementary data