## Abstract

*Objective*: Visual expertise for particular categories of objects (e.g., mushrooms, birds, flowers, minerals, and so on) is known to enhance cortical responses in parts of the ventral occipitotemporal cortex. How is such additional expertise integrated into the prior cortical representation of life-long visual experience? To address this question, we presented synthetic visual objects rotating in three dimensions and recorded multivariate BOLD responses as initially unfamiliar objects gradually became familiar.

*Main results*: An analysis of pairwise distances between multivariate BOLD responses (“representational similarity analysis,” RSA) revealed that visual objects were linearly discriminable in large parts of the ventral occipital cortex, including the primary visual cortex, as well as in certain parts of the parietal and frontal cortex. These cortical representations were present from the start, when objects were still unfamiliar, and even though objects were shown from different sides. As shapes became familiar with repeated viewing, the distribution of responses expanded to fill more of the available space. In contrast, the distribution of responses to novel shapes (which appeared only once) contracted and shifted to the margins of the available space.

*Conclusion*: Our results revealed cortical representations of object shape and gradual changes in these representations with learning and consolidation. The cortical representations of once-viewed shapes that remained novel diverged dramatically from repeatedly viewed shapes that became familiar. This disparity was evident in both the similarity and the diversity of multivariate BOLD responses.

## 1 Introduction

An essential aspect of visual object recognition is the processing of visual shapes. The neural substrate of shape processing includes the ventral visual pathway, which in humans extends over the ventral occipitotemporal cortex from the occipital pole to the lateral occipital cortex, fusiform gyrus, and beyond (reviewed by Bi et al., 2016; Grill-Spector & Weiner, 2014; Kravitz et al., 2013; Weiner & Zilles, 2016). Functional imaging studies of ventral occipitotemporal cortex reveal intriguing functional anatomy, with responsiveness to specific object categories (e.g., faces, scenes, body parts) changing systematically over the cortical surface along several large-scale anatomical gradients (e.g., animate-inanimate, large-small, feature-whole, or perception-action; Freud et al., 2017; Grill-Spector & Weiner, 2014; Grill-Spector et al., 2004; Konkle & Oliva, 2012; Wurm & Caramazza, 2022; Yildirim et al., 2019).

Experience and learning improve object recognition performance, and also modify shape processing in the ventral occipitotemporal cortex. Indeed, functional imaging evidence shows that particular visual expertise—being able to identify and categorize visually similar objects of a particular kind—often entails moderate but anatomically distributed changes in the pre-existing responsiveness to shape (reviewed by Bukach et al., 2006; de Beeck & Baker, 2010; Gauthier & Tarr, 2016; Harel et al., 2013). This has been established by comparing novices and experts for identifying particular categories of natural objects (e.g., birds, mushrooms, minerals, degraded images; Cetron et al., 2019; Connolly et al., 2012; Duyck et al., 2021; Freud et al., 2017; Martens et al., 2018; McGugin et al., 2012; Roth & Zohary, 2015), as well as by comparing observers before and after they have learned to categorize initially unfamiliar synthetic shapes (e.g., computer-generated “greebles,” “spikies,” or “ziggerins”; Brants et al., 2011; de Beeck et al., 2006; Gauthier et al., 1999; A. C.-N. Wong et al., 2009; Y. K. Wong et al., 2012; Yue et al., 2006).

Here, we map the cortical representation of synthetic visual objects and track gradual changes as initially unfamiliar objects become progressively familiar with learning. We wondered how pre-existing shape representations would accommodate and integrate novel synthetic objects. We further wondered whether representational changes would be specific to learned objects or extend also to other objects of the same kind. To explore these questions, we analyzed “representational similarity” of spatiotemporal BOLD patterns (Haxby, 2012; Kriegeskorte, Mur, Ruff, et al., 2008), which offers a potentially sensitive measure for the information encoded in neural activity and may also be related to similarity as perceived by human observers (Charest & Kriegeskorte, 2015; Collins & Behrmann, 2020; Nestor et al., 2016).

Most previous studies of visual expertise identified cortical sites associated with a particular object category by comparing BOLD activity either between novices and experts or before and after learning. We extend this work in three ways: firstly, by establishing representational distance at the level of object exemplars rather than object categories; secondly, by monitoring gradual changes as observers gain familiarity with object exemplars; and thirdly, by analyzing changes in the diversity of multivariate BOLD activity. Few previous studies have attempted to resolve shape representations in such detail (Brants et al., 2016; Duyck et al., 2021; Eger et al., 2008; Visconti di Oleggio Castello et al., 2021). To advance the fine-grained analysis of representational geometry, we developed synthetic shapes for which visual expertise is acquired comparatively slowly (Kakaei et al., 2021) and took advantage of a numerically tractable method for linear discriminant analysis in $O(10^3)$-dimensional multivariate activity (DLDA; Yu & Yang, 2001).

Our results showed view-invariant representations of shape over surprisingly extensive regions of the ventral occipitotemporal cortex, including the fusiform gyrus, lateral occipital areas, and primary visual cortex. Representational distances were high from the start, even before learning, suggesting that new visual expertise was accommodated and encoded within pre-existing representations. However, shapes that appeared repeatedly (and were memorized by observers) and shapes that appeared just once (and were ignored) diverged dramatically, in terms of their cortical representations, while visual expertise was being acquired and consolidated.

## 2 Methods

### 2.1 Observers and behavior

Eight healthy observers (4 female and 4 male; aged 25 to 32 years) took part in behavioral training (“sham experiment,” one session per observer), the functional imaging experiment (“main experiment,” six scanning sessions per observer), and a final behavioral assessment (two sessions). All observers were paid and gave informed consent. Ethical approval was granted under Chiffre 30/21 by the ethics committee of the Faculty of Medicine of the Otto-von-Guericke University, Magdeburg.

In both sham and main experiments, observers viewed sequences of 200 recurring and non-recurring objects (see below and Fig. 1A) and attempted to classify each object as “familiar” or “novel” (by pressing the appropriate button). Over the course of multiple sessions, observers gradually became familiar with recurring objects and thus became able to distinguish them from non-recurring objects. Objects of the sham experiment were two-dimensional shapes, whereas objects of the main experiment were rotating, three-dimensional shapes (see below and Fig. 1A).

The main experiment extended over 3 successive weeks, with three sessions on separate days of both the 1st and 3rd week (no sessions took place in the 2nd week). The experiments of the 1st and 3rd week differed in four aspects: sequence type (structured or unstructured), the set of recurring objects, object color (red or blue), and responding hand (left or right). All aspects were counterbalanced across observers.

After the three scanning sessions of a week, observers participated in an additional behavioral session to confirm that they had in fact become familiar with every recurring object. Specifically, they performed a spatial search task in which they pointed out recurring target objects among non-recurring distractor objects (Kakaei et al., 2021). In addition, observers were offered the opportunity to voice anything they might have noticed about the experiment.

### 2.2 Experimental paradigm

Complex three-dimensional objects were computer-generated and presented as described previously (Kakaei et al., 2021). A movie can be viewed under this LINK. All objects were highly characteristic and dissimilar from each other, as confirmed computationally in terms of vector distances between depth maps (Kakaei et al., 2021). Objects were presented every $3\,\mathrm{s}$, with $2.5\,\mathrm{s}$ viewing and $0.5\,\mathrm{s}$ transition time (Fig. 1A). Objects were shown from all sides and, after appearing at an arbitrary angle, revolved smoothly for one full turn (period $2.5\,\mathrm{s}$, frequency $0.4\,\mathrm{Hz}$, angular frequency $144^\circ/\mathrm{s}$) about one of several axes in the frontal plane ($-45^\circ$, $0^\circ$, $45^\circ$, clockwise or counter-clockwise). Axes and directions were counterbalanced for each object, and initial viewing angles were chosen randomly (Fig. 1B). All stimuli were generated with MATLAB (The MathWorks, Inc.), presented with the Psychophysics Toolbox (Brainard, 1997), and viewed in a mirror mounted to the MR head coil (screen resolution $960 \times 720$ pixels, frame rate $60\,\mathrm{Hz}$, subtending approximately $8^\circ \times 6^\circ$ of visual angle, average luminance $50\,\mathrm{cd/m^2}$, background luminance $5\,\mathrm{cd/m^2}$). Observers responded with the right or left index finger on an MR-safe response box.

Fifteen objects recurred many times during three sessions (“recurring” objects), whereas other objects appeared exactly once (“non-recurring” or “singular” objects). As mentioned, observers classified every object as either “familiar” or “unfamiliar” by pressing a button during its presentation. Over the course of three sessions, all observers gradually became familiar with the “recurring objects” (see below). The average time-course of learning, as established by a simplified signal detection and reaction-time (RT) analysis, is shown in Figure 1C.

Every session comprised six sequences (“runs”), each lasting $600\,\mathrm{s}$ and presenting $180$ “recurring” and $20$ “non-recurring” objects ($200$ objects in total). As there were 15 different recurring objects, each such object was seen $12 \pm 1.9$ times during every sequence. Over the three sessions (or 18 sequences), each recurring object appeared at least $190$ times (mean $\pm$ S.D.: $216 \pm 9$), whereas each non-recurring object appeared only once. Altogether, there were $3{,}240$ presentations of recurring objects ($3 \times 6 \times 180$) and $360$ presentations of non-recurring objects ($3 \times 6 \times 20$).
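The presentation counts follow directly from the design. The short check below recomputes them in Python; the constant names are chosen here for illustration and do not come from the original study code.

```python
# Design constants as stated in the text.
N_SESSIONS, RUNS_PER_SESSION = 3, 6
RECURRING_PER_RUN, NONRECURRING_PER_RUN = 180, 20
N_RECURRING_OBJECTS = 15
RUN_DURATION_S = 600

n_runs = N_SESSIONS * RUNS_PER_SESSION                      # 18 sequences
recurring_total = n_runs * RECURRING_PER_RUN                # presentations of recurring objects
nonrecurring_total = n_runs * NONRECURRING_PER_RUN          # presentations of non-recurring objects
mean_per_object_per_run = RECURRING_PER_RUN / N_RECURRING_OBJECTS
mean_per_object_total = recurring_total / N_RECURRING_OBJECTS
seconds_per_object = RUN_DURATION_S / (RECURRING_PER_RUN + NONRECURRING_PER_RUN)

print(recurring_total, nonrecurring_total,
      mean_per_object_per_run, mean_per_object_total, seconds_per_object)
```

The mean of 216 presentations per recurring object and the 3 s presentation interval both agree with the values quoted in the text.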

Presentation sequences started with a random recurring object and continued randomly to one of the possible next objects, with neither immediate repetitions ($X \to X$) nor direct returns ($X \to Y \to X$) being allowed. Sequences comprised $200$ objects, of which $180$ were recurring and $20$ were non-recurring; the latter were interspersed at random intervals. Object sequences were post-selected so as to counterbalance the number of appearances of every recurring object in every session.
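A minimal sketch of such a constrained random sequence is given below. It is not the authors' generator: the function name `make_sequence` is hypothetical, the two constraints are applied to the subsequence of recurring objects (an assumption), and both the sequential dependencies of “structured” sequences and the post-selection for counterbalancing are omitted.

```python
import random

def make_sequence(n_objects=15, n_recurring=180, n_nonrecurring=20, seed=0):
    """Return a list of ("R", object_id) and ("N", novel_id) entries."""
    rng = random.Random(seed)
    recurring = []
    for _ in range(n_recurring):
        banned = set(recurring[-2:])          # forbids X->X and X->Y->X
        candidates = [o for o in range(n_objects) if o not in banned]
        recurring.append(rng.choice(candidates))

    # Intersperse the non-recurring objects at random positions.
    n_total = n_recurring + n_nonrecurring
    novel_positions = set(rng.sample(range(n_total), n_nonrecurring))
    sequence, next_recurring, next_novel = [], iter(recurring), 0
    for position in range(n_total):
        if position in novel_positions:
            sequence.append(("N", next_novel))  # each novel object appears once
            next_novel += 1
        else:
            sequence.append(("R", next(next_recurring)))
    return sequence

sequence = make_sequence(seed=1)
print(len(sequence), sum(kind == "R" for kind, _ in sequence))
```

Applying both constraints in this way leaves 13 candidates at each step. The exact transition rules used in the experiment are described in Kakaei et al. (2021).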

All observers performed the experiment twice in the scanner, once during the 1st week and again during the 3rd week of the main experiment (so that 8 observers provided 16 data sets). As mentioned, the 2 weeks differed in terms of the recurring objects and the presentation sequence. “Structured” sequences exhibited predictive sequential dependencies (3 possible recurring next objects), whereas “unstructured” sequences did not (14 possible recurring next objects, see Kakaei et al., 2021 for details). As a result, the repetition latency (i.e., the latency of successive presentations of the same object) was $5.5 \pm 15$ (median and S.D.) for “structured” and $10.5 \pm 11$ for “unstructured” sequences. Further aspects and effects of sequence structure are reported and discussed in detail in a companion paper.

To verify that recurring objects had become familiar to observers, every observer performed 60 trials of a spatial search task with 3 recurring and 9 non-recurring objects. The 12 objects were positioned randomly in a $3 \times 4$ array and were presented for 30 s while rotating in three dimensions (as in the main experiment). After each presentation, observers indicated the recurring object positions with the computer mouse. Performance was consistently above 95% correct.

### 2.3 MRI acquisition

All magnetic-resonance images were acquired on a 3T Siemens Prisma scanner with a 64-channel head coil. Structural images were T1-weighted sequences (MPRAGE; TR = 2,500 ms, TE = 2.82 ms, TI = 1,100 ms, $7^\circ$ flip angle, isotropic resolution $1 \times 1 \times 1\,\mathrm{mm}$ and matrix size of $256 \times 256 \times 192$). Functional images were T2*-weighted sequences (TR = 1,000 ms, TE = 30 ms, $65^\circ$ flip angle, resolution of $3 \times 3 \times 3.6\,\mathrm{mm}$ and matrix size of $72 \times 72 \times 36$). Field maps were obtained by gradient dual-echo sequences (TR = 720 ms, TE1 = 4.92 ms, TE2 = 7.38 ms, resolution of $1.594 \times 1.594 \times 2\,\mathrm{mm}$ and matrix size of $138 \times 138 \times 72$).

### 2.4 fMRI pre-processing

Our approach to fMRI analysis was influenced by recent advances in comparing uni- and multivariate responses of corresponding voxels between different observers (e.g., Kumar et al., 2022; Nastase et al., 2019). The *local* correlation structure of voxel response, which is similar in different observers, provided the basis for our functional parcellation (Dornas & Braun, 2018). The parcellation obviated “searchlight” strategies by defining for all observers corresponding brain “parcels” with corresponding episodes of high-dimensional (O(1000)) multivariate activity.

The fMRI pre-processing procedure was similar to that published previously (Dornas & Braun, 2018). First, DICOM files were converted into NIFTI format using MRICRON (MRICRON Toolbox, Maryland, USA, NIH). Then, brain tissues were extracted and segmented using BET (Smith, 2002) and FAST (Zhang et al., 2001). Field map correction, head motion correction, spatial smoothing, high-pass temporal filtering, and registration to structural and standard images were performed with the MELODIC package of FSL (Beckmann & Smith, 2004).

Field map correction and registration to the structural image were carried out using Boundary-Based Registration (BBR; Greve & Fischl, 2009). MELODIC uses MCFLIRT (Jenkinson et al., 2002) to correct for head motion. Spatial smoothing was performed with SUSAN (Smith & Brady, 1997), with full width at half maximum set at FWHM $= 5\,\mathrm{mm}$. To remove low-frequency artifacts, we applied a high-pass filter with a cut-off frequency of $f = 0.01\,\mathrm{Hz}$, that is, oscillations/events with periods of more than $100$ s were removed. To register the structural image to Montreal MNI152 standard space with isotropic $2\,\mathrm{mm}$ voxel size, we used FLIRT (FMRIB’s Linear Image Registration Tool; Jenkinson & Smith, 2001; Jenkinson et al., 2002) with 12 degrees of freedom (DOF) and FNIRT (FMRIB’s Nonlinear Image Registration Tool) to apply the non-linear registration. To further reduce artifacts arising from head motion, we applied despiking with a threshold of $\lambda = 100$ using the BrainWavelet toolbox (Patel et al., 2014). Subsequently, we regressed out the mean CSF activity as well as 12 DOF translation and rotation factors predicted by the motion correction algorithm (MCFLIRT). Afterward, the time series of each voxel was detrended linearly and whitened (with MATLAB functions “detrend” and “zscore”).
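The last step, linear detrending followed by standardization, can be illustrated for a single voxel time series. The sketch below is a plain-Python stand-in for the two MATLAB calls, not the authors' code. It assumes a least-squares straight-line fit and the sample standard deviation (normalization by $N-1$), which are the MATLAB defaults.

```python
from statistics import mean, stdev

def detrend_linear(y):
    """Subtract the least-squares straight line from a time series."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = mean(y)
    slope = (sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
             / sum((t - t_mean) ** 2 for t in range(n)))
    intercept = y_mean - slope * t_mean
    return [v - (intercept + slope * t) for t, v in enumerate(y)]

def zscore(y):
    """Standardize to zero mean and unit sample standard deviation."""
    m, s = mean(y), stdev(y)
    return [(v - m) / s for v in y]

# Toy time series: linear drift plus an alternating signal.
series = [2.0 * t + 1.0 + (-1) ** t for t in range(10)]
standardized = zscore(detrend_linear(series))
```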

Finally, the $160{,}099$ voxels of MNI152 space were grouped into $758$ functional parcels according to the MD758 atlas (Dornas & Braun, 2018). Each functional parcel is associated with an anatomically labeled region of the AAL atlas (Tzourio-Mazoyer et al., 2002) and comprises approximately $200$ voxels or approximately $1.7\,\mathrm{cm^3}$ of gray matter volume ($212 \pm 70$ voxels, range $45$ to $462$ voxels). Parcels were defined for a small population of observers such as to maximize signal covariance *within* and minimize covariance *between* parcels in the resting state. In contrast to other parcellation schemes, this was based exclusively on the (typically strong) functional correlations within each anatomical region and disregarded the (typically weak) correlations between different anatomical regions. The MD758 parcellation offers superior cluster quality, correlational structure, sparseness, and consistency with fiber tracking, compared to other parcellation schemes of similar resolution (Albers et al., 2021; Dornas & Braun, 2018).

### 2.5 fMRI data analysis

To study the neural representation of objects, we extracted the multivoxel activity pattern at $N_t = 9$ time points following object onset. In a functional parcel with $N_{vox}$ voxels, this response pattern constituted a point (or vector) in an $N_{dim}$-dimensional space, where $N_{dim} = N_t \cdot N_{vox}$ (Fig. 2A). To identify parcels with significant selectivity for individual recurring objects, we employed a representational similarity analysis (RSA; Kriegeskorte, Mur, & Bandettini, 2008) (Fig. 2B). This analysis uses the standardized Euclidean (Mahalanobis) distance between responses in a high-dimensional space to examine the separability of neural object representations as a function of learning, or object type (recurring or non-recurring), or both. Over all 758 parcels, response dimensionality was $N_{dim} = 1{,}911 \pm 634$ (mean and standard deviation), with a range from $405$ (Calcarine-L-329, with $45$ voxels) to $4{,}113$ (Postcentral-R-484, with $457$ voxels).

Our approach to RSA differed from previous work in some respects. Firstly, we analyzed high-dimensional *spatiotemporal* patterns of BOLD activity ($200$ voxels $\times$ $9$ s, or $O(10^3)$ dimensions) in non-overlapping gray matter volumes ($758$ functional subdivisions of $90$ anatomical regions, averaging $1.7\,\mathrm{cm^3}$; Dornas & Braun, 2018). Other studies have used lower-dimensional *spatial* activity patterns in overlapping searchlight volumes ($O(10^2)$ voxels or dimensions, covering $0.25$ to $1.0\,\mathrm{cm^3}$; Kriegeskorte et al., 2006). Secondly, we employed multi-class linear discriminant analysis (“direct linear discriminant analysis,” DLDA; Yu & Yang, 2001), rather than pairwise discriminability or one-versus-all discriminability (e.g., Hung et al., 2005; Liu et al., 2009). With these modifications, RSA revealed representational geometry at the level of object exemplars, as well as gradual changes in this geometry over sessions and runs.

#### 2.5.1 Linear discriminant analysis

To analyze the response variance that discriminates $\kappa = 15$ recurring objects, at most $\kappa - 1$ dimensions are required. Restricting the analysis to $14$ principal components of the response could potentially have neglected smaller but more discriminating components. Accordingly, we performed a Linear Discriminant Analysis (LDA), which amounts to a “supervised” principal component analysis (PCA) and yields the $(\kappa - 1)$-dimensional orthonormal subspace $S$ that optimally discriminates the $\kappa$ response classes. Here, optimality is defined as simultaneously minimizing within-class variance and maximizing between-class variance of responses.

The results of LDA and PCA showed considerable commonality. Over the 758 parcels, the first 14 principal components captured $53 \pm 7\%$ (mean and S.D.) of the total response variance, whereas the 14-dimensional subspaces $S$ captured $33 \pm 7\%$ of the total variance (or $61 \pm 6\%$ of the principal component variance). Almost all of the subspace variance overlapped with the principal component variance (i.e., $88 \pm 5\%$ of subspace variance projected into the space of the first 14 principal components, while the remaining $12 \pm 5\%$ projected into the space of the remaining principal components).

Similar numbers were obtained for the 124 identity-selective parcels. The first 14 principal components captured $57 \pm 6\%$ (mean and S.D.) of the total response variance, and subspaces $S$ captured $38 \pm 6\%$ of the total variance (or $67 \pm 4\%$ of the principal component variance). Almost all of the subspace variance ($91 \pm 3\%$) overlapped with the first 14 principal components. In summary, Linear Discriminant Analysis captured the useful (discriminating) part of correlated variance and distributed this variance more uniformly over its 14 orthonormal dimensions ($6 \pm 3\%$ per dimension) than principal component analysis could ($4 \pm 6\%$ per dimension).

A numerically tractable procedure for identifying the optimal subspace $S$ is available in terms of “direct LDA” or DLDA (Ye et al., 2006; Yu & Yang, 2001). Briefly, this method first diagonalizes between-class variance to identify $\kappa - 1$ discriminative eigenvectors with non-zero eigenvalues, next diagonalizes within-class variance, and finally yields a rectangular matrix for projecting activity patterns from the original activity space (dimensionality $N_{dim}$) to the maximally discriminative subspace $S$ and back. As this method is linear and relies on all available degrees of freedom, its results are deterministic. An important feature of this particular algorithm is that within-class variance is maintained near unity for all classes, by means of a suitable scaling of the subspace dimensions. The link github.com/cognitive-biology/DLDA provides a MATLAB implementation of DLDA.
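The three steps can be sketched with NumPy as follows. This is an illustrative reading of the procedure of Yu & Yang (2001), not a port of the MATLAB implementation linked above. The function names `dlda_fit` and `dlda_project`, the relative tolerance `tol`, and the toy data are assumptions made for this example.

```python
import numpy as np

def dlda_fit(X, y, tol=1e-10):
    """X: (n_trials, n_dim) activity patterns, y: class labels.
    Returns a projection T of shape (n_classes - 1, n_dim) and the overall mean."""
    classes, idx = np.unique(y, return_inverse=True)
    idx = idx.ravel()
    n = X.shape[0]
    counts = np.bincount(idx)
    mu = X.mean(axis=0)
    means = np.stack([X[idx == c].mean(axis=0) for c in range(len(classes))])

    # Between-class scatter S_b = Phi @ Phi.T, with one column per class.
    Phi = (np.sqrt(counts / n)[:, None] * (means - mu)).T   # (n_dim, n_classes)

    # Step 1: diagonalize S_b through the small matrix Phi.T @ Phi.
    lam, V = np.linalg.eigh(Phi.T @ Phi)
    keep = lam > tol * lam.max()                  # discard the null direction
    Y = Phi @ V[:, keep] / np.sqrt(lam[keep])     # orthonormal eigenvectors of S_b
    Z = Y / np.sqrt(lam[keep])                    # now Z.T @ S_b @ Z = identity

    # Step 2: diagonalize the within-class scatter inside this subspace.
    W = (X - means[idx]) @ Z
    dw, U = np.linalg.eigh(W.T @ W / n)

    # Step 3: rescale so that within-class variance is 1 in every dimension.
    T = (U.T @ Z.T) / np.sqrt(dw)[:, None]
    return T, mu

def dlda_project(X, T, mu):
    """Project patterns into the discriminative subspace."""
    return (X - mu) @ T.T

# Toy example: 3 classes, 30 trials each, 50 dimensions.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 30)
patterns = rng.normal(size=(90, 50)) + 3.0 * rng.normal(size=(3, 50))[labels]
T, mu = dlda_fit(patterns, labels)
projected = dlda_project(patterns, T, mu)   # shape (90, 2)
```

Working with the small matrix `Phi.T @ Phi` (size $\kappa \times \kappa$) avoids forming any $N_{dim} \times N_{dim}$ scatter matrix, which is what keeps the computation tractable in $O(10^3)$ dimensions.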

#### 2.5.2 Amplitudes, distances, and correlations

Activity patterns $x_{jk}$ associated with trials $k$ were analyzed in the maximally discriminative subspace $S$. The normalized amplitude $a_k = \sqrt{\frac{1}{\kappa-1}\sum_{j=1}^{\kappa-1} x_{jk}^2}$ of such patterns exhibited an average value of $\langle a \rangle = 0.99$. The normalized distance $d_{kl} = \sqrt{\frac{1}{\kappa-1}\sum_{j=1}^{\kappa-1} (x_{jk} - x_{jl})^2}$ between patterns associated with trials $k$ and $l$ measured on average $\langle d \rangle = 1.40$, consistent with the distance expected between random patterns of this amplitude ($\sqrt{2}$). Averaging over trials $k$ produced normalized response amplitudes $A = \langle a_k \rangle_k$. Averaging over pairs of trials $k$, $l$ separated by a given latency $l - k$ produced normalized response distances $D(l-k) = \langle d_{kl} \rangle_{k,l}$.
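The expected distance of $\sqrt{2}$ between unrelated patterns of unit amplitude can be checked numerically. The sketch below assumes that amplitude and distance are root-mean-square quantities over the $\kappa - 1 = 14$ subspace dimensions, and it uses independent Gaussian patterns as a stand-in for real responses.

```python
import math
import random

def amplitude(x):
    """Root-mean-square amplitude of one pattern."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def distance(x, y):
    """Root-mean-square distance between two patterns."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

rng = random.Random(0)
n_dim = 14
patterns = [[rng.gauss(0.0, 1.0) for _ in range(n_dim)] for _ in range(2000)]

mean_amplitude = sum(amplitude(p) for p in patterns) / len(patterns)
pairs = list(zip(patterns[0::2], patterns[1::2]))
mean_distance = sum(distance(p, q) for p, q in pairs) / len(pairs)
print(round(mean_amplitude, 2), round(mean_distance, 2), round(math.sqrt(2), 2))
```

In only 14 dimensions both averages fall slightly below their asymptotic values of $1$ and $\sqrt{2}$, which is in line with the reported $\langle a \rangle = 0.99$ and $\langle d \rangle = 1.40$.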

The patterns from successive trials exhibited a weak temporal correlation, with approximately $5\%$ smaller distances at delays below $4$ trials and approximately $2\%$ larger distances at delays ranging from $6$ to $15$ trials (see Supplementary Fig. S1A, B). Comparing pairs of trials with different types of objects, we observed approximately $3\%$ *larger* response distances $D$ (at all latencies) for the same recurring objects than for either different recurring or non-recurring objects (Supplementary Fig. S1C). Differential response amplitudes $A$ increased marginally with latency, because response amplitudes tended to increase slightly over the course of each run (Supplementary Fig. S1D). This trend was evident for all types of objects and with both “structured” and “unstructured” sequences. In other words, the effect of object type on multivariate hemodynamic responses was limited to response distances and did not extend to response amplitudes. Thus, our data provided no evidence for “repetition suppression.”

For certain analyses (Sections 2.5.8 and 2.5.9), we established for each parcel $w$ the average delay-dependent distance $T_w(\Delta k) = \langle d_{w,u,r}(\Delta k) \rangle_{u,r}$ between patterns with a relative delay of $\Delta k$ trials, where the average was taken over subjects $u$ and runs $r$. The time-course $T_w$ allowed us to discount temporal correlations by computing $d^{corrected}_{w,u,r}(\Delta k) = d_{w,u,r}(\Delta k) - T_w(\Delta k) + \langle T_w(\Delta k) \rangle_{\Delta k}$, where $\langle T_w(\Delta k) \rangle_{\Delta k}$ is the average value over delays $\Delta k$.

#### 2.5.3 Representation of shape “identity” for recurring objects

Our observations comprised approximately $200$ activity patterns for each of the $15$ recurring object classes (per observer and condition). To allow for cross-validation, we randomly divided these patterns into a larger “training set” ($90\%$ or $190 \pm 7.7$ per object class) and a smaller “test set” ($10\%$ or $22 \pm 0.9$ per object class) (Fig. 2B). Note that the “training set” comprised exclusively activity patterns associated with recurring objects. To reduce the variability introduced by random test sets, this selection was repeated $N_r = 20$ times and all statistical measures described below represent the average over repetitions. As illustrated in Figure 2C, in the discriminative subspace $S$, we compared the $n_i$ test set exemplars $x_k^i$ (where $k = 1, \ldots, n_i$) of class $i$ to the centroids $c_j^{train}$ established for the training exemplars of class $j$. To compute Mahalanobis distances and variance ratios (see below), we compared test set exemplars $x_k^i$ of class $i$ to the centroids $c_j^{test}$ of test set exemplars of class $j$.

We used three measures for this comparison, all with comparable results. Firstly, the nearest class centroid $c_j^{train}$ to each pattern exemplar $x_k^i$ was identified to establish a matrix of classification probabilities $P(j \mid i)$ (probability that an exemplar of class $i$ is nearest to the centroid of class $j$), also known as “confusion matrix,” as well as the “classification accuracy” $\alpha = \sum_i P(i \mid i)\,P(i)$, which is the probability that the nearest centroid is the correct one.
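As a minimal sketch of this first measure, the function below assigns each test exemplar to the nearest training centroid and accumulates the confusion matrix $P(j \mid i)$ and the accuracy $\alpha$. The function names and the dictionary-based data layout are assumptions made for illustration.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equally long vectors."""
    return [sum(p[d] for p in points) / len(points) for d in range(len(points[0]))]

def nearest_centroid_confusion(train, test):
    """train, test: dict mapping class label -> list of patterns.
    Returns (P, accuracy), where P[i][j] estimates P(j | i)."""
    labels = sorted(train)
    centroids = {j: centroid(train[j]) for j in labels}
    n_test = sum(len(test[i]) for i in labels)
    P = {i: {j: 0.0 for j in labels} for i in labels}
    accuracy = 0.0
    for i in labels:
        for x in test[i]:
            nearest = min(labels, key=lambda j: math.dist(x, centroids[j]))
            P[i][nearest] += 1.0 / len(test[i])
        accuracy += P[i][i] * len(test[i]) / n_test   # P(i | i) * P(i)
    return P, accuracy

train = {0: [[0.0, 0.0], [0.0, 1.0]], 1: [[5.0, 5.0], [5.0, 6.0]]}
test = {0: [[0.2, 0.4], [4.9, 5.2]], 1: [[5.1, 5.4]]}
P, accuracy = nearest_centroid_confusion(train, test)
```

In this toy example one of the two class-0 exemplars lies nearer to the class-1 centroid, so $P(0 \mid 0) = 0.5$ and the accuracy is $2/3$. For $\kappa = 15$ equally frequent classes, chance accuracy would be $1/15 \approx 6.67\%$.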

Secondly, for each pair of object classes $(i,j)$, object exemplars $x_k^i$ and $x_k^j$ from the test set were projected onto the line connecting the two test set centroids, $c_i^{test}$ and $c_j^{test}$, and a pairwise discriminability/dissimilarity/Mahalanobis distance $\delta_{i,j}$ was computed from the means, $\mu_i$ and $\mu_j$, and variances, $\sigma_i^2$ and $\sigma_j^2$, of these projections, as $\delta_{i,j} = \frac{|\mu_i - \mu_j|}{\sqrt{\frac{1}{2}(\sigma_i^2 + \sigma_j^2)}}$. The average over all pairs of object classes was computed as $\delta = \frac{2}{\kappa(\kappa-1)} \sum_{i<j} \delta_{i,j}$.
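The second measure can be sketched as follows. The code assumes that $\delta_{i,j}$ is the absolute difference of the projected means divided by the square root of the average of the two projected variances, and that sample variances (normalization by $n-1$) are used. Neither detail is confirmed by the text, and the function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, variance

def pairwise_discriminability(xi, xj):
    """Discriminability of two classes along the line joining their centroids."""
    ci = [mean(col) for col in zip(*xi)]
    cj = [mean(col) for col in zip(*xj)]
    axis = [b - a for a, b in zip(ci, cj)]
    norm = math.sqrt(sum(a * a for a in axis))
    unit = [a / norm for a in axis]
    proj_i = [sum(u * v for u, v in zip(unit, x)) for x in xi]
    proj_j = [sum(u * v for u, v in zip(unit, x)) for x in xj]
    pooled = 0.5 * (variance(proj_i) + variance(proj_j))
    return abs(mean(proj_i) - mean(proj_j)) / math.sqrt(pooled)

def mean_discriminability(classes):
    """Average over all unordered pairs of classes."""
    return mean(pairwise_discriminability(a, b) for a, b in combinations(classes, 2))

class_a = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
class_b = [[4.0, 0.0], [6.0, 0.0], [4.0, 2.0], [6.0, 2.0]]
delta = pairwise_discriminability(class_a, class_b)
```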

Thirdly, given class centroids $c_i^{test}$ and overall centroid $c^{test}$, we computed the Euclidean distances $d_k^i = \Vert x_k^i - c_i^{test} \Vert$ between exemplars $x_k^i$ and class centroid $c_i^{test}$ and, for each object class $i$, the “sum of squares” as $SS_W^i = \sum_{k=1}^{n_i} (d_k^i)^2$. The “within-class” variance of all classes was computed as $SS_W = \frac{1}{N} \sum_{i=1}^{\kappa} SS_W^i$, where $N = \sum_{i=1}^{\kappa} n_i$. Similarly, from the Euclidean distances $d_i = \Vert c_i^{test} - c^{test} \Vert$ between individual and overall centroids, we computed “between-class” variance $SS_B = \frac{1}{N} \sum_{i=1}^{\kappa} n_i d_i^2$. From the Euclidean distances $d_k^i = \Vert x_k^i - c^{test} \Vert$ between exemplars and overall centroid, we computed “total” variance $SS_T = \frac{1}{N} \sum_{i=1}^{\kappa} \sum_{k=1}^{n_i} (d_k^i)^2$. Variances $SS_W$, $SS_B$, and $SS_T$ are also denoted, respectively, $SS_{same}$, $SS_{diff}$, and $SS_{fam}$ further below. To quantify the discriminability of classes, the variance ratio $F^{identity} = \frac{SS_B\,(N-\kappa)}{SS_W\,(\kappa-1)}$ provided a non-parametric multivariate statistic (PERMANOVA; Anderson, 2001). The average within-class and between-class dispersion per dimension could be estimated as $\sigma_W = SS_W/(N-\kappa)$ and $\sigma_B = SS_B/(\kappa-1)$, respectively.
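A compact sketch of the third measure follows. It computes the within-class, between-class, and total variances and the variance ratio $F = SS_B (N-\kappa) / (SS_W (\kappa-1))$. The function name and the list-of-lists data layout are assumptions.

```python
def variance_ratio(classes):
    """classes: list of classes, each a list of patterns (lists of floats).
    Returns (SS_W, SS_B, SS_T, F)."""
    def centroid(points):
        return [sum(p[d] for p in points) / len(points) for d in range(len(points[0]))]

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    kappa = len(classes)
    N = sum(len(c) for c in classes)
    class_centroids = [centroid(c) for c in classes]
    overall = centroid([x for c in classes for x in c])

    ss_w = sum(sqdist(x, ci) for c, ci in zip(classes, class_centroids) for x in c) / N
    ss_b = sum(len(c) * sqdist(ci, overall) for c, ci in zip(classes, class_centroids)) / N
    ss_t = sum(sqdist(x, overall) for c in classes for x in c) / N
    f_ratio = ss_b * (N - kappa) / (ss_w * (kappa - 1))
    return ss_w, ss_b, ss_t, f_ratio

toy = [[[0.0, 0.0], [2.0, 0.0]], [[4.0, 0.0], [6.0, 0.0]]]
ss_w, ss_b, ss_t, f_ratio = variance_ratio(toy)
```

With these definitions the three variances satisfy $SS_T = SS_W + SS_B$. In the toy example this gives $1 + 4 = 5$.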

#### 2.5.4 Minimum statistic

To test for statistical significance, we computed average classification performance (in terms of both classification accuracy $\alpha_{obs}$ and F-ratio $F_{obs}$) over $N_r$ test sets, as well as over $10^3$ first-level permutations of object identities (in each of the $N_r$ test sets). In principle, we could have tested an “individual null” hypothesis for every parcel and every data set, namely, the probability of obtaining the observed performance $\alpha_{obs}$ (or $F_{obs}$) purely by chance. Instead, we computed the “minimum statistic” $m = \min_k \alpha_k$ (or $m = \min_k F_k$) over data sets $k$, as well as over $10^5$ second-level permutations (drawn randomly from the first-level permutations) and tested the “global null” hypothesis, namely, the probability $p_n(m)$ of obtaining the observed minimum performance over $n$ data sets purely by chance. This computation was performed separately for each of the 2 conditions (8 data sets from 8 observers per condition) as well as for the union of conditions (16 data sets from 8 observers). When the “global null” hypothesis could be rejected, we inferred statistically significant classification performance in at least *some* data sets. Our threshold for significance was $p_n^\star(m) < 0.05$ after correction for multiple comparisons ($758$ parcels and $2$ conditions) (Allefeld et al., 2016).
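The second-level permutation scheme can be sketched as below. The function name is hypothetical, and counting the unpermuted data as one of the permutations is an assumption that follows common practice for permutation tests. The correction for multiple comparisons across parcels and conditions is omitted.

```python
import random

def min_statistic_p(observed, permuted, n_second_level=100_000, seed=0):
    """observed: one statistic per data set.
    permuted: for each data set, a list of first-level permutation statistics.
    Returns the estimated probability of a minimum at least as large as observed."""
    rng = random.Random(seed)
    m_obs = min(observed)
    count = 1                      # the unpermuted data count as one permutation
    for _ in range(n_second_level - 1):
        # Draw one first-level permutation per data set and take the minimum.
        m_perm = min(rng.choice(values) for values in permuted)
        if m_perm >= m_obs:
            count += 1
    return count / n_second_level

observed = [0.50, 0.60, 0.55]                    # toy accuracies of 3 data sets
permuted = [[0.05, 0.07, 0.10]] * 3              # toy first-level null values
p_global = min_statistic_p(observed, permuted, n_second_level=1000)
```

Because the minimum is taken over data sets, a single data set at chance level is enough to make the global null hypothesis impossible to reject.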

#### 2.5.5 Prevalence analysis

To summarize the results from all observers and conditions, we used a “prevalence analysis” (Allefeld et al., 2016). Prevalence $\gamma_{true}$ is the fraction of significant performance over $n = 16$ data sets. To test the “prevalence null” hypothesis that $\gamma_{true}$ is below a threshold $\gamma_0 = 0.5$, an upper bound for $P(\gamma_{true} < \gamma_0)$ was obtained from the probability $p_n^\star(m)$ of the minimum statistic over $n = 16$ data sets, after correction for multiple comparisons:

$$P(\gamma_{true} < \gamma_0) \le \left[ (1 - \gamma_0)\, p_n^\star(m)^{1/n} + \gamma_0 \right]^n$$

This was the criterion used to label parcels as “identity selective.” Threshold prevalence $\gamma \simeq 0.5$ corresponded to corrected probability $p_n^\star(m) \simeq 0.0012$ and *minimal* accuracy of $6.67\%$ (i.e., near chance).

Additionally, we computed $\gamma_{est}$ as the largest value for which the “prevalence null” hypothesis could be rejected from

$$\gamma_{est} = \frac{\alpha^{1/n} - p_n^\star(m)^{1/n}}{1 - p_n^\star(m)^{1/n}}$$

where $p_n^\star(m)$ is the corrected minimum probability, $n = 16$ the number of data sets, and $\alpha = 0.05$ the significance threshold.
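For a numerical illustration, the sketch below evaluates the prevalence bound in the form given by Allefeld et al. (2016). Whether this is exactly the expression used here is an assumption, and the function names are hypothetical. The assumed form does reproduce the correspondence quoted above between $\gamma \simeq 0.5$ and $p_n^\star(m) \simeq 0.0012$.

```python
def prevalence_bound(p_min, n, gamma0):
    """Upper bound on the probability of the observed minimum statistic
    if the true prevalence were at most gamma0."""
    return ((1.0 - gamma0) * p_min ** (1.0 / n) + gamma0) ** n

def gamma_est(p_min, n, alpha=0.05):
    """Largest prevalence threshold that can still be rejected at level alpha."""
    if p_min >= alpha:
        return 0.0                 # not even the global null can be rejected
    q = p_min ** (1.0 / n)
    return (alpha ** (1.0 / n) - q) / (1.0 - q)

estimate = gamma_est(0.0012, n=16)
print(round(estimate, 2))
```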

#### 2.5.6 Representation of shape “novelty” for non-recurring objects

Although recurring and non-recurring objects were comparable and generated in the same way, it seemed possible that neural representations might discriminate the class of 15 recurring objects from the class of 360 non-recurring objects. Indeed, the two classes became discriminable after observers had learned to classify recurring objects as “familiar” and non-recurring objects as “novel.” Accordingly, we considered this discriminability a representation of “novelty.”

To assess the neural representation of “novelty,” we divided non-recurring and recurring objects into two sets of unequal size (approximately $N = 216 \times 15$ recurrent or “familiar” exemplars vs. $M = 360$ non-recurrent or “novel” exemplars). From the Euclidean distances $d_k = \Vert x_k - c \Vert$ between test set exemplars $x_k$ and centroids $c_{fam} = \frac{1}{N} \sum_{k=1}^{N} x_k$ or $c_{nov} = \frac{1}{M} \sum_{k=1}^{M} x_k$, we obtained “within-class” variance $SS_W = SS_{fam} + SS_{nov}$, where $SS_{fam} = \frac{1}{N+M} \sum_{k=1}^{N} d_{k,fam}^2$ and $SS_{nov} = \frac{1}{N+M} \sum_{k=1}^{M} d_{k,nov}^2$. From distances $d_{fam} = \Vert c_{fam} - c_{tot} \Vert$ and $d_{nov} = \Vert c_{nov} - c_{tot} \Vert$ between class centroids and overall centroid $c_{tot} = \frac{N}{N+M} c_{fam} + \frac{M}{N+M} c_{nov}$, we obtained “between-class” variance $SS_B = SS_{nov}^{fam} = \frac{N}{N+M} d_{fam}^2 + \frac{M}{N+M} d_{nov}^2 = \frac{NM}{(N+M)^2} (c_{fam} - c_{nov})^2$. Finally, from distances $d_k = \Vert x_k - c_{tot} \Vert$ between exemplars and overall centroid, we obtained total variance $SS_T = \frac{1}{N+M} \sum_{k=1}^{N+M} d_k^2$. To quantify the discriminability of non-recurring and recurring objects, we formed the variance ratio $F^{novelty} = SS_B\,(N+M-2)/SS_W$ (Anderson, 2001). Average within-class and between-class dispersion per dimension was obtained from $\sigma_W = SS_W/(N+M-2)$ and $\sigma_B = SS_B$, respectively.
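The closed form of the between-class variance follows from the definition of the overall centroid. As a short check, using only the symbols defined above:

$$
\begin{aligned}
c_{fam} - c_{tot} &= \frac{M}{N+M}\,(c_{fam} - c_{nov}), \qquad
c_{nov} - c_{tot} = -\frac{N}{N+M}\,(c_{fam} - c_{nov}), \\
SS_B &= \frac{N}{N+M}\,d_{fam}^2 + \frac{M}{N+M}\,d_{nov}^2
= \frac{N M^2 + M N^2}{(N+M)^3}\,\Vert c_{fam} - c_{nov} \Vert^2
= \frac{NM}{(N+M)^2}\,\Vert c_{fam} - c_{nov} \Vert^2 .
\end{aligned}
$$

With $N \approx 216 \times 15 = 3{,}240$ and $M = 360$, the prefactor is $NM/(N+M)^2 = 0.09$. Because the two classes are so unequal in size, the between-class variance amounts to only about one eleventh of the squared distance between the two centroids.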

#### 2.5.7 Changes of representation analyzed in “batches”

To assess changes in neural representations over the course of the experiment, while also allowing for cross-validation, we divided all recurring object presentations into five successive “batches” $B_1, B_2, \ldots$, each with $20\%$ of the presentations (Fig. 2D). In this way, we could select “test sets” for cross-validated DLDA from one particular batch, while retaining all other presentations as a “training set.” As every recurrent object was presented $210 \pm 9$ times over all sessions, a batch would comprise $42 \pm 1.8$ presentations, a test set $21 \pm 0.9$, and a training set $189 \pm 8.1$ presentations. To reduce the variance deriving from test set selection, we repeated the random selection $N_r = 20$ times and averaged over repetitions.
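The batch-wise split can be sketched for the presentations of a single recurring object. The test fraction of one half of a batch is inferred from the quoted sizes (21 of 42 presentations) and is an assumption, as is the function name.

```python
import random

def batch_split(n_presentations, n_batches=5, test_batch=0, test_fraction=0.5, seed=0):
    """Draw a test set from one batch and keep all other presentations for training.
    Returns (train_indices, test_indices) in temporal order."""
    rng = random.Random(seed)
    indices = list(range(n_presentations))
    edges = [round(b * n_presentations / n_batches) for b in range(n_batches + 1)]
    batch = indices[edges[test_batch]:edges[test_batch + 1]]
    test = sorted(rng.sample(batch, round(test_fraction * len(batch))))
    test_lookup = set(test)
    train = [i for i in indices if i not in test_lookup]
    return train, test

# 210 presentations, test set drawn from the third batch.
train, test = batch_split(210, test_batch=2)
print(len(train), len(test))
```

Repeating the call with different `seed` values corresponds to the $N_r = 20$ random selections over which results were averaged.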

To quantify representational changes over the course of learning, we computed the variance ratios $F_{m,w,u}^{\mathrm{identity}}$ for each temporal window or batch $m$, identity-selective parcel $w$, and data sets $u\in\{1,\ldots,16\}$. We formed the average ratio over 16 data sets, $F_{m,w}^{\mathrm{identity}}=\langle F_{m,w,u}^{\mathrm{identity}}\rangle_u$, and assessed statistical significance by shuffling ($10^3$ permutations) the identity of recurring objects to obtain the distribution of variance ratios due to chance or data structure. The mean $\mu_{m,w}$ and variance $\sigma_{m,w}^2$ of this distribution could also be used to convert $F_{m,w}^{\mathrm{identity}}$ into z-score values $Z_{m,w}^{\mathrm{identity}}=(F_{m,w}^{\mathrm{identity}}-\mu_{m,w})/\sigma_{m,w}$.
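
The shuffle-based z-score can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: it operates on a single synthetic data set rather than an average over 16 data sets, omits the DLDA projection and cross-validation, and uses a degrees-of-freedom-scaled variance ratio. The function names and the empirical p-value are additions made for illustration.

```python
import numpy as np

def identity_variance_ratio(x, labels):
    """Between/within-class variance ratio, scaled by degrees of freedom."""
    classes = np.unique(labels)
    n, kappa = len(x), len(classes)
    c_all = x.mean(axis=0)
    ss_within, ss_between = 0.0, 0.0
    for c in classes:
        xc = x[labels == c]
        centroid = xc.mean(axis=0)
        ss_within += np.sum((xc - centroid) ** 2)
        ss_between += len(xc) * np.sum((centroid - c_all) ** 2)
    return ss_between * (n - kappa) / (ss_within * (kappa - 1))

def permutation_z(x, labels, n_perm=1000, seed=0):
    """Z-score of the observed ratio against a label-shuffled null distribution."""
    rng = np.random.default_rng(seed)
    observed = identity_variance_ratio(x, labels)
    null = np.array([identity_variance_ratio(x, rng.permutation(labels))
                     for _ in range(n_perm)])
    z = (observed - null.mean()) / null.std()
    # Empirical one-sided p-value (an addition, not described in the text).
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return z, p

# Synthetic example: 15 classes, 20 patterns each, 14 dimensions.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(15), 20)
means = rng.normal(size=(15, 14))
x = means[labels] + rng.normal(size=(300, 14))
z, p = permutation_z(x, labels, n_perm=200)
```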

Additionally, we performed a regression analysis and quantified representational changes in terms of linear trends. Specifically, we determined a “rate” parameter $\beta_w^{\mathrm{identity}}$ by fitting a linear mixed-model $F_{m,w,u}^{\mathrm{identity}}=\beta_{0,w}+\beta_w^{\mathrm{identity}}m+\xi_{0,w,u}+\xi_{1,w,u}m+\epsilon_{m,w,u}$ with data sets $u$ as the grouping variable, where $\beta_{0,w}$ was a fixed-effect coefficient, $\xi_{0,w,u}$ and $\xi_{1,w,u}$ were random effect coefficients, and $\epsilon_{m,w,u}$ was residual error.

Similarly, to assess whether neural representations of non-recurring objects change with learning, we divided all object presentations (recurring and non-recurring) into five successive “batches” $B_1,B_2,\ldots$, each with $20\%$ of the presentations (Fig. 2D), to obtain variance ratios $F_{m,w,u}^{\mathrm{novelty}}$ for each temporal window or batch $m$, identity-selective parcel $w$, and data sets $u\in\{1,\ldots,16\}$. After averaging over 16 data sets, $F_{m,w}^{\mathrm{novelty}}=\langle F_{m,w,u}^{\mathrm{novelty}}\rangle_u$, we assessed statistical significance by shuffling ($10^3$ permutations) the identity of recurring and non-recurring objects to obtain the distribution of variance ratios due to chance or data structure. The mean $\mu_{m,w}$ and variance $\sigma_{m,w}^2$ of this distribution were used to convert $F_{m,w}^{\mathrm{novelty}}$ into z-score values $Z_{m,w}^{\mathrm{novelty}}=(F_{m,w}^{\mathrm{novelty}}-\mu_{m,w})/\sigma_{m,w}$.

Additionally, we performed a regression analysis to establish linear trends. Changes in the representation of object “novelty” were assessed by fitting the “rate” parameter $\beta_w^{\mathrm{novelty}}$ in a linear mixed-model $F_{m,w,u}^{\mathrm{novelty}}=\beta_{0,w}+\beta_w^{\mathrm{novelty}}m+\xi_{0,w,u}+\xi_{1,w,u}m+\epsilon_{m,w,u}$, with data sets $u$ as the grouping variable, where $\beta_{0,w}$ was a fixed-effect coefficient, $\xi_{0,w,u}$ and $\xi_{1,w,u}$ were random effect coefficients, and $\epsilon_{m,w,u}$ was a residual error.

To establish linear trends $F_m=\langle F_{m,w,u}\rangle_{w,u}$ (of either identity or novelty) that average over both parcels $w$ and data sets $u$, we obtained a rate parameter $\beta_1$ by fitting a linear mixed-model $F_{m,w,u}=\beta_0+\beta_1 m+\xi_{0,w,u}+\xi_{1,w,u}m+\epsilon_{m,w,u}$ with both parcels and data sets as grouping variables.

#### 2.5.8 Geometry of representations

In the cross-validated analyses described above, subspaces $S$ differed slightly between different batches (and training sets). To analyze the geometry of neural representations in a stable framework, we repeated some analyses in fixed subspaces $S$ that reflected all observations (i.e., all recurring activity patterns $x_k$). In the fixed subspace, we calculated the normalized amplitude $a_k=\Vert x_k\Vert/\sqrt{\kappa-1}=\sqrt{\sum_{j=1}^{\kappa-1}x_{jk}^2/(\kappa-1)}$ of individual patterns $k$ and the normalized pairwise distance $d_{kl}=\Vert x_k-x_l\Vert/\sqrt{\kappa-1}=\sqrt{\sum_{j=1}^{\kappa-1}(x_{jk}-x_{jl})^2/(\kappa-1)}$ between two patterns $k$ and $l$.

For each parcel $w$, data set $u$, and run $r$, we obtained the average amplitude $A_{w,u,r}^{\mathrm{tot}}=\frac{1}{N+M}\sum_{k=1}^{N+M}a_k$ of all patterns, the average amplitude $A_{w,u,r}^{\mathrm{fam}}=\frac{1}{N}\sum_{k=1}^{N}a_k$ of *recurring* patterns, and the average amplitude $A_{w,u,r}^{\mathrm{nov}}=\frac{1}{M}\sum_{k=1}^{M}a_k$ of *non-recurring* patterns. Similarly, we obtained the average pairwise distance $D_{w,u,r}^{\mathrm{tot}}=\frac{2}{(N+M)(N+M-1)}\sum_{k=1}^{N+M}\sum_{l=k}^{N+M}d_{kl}$ between all patterns, the average distance $D_{w,u,r}^{\mathrm{nov}}=\frac{2}{M(M-1)}\sum_{k=1}^{M}\sum_{l=k}^{M}d_{kl}$ between non-recurring patterns, the average distance $D_{w,u,r}^{\mathrm{fam}}=\frac{2}{N(N-1)}\sum_{k=1}^{N}\sum_{l=k}^{N}d_{kl}$ between recurring patterns, and the average distance $D_{w,u,r}^{\mathrm{novfam}}=\frac{1}{MN}\sum_{k=1}^{M}\sum_{l=1}^{N}d_{kl}$ between pairs comprising one recurring and one non-recurring pattern. For recurring patterns, we further obtained the average distance $D_{w,u,r}^{\mathrm{same}}=\frac{2}{N(N/\kappa-1)}\sum_{i=1}^{\kappa}\sum_{k=1}^{n_i}\sum_{l=k}^{n_i}d_{kl}$ between pairs of recurring patterns in the *same* class and the average distance $D_{w,u,r}^{\mathrm{diff}}=\frac{1}{N(N-N/\kappa)}\sum_{i=1}^{\kappa}\sum_{k=1}^{n_i}\sum_{l=1}^{N-n_i}d_{kl}$ between pairs in *different* classes. All distances were corrected for the temporal auto-correlation by subtracting the time course of $T_w(i,j)$, as described above.
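
As an illustration of the same-class and different-class averages, the following sketch computes normalized pairwise distances and averages them over the relevant pairs. It assumes that distances are normalized by the square root of the subspace dimensionality, and it omits the correction for temporal auto-correlation. The function names are hypothetical.

```python
import numpy as np

def pairwise_distances(x):
    """Normalized Euclidean distances between rows of x, shape (n, dims)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.linalg.norm(diff, axis=2) / np.sqrt(x.shape[1])

def same_diff_distances(x, labels):
    """Average distance between distinct patterns of the same / different classes."""
    d = pairwise_distances(x)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(x), dtype=bool)   # exclude each pattern paired with itself
    d_same = d[same & off_diag].mean()
    d_diff = d[~same].mean()
    return d_same, d_diff

# Tiny deterministic example: two classes of two identical patterns each.
x = np.array([[0.0, 0.0], [0.0, 0.0], [2.0, 2.0], [2.0, 2.0]])
labels = np.array([0, 0, 1, 1])
d_same, d_diff = same_diff_distances(x, labels)
```

Averaging over a boolean mask of pairs is equivalent to the explicit normalization factors when all classes contain the same number of patterns.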

As described further above, the distances between individual activity patterns and different centroids (such as $c_{\mathrm{tot}}$, $c_{\mathrm{nov}}$, and $c_{\mathrm{fam}}$) yielded total variance $SS_T=SS_{\mathrm{tot}}$, within-class variance $SS_W=SS_{\mathrm{fam}}+SS_{\mathrm{nov}}$, and between-class variance $SS_B=SS_{\mathrm{novfam}}$. For recurring patterns, distances to individual class centroids $c_i$ and overall centroid $c_{\mathrm{fam}}$ yielded total variance $SS_T=SS_{\mathrm{fam}}$, within-class variance $SS_W=SS_{\mathrm{same}}$, and between-class variance $SS_B=SS_{\mathrm{diff}}$.

These values were computed for each parcel $w$, observer $u$, and run $r$, in order to obtain variance fractions $F_{w,u,r}^{\mathrm{fam}}=SS_{\mathrm{fam}}/SS_{\mathrm{tot}}$, $F_{w,u,r}^{\mathrm{nov}}=SS_{\mathrm{nov}}/SS_{\mathrm{tot}}$, $F_{w,u,r}^{\mathrm{novfam}}=SS_{\mathrm{novfam}}/SS_{\mathrm{tot}}$, $F_{w,u,r}^{\mathrm{same}}=SS_{\mathrm{same}}/SS_{\mathrm{fam}}$, and $F_{w,u,r}^{\mathrm{diff}}=SS_{\mathrm{diff}}/SS_{\mathrm{fam}}$, as well as variance ratios $R_{w,u,r}^{\mathrm{identity}}=SS_{\mathrm{diff}}(N-\kappa)/\left(SS_{\mathrm{same}}(\kappa-1)\right)$ and $R_{w,u,r}^{\mathrm{novelty}}=SS_{\mathrm{novfam}}(N+M-2)/(SS_{\mathrm{nov}}+SS_{\mathrm{fam}})$.

#### 2.5.9 Changes with learning analyzed by “runs”

Fixed subspaces permitted us to assess representational changes between successive “runs.” To this end, we computed average amplitudes $A_{w,u,r}$, distances $D_{w,u,r}$, variances $SS_{w,u,r}$, and variance ratios $F_{w,u,r}$, as described above, for each parcel $w$, data set $u\in\{1,\ldots,16\}$, and run $r$. Within each session $s$, we assessed the changes of these parameters $Y\in\{A,D,SS,F\}$ over runs $r'\in s$ by determining a “rate” parameter $\beta_s$ for identity-selective $w$ and non-selective parcels $w'$. Each $\beta_s$ coefficient was acquired from a linear mixed-model $Y_{r',w,u}=\beta_{0,s}+\beta_s r'+\xi_{0,w,u}+\xi_{1,w,u}r'+\epsilon_{r',w,u}$ with observers and parcels as grouping variables, where $\beta_{0,s}$ was a fixed-effect coefficient, $\xi_{0,w,u}$ and $\xi_{1,w,u}$ were random effect coefficients, and $\epsilon$ was residual error. The same approach was used to assess gradual changes over runs in the centroid-to-centroid distances $D^{\mathrm{same}}(r)$, $\Delta D^{\mathrm{same}}(r)$, $D^{\mathrm{nov}}(r)$, and $\Delta D^{\mathrm{nov}}(r)$. This served to test the statistical significance of linear rates $\beta_s$ in each session. Sessions with significant rates are marked by stars in Figure 6.

#### 2.5.10 Stability of shape identity and novelty representations

We also assessed the stability of the representation of the 16 response classes (15 recurring and 1 non-recurring) over the course of the experiment. To this end, we compared the average representation in individual runs $r$ (centroids $C_r$ of responses to exemplars) to the average representation over all runs (centroids $C_{\mathrm{ave}}$). For observer $u$, identity-selective parcel $w$, and object class $i$, we calculated the Euclidean distance $D_{u,w,i,r}$ between the relevant $C_r$ and $C_{\mathrm{ave}}$, and also the difference $\Delta D_{u,w,i,r}$ between the relevant centroids from successive runs, $C_r$ and $C_{r+1}$. After averaging over observers $u$, identity-selective parcels $w$, and object classes $i$, we obtained $D^{\mathrm{same}}(r)$ and $\Delta D^{\mathrm{same}}(r)$ for recurring objects and $D^{\mathrm{nov}}(r)$ and $\Delta D^{\mathrm{nov}}(r)$ for non-recurring objects.

As a baseline for comparison, we also computed the distances $D_{u,w,i,r}$ and differences $\Delta D_{u,w,i,r}$ that may be expected purely on the basis of response variance. To this end, we permuted the sequence of all $3{,}600$ trials, separately within each of the 16 response classes (15 recurring and 1 non-recurring) such as to obtain 18 “pseudo-runs” with $200$ trials each. Expectation values were obtained by repeating this $N_r=1{,}000$ times.
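
The pseudo-run baseline can be sketched as below. The sketch assumes that shuffling is implemented by permuting run assignments within each response class (so that every pseudo-run keeps 200 trials and the original class composition) and that the average centroid is the mean over all trials of a class. The function names, the synthetic data, and the trial layout (12 presentations per recurring object and 20 non-recurring objects per run) are illustrative assumptions.

```python
import numpy as np

def pseudo_run_labels(class_labels, run_labels, rng):
    """Shuffle run assignments separately within each response class."""
    pseudo = run_labels.copy()
    for c in np.unique(class_labels):
        idx = np.flatnonzero(class_labels == c)
        pseudo[idx] = rng.permutation(run_labels[idx])
    return pseudo

def run_centroid_distances(x, class_labels, run_labels, cls):
    """Distances between per-run centroids and the average centroid of one class."""
    in_class = class_labels == cls
    c_ave = x[in_class].mean(axis=0)
    runs = np.unique(run_labels)
    return np.array([
        np.linalg.norm(x[in_class & (run_labels == r)].mean(axis=0) - c_ave)
        for r in runs
    ])

# Synthetic layout: 18 runs of 200 trials, 15 recurring classes plus class 15 (novel).
rng = np.random.default_rng(0)
run_labels = np.repeat(np.arange(18), 200)
one_run = np.concatenate([np.repeat(np.arange(15), 12), np.full(20, 15)])
class_labels = np.tile(one_run, 18)
x = rng.normal(size=(3600, 14))

pseudo = pseudo_run_labels(class_labels, run_labels, rng)
d_true = run_centroid_distances(x, class_labels, run_labels, cls=0)
d_pseudo = run_centroid_distances(x, class_labels, pseudo, cls=0)
```

Repeating the shuffle many times and averaging `d_pseudo` would give the expectation values against which the distances of true runs are compared.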

We note that, in an $n$-dimensional hypersphere of unit radius, the average Euclidean distance between two random points is

with $d_{\mathrm{ave}}\approx 1.4017$ for $n=14$.

#### 2.5.11 Dimensional reduction

To visualize representational geometry in two dimensions, we randomly sampled $50$ response patterns to each of the recurring and non-recurring objects within the first and the last sessions and calculated a $1{,}600\times 1{,}600$ pair-wise distance matrix ($D_{w,u}$) for each identity-selective parcel $w$ and subject $u$. We did not wish to average distance matrices over observers, as we did not expect the activity patterns of different observers to be comparable. To sidestep this difficulty, we permuted the order of recurring objects $100$ times and for each subject obtained an average matrix $\bar{D}$ over permutations, which was then averaged over subjects. To visualize the representational geometry of identity in the first and the last session, we used multidimensional scaling (Matlab function *mdscale*, metric stress) to map the distance matrices for recurring objects ($50$ exemplars from the first session and $50$ exemplars from the last session) into a two-dimensional space. To visualize the representational geometry of novelty, we restricted the distance matrix to non-recurring objects ($50$ exemplars from the first session and $50$ exemplars from the last session) and just 3 of the 15 recurring objects ($20$ exemplars from either session).

## 3 Results

Observers viewed sequences of computer-generated objects, with each object shown for $2.5\,\mathrm{s}$ while rotating in three dimensions (Fig. 1A, B, a movie may be viewed HERE). Over three sessions, observers viewed 3,600 objects in total, of which 3,240 were presentations of *recurring* objects (15 different objects, each appearing approximately 216 times) and 360 were presentations of *non-recurring* objects (360 different objects, each appearing once). The display was intended to be sufficiently intriguing to remain interesting over 3 successive days. To this end, presentations never repeated exactly. Observers were required to classify each object as “familiar” (recurring) or “novel” (non-recurring). The task performance improved as observers became increasingly familiar with recurring objects, as illustrated in Figure 1C. Over the first 600 presentations, classification performance improved approximately from $50\%$ correct (chance) to $85\%$ correct, and reaction times decreased approximately from $1.65\,\mathrm{s}$ to $1.25\,\mathrm{s}$. Over the remaining 3,000 presentations, performance improved further to approximately $90\%$ correct and reaction times decreased further to approximately $0.95\,\mathrm{s}$. After three sessions, all observers were “familiar” with all recurring objects and could pick them out from an array of distractor objects.

All sessions were performed in an MRI scanner while whole-brain functional imaging data were being collected. In the following, we report the results of three types of analyses. First, we describe the cortical areas in which multivariate BOLD activity encodes information about the identity of recurring objects (“object identity”), as determined by cross-validated analyses of entire data sets (3 sessions per observer). Second, we describe changes in cortical representations over coarse time intervals, by means of cross-validated analyses of successive parts of the data sets (3 sessions divided into 5 batches). These changes pertain to the encoding of both recurring objects and the distinction between recurring and non-recurring objects (“object novelty”). Third, we describe changes in representations over finer time intervals (3 sessions divided into 18 runs), by foregoing cross-validation and adopting a fixed reference frame. These finer intervals confirm the results from coarse intervals but reveal more details about the geometry of neural representations and their development over time.

### 3.1 Cross-validated representation of object identity

To assess the extent to which multivariate neural responses to recurring objects encoded object identity, we relied on optimal linear classifiers combined with cross-validation (“direct linear discriminant analysis,” DLDA, see Methods for details). Specifically, we quantified the “identity” information in multivariate responses of every parcel $w\in\{1,\ldots,758\}$ and data set $u\in\{1,\ldots,16\}$ in terms of classification accuracy $\alpha_{w,u}$, average pairwise dissimilarity $\delta_{w,u}$, and the ratio of between-class and within-class variance $F_{w,u}$. All three measures proved highly correlated and supported similar conclusions. For example, Figure 3B illustrates the correlation of classification accuracy $\alpha_{w,u}$ and variance ratio $F_{w,u}$ ($\rho=0.94$, $p<0.001$). The correlations of $\alpha_{w,u}$ and $\delta_{w,u}$ ($\rho=0.95$, $p<0.001$), and of $\delta_{w,u}$ and $F_{w,u}$ ($\rho=0.98$, $p<0.001$) were comparably strong. The results of individual observers from the two experimental conditions (structured and unstructured object sequences) were highly similar as well, demonstrating test-retest consistency (Supplementary Fig. S2).

For most parcels, the results from different observers showed considerable variability. Whereas a few parcels exhibited significant accuracy $\alpha_{w,u}$ and variance ratio $F_{w,u}$ in all data sets (e.g., Calcarine 331), in many parcels the representation of object identity was significant only in some data sets (e.g., Parahippocampus 325) (Fig. 3B). Global significance was assessed by comparing the *minimal* accuracy or variance ratio over the 8 data sets from one condition (structured or unstructured) to the minimal values obtained with shuffled data (red ellipse in Fig. 3C, see Methods for details).

Minimal classification accuracy $\alpha_w$ was significant in 17% of all parcels (128 of 758 parcels) in the structured sequence condition and in 19% of parcels (146 of 758) in the unstructured condition ($p<0.05$, corrected for multiple comparisons), when compared to null-distributions obtained from shuffled object identities. For minimal variance ratios $F_{w,u}$, the corresponding values were 18% and 17%, respectively (136 and 130 parcels). To combine the results from both conditions, we used a “prevalence” analysis to determine parcels in which “identity” was represented significantly in a majority of all 16 data sets (prevalence $\gamma\geq 0.5$), once again comparing the observed minimal values to the minimal values obtained with shuffled data (red ellipse in Fig. 3C, see Methods for details).

Figure 3A illustrates the 124 parcels identified as significantly “identity-selective” by the prevalence criterion $\gamma\geq 0.5$ and Supplementary Figure S3 shows the same information in terms of a sliced brain. Among these were $70$ parcels in the occipital cortex, $29$ in the parietal cortex, $18$ in the fusiform or temporal cortex, and $7$ in the frontal cortex. The average prevalence of identity-selectivity in these parcels was $0.663\pm 0.016$ (mean and S.D.), and the minimal value was $0.58$. As the prevalence criterion (based on 16 data sets) was marginally more conservative than the accuracy criterion (based on 8 data sets), 120 of the 124 parcels were significantly “identity-selective” in terms of both criteria. The four exceptions (identified only by prevalence, but not by accuracy) were Frontal-superior-R 56, Occipital-superior-R 393, Occipital-middle-L 403, and Parietal-superior-R 510. Appendix Table A1 lists the statistical significance of all three criteria for all “identity-selective” parcels.

Overall, there was a pronounced posterior-anterior gradient. Whereas many parcels at the posterior pole of the brain exhibited high classification accuracy, this tended to progressively decrease at more anterior locations (Fig. 3A; Supplementary Fig. S3; Appendix Table A1). To formalize this trend, we assigned 66 of the 124 identity-selective parcels to the 25 topographic visual areas defined by Wang et al. (2015) and, additionally, to the anterior inferior temporal cortex (AIT) and to the inferior frontal cortex (IFC). Supplementary Figure S6 provides an overview of all topographically assigned and non-assigned parcels selective for identity. As illustrated in Figure 8A, this assignment showed that accuracy was comparable in early visual areas (V1-hV4) and in the posterior-ventrolateral regions of the temporal lobe, whereas accuracy was lower in the anterior temporal cortex, the inferior frontal cortex, and in parietal cortical areas.

### 3.2 Cross-validated changes with learning

To assess changes with learning, we separately analyzed five successive and non-overlapping sets of trials (“batches”) with linear classifiers and cross-validation (see Methods for details). Specifically, we established ratios of between- and within-class variance for both object identity (15 classes formed by responses to 15 recurring objects) and for object novelty (2 classes formed by responses to recurring and non-recurring objects, respectively). These two variance ratios measured the neural representation of “identity” and “novelty.”

Variance ratios were converted to z-score values (with respect to the mean and variance of the corresponding shuffle distribution) before being averaged over data sets and/or over parcels. Figure 4A summarizes the results in terms of a grand average over all identity selective parcels. The average identity and novelty ratios were highly significant in all batches ($p<0.001$). Over successive batches, the average identity ratio weakened slightly but significantly ($p<0.05$), whereas the average novelty ratio strengthened considerably, especially between batches $m=1$ and $m=2$ ($p<0.001$).

As expected, it was the between-class variances $SS_B^{\mathrm{identity}}$ and $SS_B^{\mathrm{novelty}}$ that changed significantly over successive batches $m$ ($p<0.05$ and $p<0.001$, respectively), whereas the within-class variances $SS_W^{\mathrm{identity}}$ and $SS_W^{\mathrm{novelty}}$ remained essentially the same (n.s.), as illustrated by Figure 4B. This was owing to the DLDA algorithm, which maintained within-class variance near unity. Nevertheless, over successive batches, the neural representations of recurring objects tended to become slightly more similar to each other, but more dissimilar to the representations of non-recurring objects.

To ascertain that these overall trends also hold true for individual parcels, we carried out more conventional regression analyses of variance ratios $F_{m,w,u}^{\mathrm{identity}}$ and $F_{m,w,u}^{\mathrm{novelty}}$ over batches $m$, parcels $w$, and data sets $u$. Specifically, we fitted linear mixed-models in order to estimate “rate” parameters $\beta_w^{\mathrm{identity}}$ and $\beta_w^{\mathrm{novelty}}$ for each identity-selective parcel $w$. The results revealed negative rates $\beta_w^{\mathrm{identity}}$ and positive rates $\beta_w^{\mathrm{novelty}}$ for almost all parcels, confirming the overall trends in Figure 4C. The variability over parcels was numerically larger for $\beta_w^{\mathrm{novelty}}$ ($0.15\pm 0.1$, mean and S.D.) than for $\beta_w^{\mathrm{identity}}$ ($0.022\pm 0.015$), with both rates weakly correlated ($\rho=0.30$, $p<0.001$). Classification accuracy $\alpha_w^{\mathrm{identity}}$ correlated negatively with $\beta_w^{\mathrm{novelty}}$ ($\rho=-0.22$, $p<0.05$) and with $\beta_w^{\mathrm{identity}}$ ($\rho=-0.74$, $p<0.001$).

To take a closer look at the interaction between “novelty” and “identity,” we divided the identity-selective parcels into “novelty terciles” (high, medium, and low, defined by $\beta^{\mathrm{novelty}}$) before comparing representations of novelty ($F^{\mathrm{novelty}}$) and identity (accuracy $\alpha$) (Fig. 5B). The results differed substantially between batches and terciles. In early batches, $F^{\mathrm{novelty}}$ and $\alpha$ correlated for all terciles, suggesting that initially the representations of non-recurring and recurring objects were linked. However, in successively later batches, this correlation waned in the upper tercile. This may suggest that pronounced representations of non-recurrent objects progressively detached from representations of recurrent objects.

Figure 5A illustrates the degree to which identity-selective parcels express the overall novelty trend, as quantified by fitted rate $\beta_w^{\mathrm{novelty}}$, and Supplementary Figure S4 shows the same information in terms of brain slices. An anterior-posterior gradient is evident, with a more pronounced representation of novelty at anterior than at posterior locations. This gradient is also apparent when parcels are assigned to topographic visual areas, as illustrated in Figure 8B. Appendix Table A1 lists the rates $\beta_w^{\mathrm{novelty}}$ for all identity-selective parcels.

### 3.3 Geometry of identity and novelty representations

Next, we present results from alternative analyses relying on fixed subspaces $S$ for each data set (3,600 trials). Fixed subspaces reveal a more detailed geometry of neural representations and allow any changes in this geometry to be tracked over successive runs (200 trials each). The disadvantage of this approach is that it precludes cross-validation. Our aim was to establish not just between- and within-class variances, but also the distances underlying the variances, and the response amplitudes underlying the distances. For the representation of object “identity,” the within- and between-class geometry was defined by response pairs to *same* and to *different* recurring objects, respectively. For the representation of object “novelty,” the within-class geometry reflected responses either to pairs of *familiar* (recurring) or to pairs of *novel* (non-recurring) objects, whereas the between-class geometry concerned responses to mixed pairs of objects (*novel-familiar*).

We analyzed multivariate responses in terms of variances, distances, and amplitudes and averaged the results over all data sets and all 124 identity-selective parcels, to obtain separate mean values (and standard errors) for each of the 18 successive runs. Additionally, we averaged the results over the remaining 634 (non-identity-selective) parcels of the brain. We hoped that this would help distinguish more general effects and trends (e.g., habituation, attention, alertness) from learning-related changes in shape representations. All distances in these analyses were residual distances, to minimize the influence of temporal auto-correlations (Supplementary Fig. S1; see Methods for details).

The analyzed quantities (response amplitudes $A$, response distances $D$, and variances $SS$) are illustrated schematically in Figure 6A, and the results are presented in Figure 6B–D in terms of the mean values and standard errors for every run. In identity-selective parcels, response amplitudes $A^{\mathrm{fam}}$ to recurring patterns decreased during the first session (runs 1 to 6, $p<0.05$), but not in the second and third session (runs 7 to 12, runs 13 to 18, $p>0.5$). Response amplitudes $A^{\mathrm{nov}}$ to non-recurring patterns showed no significant change ($p$ n.s.) in any session (Fig. 6B). In non-selective parcels, response amplitudes decreased in all sessions, consistent with general habituation. In identity-selective parcels, response distances $D^{\mathrm{diff}}$ between different recurring objects declined similarly during the first session ($p<0.05$), but not during subsequent sessions ($p>0.6$) (Fig. 6C). Also, response distances $D^{\mathrm{same}}$ between the same recurring objects did not change significantly during any session ($p$ n.s.). In contrast, response distances $D^{\mathrm{nov}}$ between non-recurring objects declined disproportionately during the first session ($p<0.05$) but increased during the third session ($p<0.05$). Response distances $D^{\mathrm{novfam}}$ between recurring and non-recurring objects, on the other hand, did not change significantly over sessions ($p$ n.s.).

A first conclusion is that response amplitudes and response distances are consistently larger for recurring objects (blue traces in Fig. 6B, C) than for non-recurring objects (red traces). Importantly, in the very first run, response distances are comparable between different recurring objects ($D^{\mathrm{diff}}$) and different non-recurring objects ($D^{\mathrm{nov}}$), demonstrating that both recurring and non-recurring objects were represented comparably well. Over subsequent runs, response distances decrease far more between different non-recurring objects ($D^{\mathrm{nov}}$) than different recurring objects ($D^{\mathrm{diff}}$), demonstrating that a comparative advantage for *recurring* objects develops gradually (i.e., a kind of repetition enhancement). A second conclusion is that the observed development differs between identity-selective and non-selective parcels. Whereas amplitudes and distances stabilize in the former group of parcels, they habituate progressively in the latter group (both within and between sessions). Thus, the responsiveness of identity-selective parcels remains stable over sessions. A third conclusion is that response distances $D^{\mathrm{nov}}$ between different non-recurring objects become comparatively small (already during the first session), not only smaller than the distances $D^{\mathrm{diff}}$ between *different* recurring objects but even smaller than the distances $D^{\mathrm{same}}$ between the *same* recurring objects.

The results for response variances confirmed the trends observed earlier in the batch analysis of cross-validated variance ratios (Fig. 4A, B). Between-class variance $SS_{\mathrm{diff}}$ for recurring objects declined over the course of sessions ($p<0.005$), whereas between-class variance $SS_{\mathrm{novfam}}$ for non-recurring objects increased over the first session ($p<0.005$), only to decline again during the third session ($p<0.05$). Within-class variances $SS_{\mathrm{same}}$ and $SS_{\mathrm{nov}}$ remained largely unchanged. The close correspondence between the trends observed over runs and over batches is illustrated also in Supplementary Figure S5. Surprisingly, non-identity-selective parcels mirrored the trends observed for identity-selective parcels in attenuated form. The fact that between- and within-class variances differ systematically suggests that even non-identity-selective parcels represent object identity to some degree.

It is natural to compare these results to the time-course of behavioral performance (fraction correct and reaction time) in our observers (Fig. 1C, D). The changes in the representation of *recurring* objects (between-class distances $D^{\mathrm{diff}}$ and variances $SS_{\mathrm{diff}}$) show a gradual *decrease* in the quality of representation and thus do *not* correspond to improving performance in terms of fraction correct. However, the changes in the representation of *non-recurring* objects, including the decrease of within-class distances $D^{\mathrm{nov}}$ and variances $SS_{\mathrm{nov}}$ and the increase of between-class variances $SS_{\mathrm{novfam}}$ and variance ratio $R^{\mathrm{novelty}}$, do correspond to the rapid improvement in fraction correct over the first few runs. Thus, the neural changes over the course of learning point to diverging representations of “novel” (non-recurring) and “familiar” (recurring) objects.

### 3.4 Stability of identity and novelty representations

Relying on fixed subspaces $S$ to analyze each data set also permitted us to assess the *stability* of neural representations over successive runs. With this in mind, we established the centroids of response classes for each run and examined the displacement of centroids between successive runs. As this calculation concerned centroid-to-centroid distances (rather than exemplar-to-exemplar distances), we could not correct for temporal auto-correlations.

The computation of centroids for particular response classes is illustrated schematically in Figure 7A. Given the centroids $C_{r-1}$ and $C_r$ for successive runs $r-1$ and $r$ and the average centroid $C_{\mathrm{ave}}$ over all runs, we computed absolute centroid-to-centroid distances $DC_r=\Vert C_r-C_{\mathrm{ave}}\Vert$ as well as relative centroid-to-centroid distances $\Delta DC_r=\Vert C_r-C_{r-1}\Vert$. The 16 response classes were formed by each recurring object (15 classes, $DC^{\mathrm{same}}$ and $\Delta DC^{\mathrm{same}}$) and by the non-recurring objects (1 class, $DC^{\mathrm{nov}}$ and $\Delta DC^{\mathrm{nov}}$). For comparison with the displacements expected from sampling noise, we also computed the centroid-to-centroid distances after permuting the responses in each class and regrouping them into 18 “pseudo-runs” (see Methods for details).

The results are shown in Figure 7B. For both recurring and non-recurring objects, average absolute distances $DC^{\mathrm{same}}(r)$ and $DC^{\mathrm{nov}}(r)$ diminished during the first session (runs 1 to 6, $p<0.005$), but remained stable during the second and third sessions (runs 7 to 12, and 13 to 18, $p>0.2$). Notably, absolute distances $DC^{\mathrm{nov}}(r)$ of novel objects decreased to a much lower average level. Relative distances $\Delta DC^{\mathrm{same}}(r)$ and $\Delta DC^{\mathrm{nov}}(r)$ between successive runs declined during the first session (runs 1 to 6, $p<0.05$), remained stable during the second session (runs 7 to 12, $p>0.2$), only to decline once again in the last session (runs 13 to 18, $p<0.005$ for recurring and $p<0.05$ for non-recurring objects). Absolute distances were far larger for recurring than for non-recurring classes, corroborating the substantial “response enhancement” already noted above. Both absolute and relative distances were slightly smaller than predicted by sampling noise (thin, pale lines, $p<0.001$), demonstrating that responses of true runs were distributed slightly more compactly and consistently than those of pseudo-runs. Note also that relative distances approached the values expected for fully random displacements in a 14-dimensional hypersphere (specifically, relative distances $\Delta DC$ were approximately $1.4$ times larger than absolute distances $DC$), again underlining the dominant influence of sampling noise.

## 4 Discussion

We studied the cortical representation of synthetic visual objects over multiple days of repeated viewing, while observers learned to classify initially unfamiliar objects as “familiar.” Relying on “representational similarity analysis” (RSA), we established distances between spatiotemporal hemodynamic (BOLD) responses to exemplars of different *recurring* objects, as well as to exemplars of *non-recurring* objects. Response distances between the same and different recurring objects quantified the neural representation of object *identity*. Response distances between recurring and non-recurring objects measured the neural representation of object *novelty*. The results showed that object identity was neurally represented from the start, in the ventral occipitotemporal cortex and beyond. With growing familiarity, the quality of this neural representation remained high, but its geometry expanded to fill the available representational space. In contrast, the neural representation of non-recurring objects (which remained “novel” by definition) improved over time, but its geometry contracted and shifted to the margins of the representational space.

### 4.1 Cortical representation of object identity

To permit a fine-grained analysis of representational geometry, we generated complex and three-dimensional shapes that were highly characteristic and distinguishable and presented these shapes from various points of view and in various states of rotation (always for one complete turn) (Kakaei et al., 2021). Thus, observers had to recognize an object from all sides in order to classify it as “familiar.” Within the category of our synthetic shapes, every recurring object constituted strictly speaking an “exemplar,” with individual presentations providing different “instantiations.” However, we chose to term objects “classes” and individual presentations “exemplars,” as this terminology conforms better to RSA conventions.

The selectivity of cortical parcels for object identity was assessed in optimized 14-dimensional subspaces $S$ of the much higher-dimensional space of multivariate responses ($O(10^3)$ dimensions). Specifically, we computed a cross-validated “classification accuracy” (Kriegeskorte, Mur, & Bandettini, 2008) and used a prevalence analysis to combine results from different conditions and observers (Allefeld et al., 2016). Essentially identical results were obtained with alternative measures such as “linear discriminability” and “variance ratio” (of between- and within-class variance; Anderson, 2001). When spatiotemporal responses to different objects are linearly discriminable, they form a neural representation of object identity. As exemplars of each object were presented from various sides, any such neural representation was by definition view-invariant. The obvious caveats are *(i)* that object rotation may have exposed the same characteristic features in many or most presentations and *(ii)* that multivariate hemodynamic responses over 9 s can only distantly reflect the neuronal activity evoked during each 2.5 s presentation. Nevertheless, hemodynamic signals exhibited significant invariance to the various modes of presentation of a given object (e.g., the initial perspective, the axis, and the sense of rotation).
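To make the “variance ratio” concrete, the following is a minimal sketch that divides the between-class sum of squares by the within-class sum of squares for a set of labeled response vectors. The function name `variance_ratio`, the toy data, and the choice of unnormalized sums of squares are assumptions made for illustration; the exact normalization used in the analysis is not specified here.

```python
from statistics import fmean


def variance_ratio(responses, labels):
    """Between-class over within-class sum of squares, summed over dimensions.

    `responses` is a list of equal-length numeric vectors and `labels` gives
    the class of each vector. Hypothetical helper, for illustration only.
    Raises ZeroDivisionError if all exemplars coincide with their centroid.
    """
    dims = len(responses[0])
    # Grand mean over all exemplars, one value per dimension.
    grand = [fmean(r[d] for r in responses) for d in range(dims)]
    between = 0.0
    within = 0.0
    for c in sorted(set(labels)):
        members = [r for r, lab in zip(responses, labels) if lab == c]
        centroid = [fmean(m[d] for m in members) for d in range(dims)]
        # Spread of class centroids around the grand mean, weighted by class size.
        between += len(members) * sum(
            (centroid[d] - grand[d]) ** 2 for d in range(dims)
        )
        # Spread of exemplars around their own class centroid.
        within += sum(
            (m[d] - centroid[d]) ** 2 for m in members for d in range(dims)
        )
    return between / within


# Toy example: two well-separated classes in two dimensions.
toy_responses = [(0, 0), (0, 2), (4, 0), (4, 2)]
toy_labels = ["a", "a", "b", "b"]
print(variance_ratio(toy_responses, toy_labels))  # 4.0
```

A ratio well above 1 indicates that class centroids are far apart relative to the scatter of exemplars within each class, which is the sense in which responses are called discriminable above.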

In contrast to many other studies, we did not observe suppressed responses when objects were repeated (i.e., no “repetition suppression”) but rather a small enhancement of responses both with longer delays and later trial numbers (Supplementary Fig. S1). This may simply reflect the fact that the object presentations were highly variable and never repeated exactly. Recall that we designed a highly variable display so as to retain the observers’ interest over three successive days.

The 124 of 758 parcels that were identified as “identity-selective” on this basis were situated mostly in the ventral occipitotemporal cortex, but some parcels were also located in the parietal or frontal cortex, as illustrated in Figure 3A. The degree of selectivity exhibited a clear gradient, being stronger at the posterior pole and becoming progressively weaker in more anterior and more dorsal regions, as summarized in Figure 8A. These results are consistent with previous findings that multivariate activity distinguishing different exemplars of a particular class of objects (e.g., faces) is present in the ventral and lateral occipital cortex, on the fusiform gyrus, and in the ventral temporal cortex (Brants et al., 2016; Eger et al., 2008; Visconti di Oleggio Castello et al., 2021).

In general, it is thought that progressively “higher” levels of visual processing represent progressively “larger” visual sets, beginning with image features, and widening gradually to object features, object exemplars, object categories, and finally to supercategories such as animate or inanimate objects, or objects and landscapes (Grill-Spector & Weiner, 2014). Accordingly, the discriminability of exemplars within a category is expected to diminish at more anterior locations, which correspond to “higher” levels of visual processing (Eger et al., 2008; Grill-Spector & Weiner, 2014). Moreover, it has been hypothesized that the spatial scale of neural representations increases with the level of abstraction, in the sense that exemplars are represented at smaller scales than categories (Grill-Spector & Weiner, 2014). Thus, if this trend is exacerbated in the more anterior parts of the ventral pathway, exemplar representations may become progressively less discriminable at the spatial resolution of BOLD signals.

A previous study of visual expertise for synthetic shapes (Brants et al., 2016) reported a gradual enhancement of neural representations in object-selective areas, whereas we observed a moderate decline. This difference may have been due to task design. Brants and colleagues used barely discriminable shapes and emphasized perceptual load, whereas we used highly distinguishable shapes and emphasized memory load.

We also observed identity-selectivity in frontoparietal regions that are typically associated with the dorsal visual pathway and the right frontoparietal “attention network.” This is consistent with previous findings on the presence of object- and/or face-selective representations in dorsal areas (Freud et al., 2017; Jeong & Xu, 2016; Konen & Kastner, 2008; Poirier et al., 2006; Visconti di Oleggio Castello et al., 2021). However, the interpretation of this selectivity is not straightforward. Particularly the clusters associated with the “attention network” are often found to express functional correlations with ventral visual areas in both resting and task states (Dornas & Braun, 2018; Mutlu et al., 2022; Smith et al., 2013). Thus, it seems possible that multivariate functional correlations could have propagated identity-selectivity feedforward throughout the “attention network” and beyond.

Finally, we observed pronounced identity-selectivity in the primary visual cortex (calcarine sulcus, left and right), where neuronal activity encodes basic visual features (orientation, spatial frequency, direction of movement, and so on) (Grill-Spector & Weiner, 2014; Haxby et al., 2001). It is possible that multivariate hemodynamic responses in the primary visual cortex could have reflected this visually evoked neuronal activity sufficiently well to have encoded object identity, especially as the rotation may have exposed the same low-level features in many or most presentations. Additionally, hemodynamic responses could have been driven by spatiotemporal patterns of feedback from higher areas of the visual cortex. There is some evidence to suggest that feedback can dominate the hemodynamics of the early visual cortex under continuous viewing conditions (as used here) (Blake & Braun, 2009).

### 4.2 Cortical representation of novel object shapes

We also investigated the representation of “novel” object shapes that were encountered only once (and never recurred). Note that “novelty” is here not meant to imply “surprise” for the observer in the sense of a violation of expectations (e.g., Uddin, 2015). Rather, it simply denotes the more heterogeneous class of *non-recurring* objects (with 360 exemplars, each from a different object), as distinct from the 15 more homogeneous classes of *recurring* objects (with approximately 200 exemplars each, all from the same object). As mentioned, “novelty” was measured in terms of the linear discriminability of hemodynamic responses to non-recurring and recurring objects in 14-dimensional subspaces $S$, more specifically, by comparing pairwise response distances between classes (recurring and non-recurring) and within classes (either recurring or non-recurring).
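The comparison of between-class and within-class distances can be sketched as follows. This is a simplified illustration, not the analysis itself: it pools all recurring exemplars into a single set (rather than 15 separate classes), uses Euclidean distance, and the name `novelty_index` is hypothetical.

```python
from itertools import combinations, product
from math import dist
from statistics import fmean


def mean_pairwise_within(points):
    """Average Euclidean distance over all pairs drawn from one set."""
    return fmean(dist(p, q) for p, q in combinations(points, 2))


def mean_pairwise_between(points_a, points_b):
    """Average Euclidean distance over all pairs spanning two sets."""
    return fmean(dist(p, q) for p, q in product(points_a, points_b))


def novelty_index(recurring, non_recurring):
    """Between-set distance relative to the mean within-set distance.

    Values above 1 mean that the two sets are, on average, farther from
    each other than their members are from one another.
    """
    between = mean_pairwise_between(recurring, non_recurring)
    within = 0.5 * (
        mean_pairwise_within(recurring) + mean_pairwise_within(non_recurring)
    )
    return between / within


# Toy example in two dimensions (the analysis above uses 14 dimensions).
toy_recurring = [(0.0, 0.0), (0.0, 1.0)]
toy_non_recurring = [(3.0, 0.0), (3.0, 1.0)]
print(novelty_index(toy_recurring, toy_non_recurring))
```

In this toy case the index is about 3.08, because the two sets are separated by roughly three times their internal spread.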

All 124 “identity-selective” parcels were also “novelty-selective,” in the sense that hemodynamic responses discriminated non-recurring and recurring objects to some degree, as illustrated in Figure 5A. As discriminative subspaces were optimized for recurring objects (which were generated in the same way as non-recurring objects), some degree of discriminability was to be expected. Moreover, as non-recurring objects were more numerous (360 objects) than recurring objects (15 objects), some discriminability was expected purely by chance, particularly in a 14-dimensional space. However, as discussed further below, the linear discriminability of non-recurring objects increased over successive runs and sessions, mirroring observers’ improving ability to classify objects as “novel” or “familiar.” Because of this dynamic aspect, we quantified the novelty-selectivity of cortical parcels in terms of an “improvement rate,” $\beta_{\mathrm{novelty}}$ (Fig. 4). Interestingly, there was an anterior-posterior gradient in that novelty-selectivity was more pronounced in more frontal, parietal, and anterior temporal areas than in more posterior temporal and occipital areas, as summarized in Figure 8B. In other words, the representational disparity between familiar object shapes and novel object shapes tended to be larger in the higher-level (more anterior) visual cortex than in the lower-level (more posterior) cortex, suggesting that learning effects were more pronounced at higher levels.

### 4.3 Representational changes with learning

As representational changes with learning were the main objective of our study, we addressed this issue with several complementary approaches. First, we divided our observations from 18 runs into five successive “batches” and established the neural representation of both “identity” and “novelty” separately for each batch with cross-validated statistics, while aggregating over all identity-selective parcels (Fig. 4B). Second, to assess changes in individual parcels, we performed a regression analysis of the same cross-validated data and obtained “rates” of representational changes for every identity-selective parcel (Fig. 4C). Third, we adopted stable discriminative subspaces $S$ and sacrificed cross-validation in order to analyze representational geometry over individual runs (Fig. 6). All three approaches yielded comparable results.
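As an illustration of how a “rate” of representational change could be obtained for one parcel, the sketch below fits an ordinary least-squares line to a discriminability measure over successive batches and returns its slope. The linear model, the function name `change_rate`, and the toy values are assumptions; the regression actually used may differ in form.

```python
def change_rate(batch_index, measure):
    """Least-squares slope of `measure` regressed on `batch_index`.

    Hypothetical helper: a positive slope indicates improvement over
    batches, a negative slope indicates decline.
    """
    n = len(batch_index)
    mean_x = sum(batch_index) / n
    mean_y = sum(measure) / n
    # Covariance and variance terms of the simple regression formula.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(batch_index, measure))
    sxx = sum((x - mean_x) ** 2 for x in batch_index)
    return sxy / sxx


# Toy example: a measure that rises by 0.1 per batch over five batches.
batches = [1, 2, 3, 4, 5]
toy_measure = [0.5, 0.6, 0.7, 0.8, 0.9]
print(round(change_rate(batches, toy_measure), 3))  # 0.1
```

Applying such a fit separately to identity and novelty measures would give one pair of rates per parcel, which is the kind of quantity compared across cortical regions below.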

Already in the first run and the first batch, without time for plasticity or learning, the neural representations of identity were *maximally* differentiated (Figs. 4A and 6D; Supplementary Fig. S5). This initial identity representation was most pronounced in known object processing areas, including the ventral occipitotemporal cortex and early visual cortex. Apparently, pre-existing representations based on life-long experience were sufficient to immediately provide a view-independent representation of synthetic shapes, which we had designed to be highly characteristic and discriminable. In contrast, neural representations of novelty were *minimally* differentiated in the first run and the first batch. As there was no systematic difference between recurring and non-recurring objects (and without time for plasticity), any residual initial discriminability of novelty must be attributed to chance.

Over subsequent runs and batches, the neural representation of object identity remained pronounced, but its quality declined steadily over time (Figs. 4A and 6D; Supplementary Fig. S5). Some decline in BOLD activity is not untypical for learning studies over multiple days and is commonly ascribed to repetition suppression, sparsification of responses, and/or diminishing attention or effort (e.g., Poldrack, 2000). However, while our results are consistent with such a scenario in non-identity-selective parcels, they do not support a general decline of activity in identity-selective parcels, as the response amplitudes and distances in these parcels declined only initially and subsequently remained stable (Fig. 6B, C).

In contrast, the neural representation of object novelty improved substantially over subsequent runs and batches. The time course was similar in both analyses (batch-by-batch and run-by-run), with the steepest improvement occurring over the first few runs (Figs. 4A and 6D; Supplementary Fig. S5). However, the detailed results revealed that this “improvement” (in discriminating non-recurring and recurring objects) actually reflected a deterioration in the representation of non-recurring objects (i.e., diminishing response distances, Fig. 6C).

In absolute terms, response amplitudes and distances were already larger for recurring objects and smaller for non-recurring objects during the first run and the difference increased over the next few runs (Fig. 6B, C). Apparently, recurring objects benefited from a “repetition enhancement,” as the only immediate and systematic difference between recurring and non-recurring objects was the frequency of recurrence. Interestingly, this enhancement was comparable for “structured” and “unstructured” sequences, even though the repetition latencies were quite different (Supplementary Fig. S1B, C). Accordingly, we hypothesize that the enhancement was not merely a passive effect but rather a consequence of task relevance and cognitive engagement (Supplementary Fig. S1B, C).

As mentioned, the rates of change of identity and novelty representations differed systematically between cortical regions (Fig. 8C). Intriguingly, the rates of novelty *gain* and identity *loss* varied inversely over the cortical hierarchy: in early visual areas (V1, V2, V3, hV4), identity declined rapidly, whereas novelty grew slowly. At the opposite end, in the inferior frontal cortex (IFC) and anterior ventral temporal cortex (AIT), identity declined slowly, but novelty grew rapidly. In the higher visual cortex (VO, LO), both rates were intermediate.

It is informative to visualize the observed representational changes in two dimensions (Fig. 9), while approximately preserving the *relative* pairwise distances in the discriminative subspaces $S$. This visualization makes clear that the neural representation of recurring objects expands between the beginning and the end of the experiment, filling the available representational space (Fig. 9A). The expansion explains our observation that the linear discriminability of object classes degrades but remains high. In contrast, the neural representation of non-recurring objects contracts between the beginning and the end of the experiment while also shifting to the margins of representational space, which explains why the linear discriminability of non-recurring objects improved over time (Fig. 9B). These two opposite developments may reflect both cognitive engagement and repetition frequency: representations may expand for objects that observers attempt to memorize and/or that recur frequently, but contract for objects that observers learn to ignore and/or that are rare.

In addition to relative changes in representational geometry indexed by linear discriminability, we established absolute changes in representational geometry, indexed by distances between response centroids in successive runs (see Fig. 7). The results were dominated by sampling noise, and the displacement of centroids was comparable to random jumps in a hypersphere while maintaining a given distance from its center. However, both absolute and relative centroid distances were slightly (and significantly) smaller than predicted by sampling noise, indicating that the representations were slightly more consistent and compact. The most interesting result of this analysis was that centroid distances were approximately $30\%$ smaller for non-recurring than for recurring objects, highlighting again the representational disparity noted above.

### 4.4 Behavioral and cognitive changes with learning

The behavioral changes over three sessions of viewing sequences of objects included both increased classification performance (“familiar” or “novel”) and decreased reaction times. Both behavioral measures changed rapidly during the first three runs of the first session and more slowly during the second and third sessions (Fig. 1). As described elsewhere (Kakaei et al., 2021), the classification of a particular object typically changed from (mostly) “novel” to (mostly) “familiar” at one identifiable point in time during the sessions, which we termed “onset of familiarity.” This objective observation was consistent with the subjective reports of observers that they memorized all three-dimensional shapes one by one, such that every object became recognizable from all sides. Some observers also mentioned having assigned linguistic labels to individual recurring objects. After the three sessions, all observers were “familiar” with all recurring objects and could pick them out from an array of distractor objects.

Only some of these behavioral changes have obvious counterparts in the neural changes discussed above. First, the decrease of reaction times from under 2 s to under 1 s implies that observers spend less time actively evaluating the stimulus and more time passively observing it. However, the neural response of identity-selective parcels does not mirror this trend, as both response amplitudes and response differences stabilize after the first few runs (Fig. 6B, C). In the rest of the brain (non-identity-selective parcels), the neural responses do show a progressive decrease, but any attribution would be speculative.

Second, the increase in objective performance and in subjective “familiarity” was not mirrored directly in neural responses to recurring objects, as multivariate responses were sufficiently rich to identify such objects from the very start. However, multivariate responses became more dispersed over the three sessions, so as to fill more of the available space (see above). This growing response diversity is a plausible correlate of memory consolidation, that is, the formation of stable long-term memories in visually responsive cortical areas. When such memories are consolidated, one would expect that increased connectivity would enhance pattern completion over additional levels of representation, rendering network activity more complex (e.g., Steinberg & Sompolinsky, 2022). It is worth noting that this development was observed for both types of presentation sequences (“structured” and “unstructured”), suggesting that neural consolidation was due to task relevance and not merely to repetition latency.

Third, the increase in objective performance was mirrored indirectly in neural responses to *non-recurring* objects. Whereas these responses were initially comparable to recurring responses, they contracted over three sessions into a smaller part of the available space, thus becoming more stereotypical. As this part was comparably distant from all recurring responses, it lay at the margins of the representational space. The time course of classification performance corresponded best to this particular development in neural representations. Accordingly, this development was a plausible *indirect* correlate of memory consolidation, in the sense that visually responsive areas grew *less responsive* to other objects that failed to match the newly formed long-term memories.

## 5 Conclusion

We analyzed the cortical representation of visual objects in the multivariate hemodynamic responses of 758 brain parcels. For each parcel, we used linear discriminant analysis to map the $O(10^3)$-dimensional responses into a lower-dimensional subspace that optimally discriminated the 15 stimulus classes (recurring objects). Optimal subspaces captured a large part of the correlated variance and overlapped substantially with the principal components of the responses. Typically, 2/3 of the principal component variance discriminated between stimulus classes (and thus coincided with the optimal subspace), while the remaining 1/3 was shared between stimulus classes. Our analyses revealed where and how the cortical representations of visual objects changed as visual expertise was being acquired and consolidated by the observers.
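The dimensionality of 14 used throughout follows from the number of classes. In the standard formulation of linear discriminant analysis (which may differ in detail from the direct variant used here), the between-class scatter matrix $S_B$ is built from the class means $\boldsymbol{\mu}_k$, the class sizes $n_k$, and the grand mean $\boldsymbol{\mu}$; these symbols are introduced for this sketch only. Because the weighted deviations of the class means sum to zero, at most $K-1$ of them are linearly independent:

$$
S_B = \sum_{k=1}^{K} n_k\,(\boldsymbol{\mu}_k-\boldsymbol{\mu})(\boldsymbol{\mu}_k-\boldsymbol{\mu})^{\top},
\qquad
\sum_{k=1}^{K} n_k\,(\boldsymbol{\mu}_k-\boldsymbol{\mu}) = \mathbf{0}
\;\;\Rightarrow\;\;
\operatorname{rank}(S_B) \le K-1 .
$$

With $K=15$ recurring objects, the discriminative subspace therefore has at most $14$ dimensions, however large the original response space.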

Our results were broadly consistent with other recent studies of visual expertise, which have highlighted the roles of three pathways or networks (Kravitz et al., 2011, 2013): an occipitotemporal pathway (“ventral pathway”), an occipitoparietal pathway (“dorsal pathway”), and a right frontoparietal network (“attention system”). Several studies linked behavioral performance to enhanced activity and/or representation in the frontoparietal network (Duyck et al., 2021; Poirier et al., 2006; Visconti di Oleggio Castello et al., 2021), as well as in the more anterior parts of the occipitotemporal pathway and the more dorsal parts of the occipitoparietal pathway (Christophel et al., 2017).

Due to our focus on object shape, our results do not speak directly to the modulation of cortical responses by expectation, such as “expectation suppression” or “surprise signalling” (Barron et al., 2016; Bell et al., 2016; Mayrhauser et al., 2014; Vinken et al., 2018). Moreover, in our paradigm, object presentations were never repeated exactly and every object presentation contained elements of surprise, as neither the object, nor the point of view, nor the direction of rotation could be anticipated by observers.

The most robust representations of object shape for both recurring objects (“identity”) and non-recurring objects (“novelty”) were observed in the ventral occipitotemporal cortex, at the intermediate levels of the shape processing hierarchy (Grill-Spector & Weiner, 2014; Perry & Fallah, 2014). Additionally, we found representations of object shape in “dorsal stream” cortical areas, consistent with the view that these areas encode goal- and task-related object features (Perry & Fallah, 2014).

The most novel aspect of our findings was the change in the geometry of cortical representations as visual expertise for recurring objects was being acquired and consolidated. In relative terms, distances between response classes decreased, and/or distances within classes increased, while observers repeatedly viewed and became familiar with the corresponding stimulus classes. This modest *decline* in stimulus encoding was, however, associated with an expansion (or *diversification*) in the distribution of responses within classes, so that responses of all classes taken together scattered more uniformly over the available representational space. Changes in cortical representations were quite different for stimuli that appeared only once and that observers did not attempt to memorize (non-recurring objects). Here, again in relative terms, distances between classes (non-recurring and recurring) increased and/or distances within classes (non-recurring) decreased. This steep *growth* in class encoding was associated with a substantial contraction (or *stereotypisation*) in the distribution of responses, in the sense that responses to non-recurring objects shifted to the margin of the available representational space.

We conclude that hemodynamic responses to novel object shapes immediately represent the differences between these shapes, even prior to learning, presumably reflecting life-long prior experience. When object shapes grow familiar with learning, hemodynamic responses to the same shapes become more diverse, whereas responses to different shapes remain comparably dissimilar from each other. Responses to control objects that are always novel develop quite differently in that they become less diverse relative to each other, but also more dissimilar from responses to familiar objects.

## Data and Code Availability

Code for direct linear discriminant analysis and prevalence inference is available at github.com/cognitive-biology/DLDA. MR data will be made available upon request.

## Author Contributions

Ehsan Kakaei: Conceptualization, data curation, formal analysis, visualization, and writing of the original draft. Jochen Braun: Conceptualization, linear algebra, formal analysis, supervision, and reviewing & editing.

## Acknowledgments

We thank Claus Tempelmann, Martin Kanowski, and Denise Scheermann at the Magnetic Resonance Imaging Laboratory of the Department of Neurology of Otto-von-Guericke University, Magdeburg. We are grateful to Oliver Speck for providing essential support and a balanced perspective. We also thank Stepan Aleshin for helpful discussions and constructive comments. This study was funded by the federal state Saxony-Anhalt and the European Structural and Investment Funds (ESF, 2014–2020), project number ZS/2016/08/80645, as part of the doctoral program ABINEP (Analysis, Imaging and Modelling of Neuronal Processes).

## Declaration of Competing Interest

The authors are not aware of any competing interest.

## Supplementary Materials

Supplementary material for this article is available with the online version here: https://doi.org/10.1162/imag_a_00255

## References

#### Appendix

| AAL region | Parcel no. | MNI $x$ | MNI $y$ | MNI $z$ | $\alpha$ (%), identity | $\beta$, novelty | $p^{\star}$, struct. | $p^{\star}$, unstruct. | $p^{\star}$, both | Topog. assign. |
|---|---|---|---|---|---|---|---|---|---|---|
| Precentral | 14 | -51 | 7 | 36 | 8.4 | 0.33 | $0.006$ | $10^{-5}$ | $10^{-5}$ | - |
| Superior frontal | 56 | 25 | -9 | 64 | 7.9 | 0.18 | n.s. | n.s. | $4\times10^{-5}$ | - |
| Inferior frontal (opercular) | 143 | 38 | 11 | 31 | 7.5 | 0.27 | $5\times10^{-5}$ | $5\times10^{-3}$ | $0.02$ | IFC |
| | 146 | 52 | 10 | 22 | 9 | 0.44 | $0.05$ | n.s. | $10^{-5}$ | IFC |
| Inferior frontal (triangular) | 163 | 51 | 23 | 20 | 7.9 | 0.34 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | IFC |
| Parahippocampal | 325 | 26 | -38 | -9 | 7.5 | 0.05 | $0.007$ | $2\times10^{-4}$ | $10^{-5}$ | - |
| Calcarine | 331 | -2 | -91 | -1 | 16.9 | 0.01 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V1v |
| | 333 | -2 | -94 | -4 | 10.2 | -0.01 | $10^{-5}$ | $5\times10^{-3}$ | $10^{-5}$ | V1v |
| | 335 | -4 | -86 | 7 | 12.6 | 0.11 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V1d |
| | 336 | -5 | -99 | -8 | 12.2 | 0.03 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V1v |
| | 337 | -12 | -99 | -5 | 13.8 | 0.10 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V1d |
| | 338 | -7 | -75 | 10 | 8.7 | 0.08 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 342 | 17 | -83 | 11 | 9.8 | 0.07 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 344 | 9 | -85 | 6 | 13.8 | 0.10 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V1v |
| | 345 | 15 | -91 | 1 | 14.3 | 0.09 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V1d |
| | 347 | 18 | -99 | -1 | 12.1 | 0.07 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V1d |
| | 348 | 11 | -72 | 11 | 9.1 | 0.06 | n.s. | $10^{-5}$ | $2\times10^{-4}$ | - |
| Cuneus | 350 | -7 | -85 | 26 | 10.1 | 0.07 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 352 | 0 | -94 | 25 | 9.1 | 0.05 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V2d |
| | 354 | -2 | -79 | 22 | 8.6 | 0.07 | $0.003$ | $10^{-5}$ | $10^{-5}$ | - |
| | 355 | -4 | -92 | 25 | 11.9 | 0.09 | n.s. | $10^{-5}$ | $0.02$ | V2d |
| | 356 | 12 | -91 | 18 | 13 | 0.15 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V2d |
| | 357 | 15 | -97 | 10 | 13.8 | 0.08 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V2d |
| Lingual | 363 | -12 | -65 | -5 | 9.5 | 0.10 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3v |
| | 364 | -15 | -95 | -16 | 10.9 | 0.02 | $10^{-4}$ | $10^{-5}$ | $10^{-5}$ | V2v |
| | 367 | -29 | -89 | -16 | 11.5 | 0.07 | $2\times10^{-5}$ | $10^{-5}$ | $10^{-5}$ | hV4 |
| | 368 | -17 | -85 | -12 | 14.9 | 0.26 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V2v |
| | 370 | -22 | -65 | -5 | 9.7 | 0.16 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | VO2 |
| | 371 | -12 | -79 | -8 | 13.9 | 0.12 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V2v |
| | 372 | -6 | -74 | 2 | 12 | 0.11 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V1v |
| | 373 | 16 | -81 | -7 | 14.1 | 0.11 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3v |
| | 375 | 11 | -72 | -4 | 12.5 | 0.04 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V2v |
| | 377 | 21 | -58 | -3 | 9.3 | 0.24 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 378 | 13 | -52 | 2 | 7.6 | 0.05 | n.s. | $10^{-5}$ | $2\times10^{-4}$ | - |
| | 379 | 16 | -88 | -10 | 13.4 | 0.14 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V2v |
| | 380 | 27 | -91 | -16 | 9.2 | 0.06 | n.s. | $0.01$ | $0.01$ | - |
| | 381 | 17 | -98 | -10 | 9.3 | 0.04 | $10^{-5}$ | n.s. | $0.005$ | V1v |
| | 383 | 14 | -56 | -6 | 7.9 | 0.05 | $2\times10^{-5}$ | $0.01$ | $10^{-5}$ | - |
| Occipital (superior) | 384 | -18 | -84 | 25 | 10.6 | 0.08 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3a |
| | 385 | -16 | -85 | 41 | 9.8 | 0.12 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | IPS0 |
| | 386 | -18 | -69 | 29 | 7.7 | 0.04 | n.s. | $10^{-5}$ | $0.005$ | - |
| | 387 | -16 | -95 | 23 | 13.5 | 0.12 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3a |
| | 388 | -22 | -75 | 34 | 9.1 | 0.21 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | IPS1 |
| | 389 | -11 | -95 | 9 | 12.9 | 0.05 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V2d |
| | 390 | 22 | -90 | 24 | 13.6 | 0.10 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3a |
| | 391 | 21 | -98 | 14 | 13.8 | 0.11 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V2d |
| | 392 | 27 | -85 | 40 | 9.2 | 0.12 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | IPS0 |
| | 393 | 25 | -67 | 33 | 8 | 0.18 | n.s. | n.s. | $0.005$ | - |
| | 394 | 24 | -75 | 21 | 8.7 | 0.11 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 395 | 23 | -79 | 33 | 10 | 0.12 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 396 | 29 | -70 | 43 | 9.2 | 0.21 | $0.03$ | $10^{-5}$ | $10^{-5}$ | IPS1 |
| Occipital (middle) | 397 | -28 | -77 | 27 | 10.1 | 0.23 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | IPS0 |
| | 398 | -28 | -72 | 34 | 9.1 | 0.32 | n.s. | $10^{-5}$ | $10^{-5}$ | IPS1 |
| | 400 | -38 | -86 | 4 | 12 | 0.15 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | LO2 |
| | 402 | -33 | -87 | 19 | 12 | 0.24 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3b |
| | 403 | -30 | -78 | 2 | 7.7 | 0.05 | n.s. | n.s. | $0.02$ | - |
| | 404 | -27 | -94 | 1 | 12.7 | 0.15 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3d |
| | 405 | -16 | -100 | 1 | 14 | 0.13 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V2d |
| | 406 | -40 | -75 | 15 | 8.9 | 0.14 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 407 | -27 | -83 | 14 | 11.3 | 0.19 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | IPS0 |
| | 408 | -24 | -93 | 13 | 13.6 | 0.19 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3d |
| | 410 | -38 | -83 | 23 | 9.9 | 0.21 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 412 | -46 | -77 | 5 | 10.1 | 0.14 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | hMT |
| | 413 | 33 | -88 | 7 | 13.2 | 0.23 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | LO1 |
| | 414 | 33 | -96 | 4 | 10.7 | 0.17 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3d |
| | 415 | 39 | -81 | 14 | 10.6 | 0.19 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3b |
| | 416 | 44 | -78 | 5 | 10.9 | 0.22 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | LO2 |
| | 418 | 32 | -86 | 23 | 11.9 | 0.22 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | V3b |
| | 420 | 34 | -69 | 32 | 8.5 | 0.34 | n.s. | $10^{-5}$ | $0.02$ | - |
| | 421 | 32 | -76 | 27 | 10 | 0.28 | $10^{-5}$ | $10^{-4}$ | $10^{-5}$ | IPS0 |
| Occipital (inferior) | 423 | -50 | -68 | -14 | 9.4 | 0.22 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 424 | -31 | -83 | -8 | 12.6 | 0.22 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 425 | -22 | -95 | -9 | 12.3 | 0.11 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 426 | -42 | -73 | -8 | 11.6 | 0.21 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 428 | 36 | -85 | -7 | 13.2 | 0.21 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 430 | 42 | -73 | -9 | 11.4 | 0.25 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| Fusiform | 432 | -27 | -71 | -11 | 12.2 | 0.25 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | VO2 |
| | 435 | -33 | -77 | -17 | 12.4 | 0.15 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | hV4 |
| | 436 | -31 | -53 | -13 | 8.8 | 0.26 | $10^{-5}$ | n.s. | $10^{-5}$ | PHC1 |
| | 438 | -41 | -56 | -17 | 8.9 | 0.26 | n.s. | $10^{-5}$ | $0.01$ | - |
| | 440 | -36 | -63 | -16 | 9.6 | 0.25 | $0.001$ | $10^{-5}$ | $10^{-5}$ | - |
| | 442 | 28 | -74 | -11 | 12.8 | 0.31 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | hV4 |
| | 443 | 36 | -71 | -16 | 10.8 | 0.25 | $0.02$ | $10^{-5}$ | $10^{-5}$ | hV4 |
| | 447 | 29 | -47 | -14 | 8.3 | 0.31 | $0.002$ | $10^{-5}$ | $10^{-5}$ | PHC2 |
| | 450 | 41 | -48 | -20 | 8.4 | 0.30 | $0.006$ | $10^{-5}$ | $0.02$ | - |
| | 452 | 28 | -59 | -12 | 9.7 | 0.35 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | VO2 |
| Postcentral | 476 | 60 | -18 | 37 | 9.2 | 0.34 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 478 | 42 | -31 | 49 | 8.5 | 0.25 | $10^{-5}$ | $0.02$ | $10^{-5}$ | - |
| | 484 | 30 | -40 | 61 | 8.5 | 0.20 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| Parietal (superior) | 494 | -26 | -61 | 61 | 9.1 | 0.09 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 495 | -27 | -53 | 68 | 8 | 0.09 | $0.02$ | $0.05$ | $2\times10^{-5}$ | - |
| | 497 | -21 | -67 | 47 | 9.3 | 0.19 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | IPS1 |
| | 498 | -15 | -69 | 50 | 8.5 | 0.12 | $0.01$ | $10^{-5}$ | $10^{-5}$ | IPS2 |
| | 499 | -28 | -69 | 50 | 8.9 | 0.29 | $0.001$ | $3\times10^{-5}$ | $10^{-5}$ | - |
| | 501 | -17 | -79 | 50 | 8.7 | 0.11 | $10^{-4}$ | $3\times10^{-5}$ | $10^{-5}$ | IPS1 |
| | 502 | 30 | -60 | 64 | 8.8 | 0.18 | $0.002$ | $5\times10^{-4}$ | $10^{-5}$ | IPS3 |
| | 504 | 34 | -62 | 57 | 9.3 | 0.30 | $5\times10^{-5}$ | $0.003$ | $10^{-5}$ | - |
| | 506 | 33 | -50 | 58 | 9.8 | 0.34 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 507 | 21 | -57 | 74 | 8.1 | 0.09 | n.s. | $10^{-5}$ | $5\times10^{-4}$ | - |
| | 509 | 31 | -73 | 53 | 8.6 | 0.13 | n.s. | $10^{-5}$ | $0.001$ | - |
| | 510 | 21 | -74 | 55 | 8.8 | 0.14 | $0.06$ | $0.07$ | $2\times10^{-5}$ | IPS1 |
| | 511 | 20 | -65 | 54 | 9.2 | 0.17 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | IPS2 |
| Parietal (inferior) | 514 | -45 | -30 | 42 | 8.6 | 0.24 | $0.002$ | $10^{-5}$ | $10^{-5}$ | - |
| | 516 | -32 | -75 | 44 | 8 | 0.19 | $0.01$ | $5\times10^{-4}$ | $10^{-5}$ | - |
| | 521 | -39 | -47 | 42 | 8.6 | 0.17 | n.s. | $10^{-4}$ | $0.005$ | - |
| | 522 | -32 | -46 | 50 | 8.5 | 0.12 | $10^{-5}$ | $10^{-5}$ | $10^{-5}$ | - |
| | 523 | -31 | -53 | 46 | 8.4 | 0.18 | $10^{-5}$ | n.s. | $2\times10^{-5}$ | - |
| | 527 | 46 | -39 | 50 | 8.6 | 0.35 | n.s. | $10^{-5}$ | $0.001$ | - |
| | 529 | 37 | -49 | 46 | 8.7 | 0.34 | $0.06$ | $0.003$ | $2\times10^{-5}$ | - |
| | 530 | 35 | -44 | 51 | 9.2 | 0.29 | $5\times10^{-4}$ | $10^{-5}$ | $10^{-5}$ | - |
| Supramarginal | 536 | -61 | -28 | 34 | 8.4 | 0.24 | $0.003$ | $10^{-5}$ | $10^{-5}$ | - |
| | 539 | 44 | -34 | 41 | 8.7 | 0.20 | $0.001$ | $0.001$ | $10^{-5}$ | - |
| | 542 | 63 | -24 | 37 | 8.5 | 0.22 | $10^{-5}$ | n.s. | $10^{-4}$ | - |
| Angular | 557 | 34 | -60 | 44 | 8.7 | 0.38 | n.s. | $10^{-5}$ | $0.005$ | - |
| Precuneus | 561 | -5 | -77 | 53 | 7.9 | 0.06 | $10^{-5}$ | $2\times10^{-5}$ | $10^{-5}$ | - |
| | 573 | -9 | -71 | 54 | 8.1 | 0.08 | $10^{-5}$ | $0.003$ | $10^{-5}$ | - |

576 | 14 | -71 | 45 | 7.9 | 0.12 | $10\u22125$ | $10\u22125$ | $10\u22125$ | - | |

Temporal (middle) | 678 | -45 | -67 | 11 | 8.9 | 0.12 | $10\u22125$ | n.s. | $10\u22124$ | - |

685 | -49 | -62 | 0 | 9.3 | 0.19 | $10\u22125$ | $10\u22125$ | $10\u22125$ | - | |

701 | 52 | -59 | 3 | 8.9 | 0.20 | $10\u22125$ | $10\u22125$ | $10\u22125$ | - | |

717 | 49 | -69 | 4 | 10.5 | 0.18 | $10\u22125$ | $10\u22125$ | $10\u22125$ | - | |

Temporal (inferior) | 728 | -54 | -58 | -11 | 8.5 | 0.25 | $2\xd710\u22124$ | $10\u22125$ | $10\u22125$ | AIT |

732 | -45 | -52 | -13 | 9.3 | 0.30 | $10\u22125$ | $10\u22125$ | $10\u22125$ | AIT | |

755 | 46 | -53 | -11 | 9.7 | 0.42 | $10\u22125$ | $10\u22125$ | $10\u22125$ | AIT |

Parcel ID, geometrical centroid x/y/z in MNI coordinates, average classification accuracy $\alpha$, average novelty rate $\beta$, corrected significance $p^{\star}$ in structured or unstructured conditions ($n=8$), corrected significance $p^{\star}$ in both conditions ($n=16$), and topographical assignment, if any.
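For readers who wish to work with the tabulated values programmatically, the following is a minimal sketch of how rows in this layout might be read into records. It is not part of the original analysis. It assumes that p-values are written in LaTeX form such as `$10^{-5}$` or `$2\times10^{-5}$`, that `n.s.` marks a non-significant entry, and that a row without an AAL region label belongs to the most recently named region. The function names `parse_p` and `parse_rows` and all field names are hypothetical; the three sample rows are copied from the table.

```python
import re

# Hypothetical field names for the ten columns that follow the AAL region.
FIELDS = ["parcel", "x", "y", "z", "alpha", "beta",
          "p_struct", "p_unstruct", "p_both", "topography"]

# Matches "10^{-5}" or "2\times10^{-5}" (with the dollar signs already removed).
_POWER = re.compile(r"(?:(\d+(?:\.\d+)?)\\times)?10\^\{(-?\d+)\}")


def parse_p(cell):
    """Convert a p-value cell to a float; 'n.s.' is returned as None."""
    text = cell.strip().strip("$")
    if text == "n.s.":
        return None
    match = _POWER.fullmatch(text)
    if match:
        mantissa = match.group(1) or "1"
        return float(f"{mantissa}e{match.group(2)}")
    return float(text)


def parse_rows(lines):
    """Parse pipe-delimited rows, carrying the region label forward."""
    records, region = [], None
    for line in lines:
        cells = [c.strip() for c in line.split("|")]
        while cells and cells[-1] == "":
            cells.pop()          # drop empty trailing cells
        if not cells:
            continue             # skip blank lines between rows
        if len(cells) == len(FIELDS) + 1:
            region, cells = cells[0], cells[1:]   # row names a new region
        if len(cells) != len(FIELDS):
            raise ValueError(f"unexpected number of cells: {line!r}")
        records.append({
            "region": region,
            "parcel": int(cells[0]),
            "x": int(cells[1]), "y": int(cells[2]), "z": int(cells[3]),
            "alpha": float(cells[4]),   # classification accuracy (%)
            "beta": float(cells[5]),    # novelty rate
            "p_struct": parse_p(cells[6]),
            "p_unstruct": parse_p(cells[7]),
            "p_both": parse_p(cells[8]),
            "topography": None if cells[9] == "-" else cells[9],
        })
    return records


# Three rows copied from the table (raw strings keep "\times" intact).
ROWS = [
    r"Angular | 557 | 34 | -60 | 44 | 8.7 | 0.38 | n.s. | $10^{-5}$ | $0.005$ | - |",
    r"",
    r"Precuneus | 561 | -5 | -77 | 53 | 7.9 | 0.06 | $10^{-5}$ | $2\times10^{-5}$ | $10^{-5}$ | - |",
    r"",
    r"573 | -9 | -71 | 54 | 8.1 | 0.08 | $10^{-5}$ | $0.003$ | $10^{-5}$ | - | |",
]
records = parse_rows(ROWS)
```

With records in this form, parcels can be filtered by condition, for instance by selecting those whose `p_struct` is `None` while `p_unstruct` is not.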