Abstract

Visual object recognition is performed effortlessly by humans notwithstanding the fact that it requires a series of complex computations, which are, as yet, not well understood. Here, we tested a novel account of the representations used for visual recognition and their neural correlates using fMRI. The rationale is based on previous research showing that a set of representations, termed “minimal recognizable configurations” (MIRCs), which are computationally derived and have unique psychophysical characteristics, serve as the building blocks of object recognition. We contrasted the BOLD responses elicited by three types of stimuli: MIRC images derived from different categories (faces, objects, and places); sub-MIRCs, which are visually similar to MIRCs but result in poor recognition; and scrambled, unrecognizable images. Stimuli were presented in blocks, and participants indicated yes/no recognition for each image. We confirmed that MIRCs elicited higher recognition performance compared to sub-MIRCs for all three categories. Whereas fMRI activation in early visual cortex for both MIRCs and sub-MIRCs of each category did not differ from that elicited by scrambled images, high-level visual regions exhibited overall greater activation for MIRCs compared to sub-MIRCs or scrambled images. Moreover, MIRCs and sub-MIRCs from each category elicited enhanced activation in corresponding category-selective regions including fusiform face area and occipital face area (faces), lateral occipital cortex (objects), and parahippocampal place area and transverse occipital sulcus (places). These findings reveal the psychological and neural relevance of MIRCs and enable us to make progress in developing a more complete account of object recognition.

INTRODUCTION

Visual object recognition is the process by which observers successfully identify objects, whose images impinge on the retina, in spite of physical variations, such as the position and size of the objects in the input (Rajalingham, Schmidt, & DiCarlo, 2015; Grill-Spector et al., 1999; Ito, Tamura, Fujita, & Tanaka, 1995). In humans, object recognition is extraordinarily accurate and rapid notwithstanding the challenges introduced by transformations over size and position, as well as by differences in class variability, lighting, and pose.

Behavioral Theories of Visual Recognition

Many explanations have been offered to account for the robust object recognition abilities of human observers. These different theories have attempted to identify the basic elements that serve as the building blocks for object recognition, and in characterizing the space of possible object representations, the theories largely fall into one of three main classes, including templates, structural descriptions, and features (see Gauthier & Tarr, 2016; Ungerleider & Bell, 2011; Peissig & Tarr, 2007, for reviews of these theories). Recently, a new approach has offered a different set of building blocks that differ from those that comprise the three major classes. This approach has begun to define the minimal visual information within an image that is needed for recognition (Ullman, Assif, Fetaya, & Harari, 2016; Lerner, Epshtein, Ullman, & Malach, 2008; Ullman, Vidal-Naquet, & Sali, 2002; Ullman & Sali, 2000). In one relevant study, Lerner et al. (2008) demonstrated that, compared with randomly selected fragments of an image, the usage of more informative fragments (which do not correspond to “features” per se) enabled observers to classify and recognize visual images accurately. “Informativeness” in the study was measured statistically, using mutual information. Furthermore, not only was recognition better, but in an accompanying neuroimaging study, there was greater activation for the informative over noninformative fragments. In this study, both the informative and noninformative fragments were derived from three stimulus classes, faces, cars, and horses. On each trial, seven such fragments (e.g., part of an eye or part of a mouth or teeth), each enclosed in a circle, were displayed simultaneously. Interestingly, brain regions strongly associated with object recognition, namely, the posterior fusiform gyrus and lateral occipital region, showed greater BOLD activation in response to the informative versus the noninformative fragment display. 
Superior activation for the informative fragments extended beyond these typical object regions and was also evident in the intraparietal sulcus (IPS) located in the dorsal cortex and even in a patch in the superior frontal sulcus. In all regions, the activation profile was similar for all three classes of stimuli, although not every pairwise difference between informative and noninformative fragments reached statistical significance.

Consistent with the observation that informative fragments may subserve object recognition, Ullman et al. (2016) proposed a novel theoretical account in which object recognition is mediated by a visual representation of a “minimal configuration.” Specifically, this account posits that, at the level of minimally recognizable images, a very small change to the image can have a drastic, nonlinear effect on recognition. Empirical support for the utility of a minimal recognizable configuration (MIRC) as the core representational set was gleaned from an investigation in which an object image patch was presented to observers. If the patch was recognized with high accuracy, five descendants were generated, four by cropping 20% of the image at one of its four corners and one by reducing resolution by 20%. The descendants were then shown and the process iteratively repeated. Note that MIRC images are a tiny fraction of the original images from which they were derived. A recognizable patch is empirically defined as a MIRC if none of its five descendants reaches the recognition criterion (50%), and the poorly recognized descendants are referred to as “sub-MIRCs” (see Figure 1A and B). A notable aspect of the results is the sharp transition: a surprisingly small change to a MIRC can render it unrecognizable (average drop in recognition rate of 0.71 ± 0.05). Importantly, the sets of the MIRC and sub-MIRC stimuli did not differ in parameters comparing physical attributes of the images. These results suggest that the human visual system is highly sensitive to informative configurations (present in MIRCs but not in sub-MIRCs) and provide initial support for the plausibility of such representations in object recognition.
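The descendant-generation rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure's actual implementation (see Ullman et al., 2016, for that): the function names are hypothetical, "cropping 20%" is read as keeping 80% of the height and width anchored at one corner, the resolution step uses simple nearest-neighbour resampling, and the parent patch is treated as recognizable when its rate is at or above the 50% criterion.

```python
import numpy as np

CROP_FRACTION = 0.20          # each step removes 20% (from the text)
RECOGNITION_CRITERION = 0.50  # 50% recognition criterion (from the text)


def crop_descendants(patch: np.ndarray) -> list[np.ndarray]:
    """Four crops, each keeping 80% of height and width at one corner (assumed reading)."""
    h, w = patch.shape[:2]
    kh = max(1, round(h * (1 - CROP_FRACTION)))
    kw = max(1, round(w * (1 - CROP_FRACTION)))
    return [
        patch[:kh, :kw],          # keep the top-left corner
        patch[:kh, w - kw:],      # keep the top-right corner
        patch[h - kh:, :kw],      # keep the bottom-left corner
        patch[h - kh:, w - kw:],  # keep the bottom-right corner
    ]


def reduced_resolution(patch: np.ndarray) -> np.ndarray:
    """Same content sampled on a grid with 20% fewer rows and columns (nearest neighbour)."""
    h, w = patch.shape[:2]
    nh = max(1, round(h * (1 - CROP_FRACTION)))
    nw = max(1, round(w * (1 - CROP_FRACTION)))
    rows = np.arange(nh) * h // nh
    cols = np.arange(nw) * w // nw
    return patch[np.ix_(rows, cols)]


def descendants(patch: np.ndarray) -> list[np.ndarray]:
    """The five descendants of a patch: four corner crops plus one lower-resolution copy."""
    return crop_descendants(patch) + [reduced_resolution(patch)]


def is_mirc(patch_rate: float, descendant_rates: list[float],
            criterion: float = RECOGNITION_CRITERION) -> bool:
    """A recognizable patch is a MIRC if no descendant reaches the criterion."""
    return patch_rate >= criterion and all(r < criterion for r in descendant_rates)
```

In this sketch the recognition rates passed to `is_mirc` stand for the fraction of observers who recognized each image; descendants that fall below the criterion would be the sub-MIRCs.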

Figure 1. 

Experimental stimuli. (A) Examples of original images from which MIRCs/sub-MIRCs were derived. Note that these full images were not included in the experiment. Below are examples of MIRCs, sub-MIRCs, and scrambled images that were shown to participants in the imaging experiment. Two stimuli from each category are provided for visualization purposes. (B) An outline of the process of generating MIRCs and sub-MIRCs from the original images. Generally, if an image patch was recognized by human participants, five descendants were generated and presented to additional observers: Four were obtained by cropping 20% of the image (bottom row) and one by reducing resolution by 20% (middle row, right). The process was repeated on all descendants until none of the descendants reached recognition criterion (50% across participants). The numbers next to each image indicate the fraction of participants that correctly recognized the image during preliminary experiments in which MIRCs and sub-MIRCs were generated (note that these are not the behavioral performance data shown in Figure 2 that were obtained during scanning). For a detailed description of this process, see Ullman et al. (2016).

The Neural Basis of Recognition

If MIRCs are engaged in object recognition, one might expect to observe neural support in the visual system for the MIRC–sub-MIRC distinction. A host of object-selective cortical regions have been delineated to date, and much is known about the key areas engaged in object perception. The lateral occipital cortex (LOC) is considered the preeminent area associated with object recognition (Kravitz, Saleem, Baker, Ungerleider, & Mishkin, 2013; Gross, 2002; Malach et al., 1995), responding more strongly to objects than to noise, textures, and scrambled stimuli (James, Culham, Humphrey, Milner, & Goodale, 2003; Kourtzi & Kanwisher, 2000; Grill-Spector, Kushnir, Edelman, Itzchak, & Malach, 1998; Kanwisher, Chun, McDermott, & Ledden, 1996; Malach et al., 1995). Moreover, LOC activation is correlated with behavioral measures of recognition (Grill-Spector, Kushnir, Hendler, & Malach, 2000; Vanni, Revonsuo, Saarinen, & Hari, 1996), and, as illustrated by Lerner et al. (2008), LOC is also activated by informative image patches compared with noninformative fragments.

In addition to LOC, other areas within the high-level visual cortex also respond to objects. Many of these areas are considered to be “category-selective” regions, as they are more strongly activated by a selective or preferred category than by any other visual category (Grill-Spector & Weiner, 2014). These include the fusiform face area (FFA) and the occipital face area (OFA) for face images and the parahippocampal place area (PPA) and the transverse occipital sulcus (TOS) for places or houses. Although we know from Lerner et al. (2008) that multiple regions appear to be activated by random fragments, those authors did not test MIRCs specifically, nor did they evaluate extensively whether there was a category-specific fragment response; they did show faces (and FFA showed greater significance for face fragments vs. noninformative fragments), but the other two classes were cars and horses, neither of which produces a specific signature in particular regions of cortex.

As a means of elucidating the category-selective visual computations and the viability of MIRCs as the representational basis of object recognition, we used fMRI with MIRCs and their poorly recognizable counterparts (sub-MIRCs; Ullman et al., 2016). If MIRCs do play a functional role in recognition, then we might expect that MIRCs will elicit higher activation compared with sub-MIRCs in high-level visual areas involved in recognition, but not in early visual cortex, which is less engaged in pattern recognition per se. We also tested whether the MIRC–sub-MIRC selectivity in high-level visual areas would be category-selective by examining, compared with other object classes, the advantage for face MIRCs over sub-MIRCs in FFA, for place MIRCs over sub-MIRCs in PPA and TOS, and for object MIRCs over sub-MIRCs in LOC (see Figure 1A for examples).

METHODS

Participants

Twenty-one participants (12 men) provided informed consent to participate in the experiment. All reported right-handed dominance and normal or corrected vision (by contact lenses). One participant was excluded from the study due to an unusual, incidental anatomical finding as per the recommendation of an expert neuroradiologist, and two other participants were excluded from the study due to excessive head movements in all of the experimental runs. Thus, we conducted the analysis based on 18 healthy participants (11 men), age range = 21–28 years (M = 24.44, SD = 1.85). One of the participants did not perform the localizer runs because he asked to stop the scan early, so he was included only in the behavioral and whole-brain analyses. In addition, runs that contained excessive head movements were excluded from the analyses (all participants had at least three of four experimental runs). The experiment was approved by the Helsinki committee of the Soroka University Medical Center, Beer-Sheva, Israel.

Stimuli and Apparatus

Stimuli included images of MIRCs, sub-MIRCs, and scrambled images. The image categories included objects, faces, and places. Creating the MIRC and sub-MIRC stimuli is a long and arduous process—about 14,000 participants were included in the initial MTurk online testing conducted by Ullman et al. (2016) to create object and eye MIRCs and sub-MIRCs. A subsample of this stimulus set was used in this study, and the other face and place stimuli were created by Ullman's lab using the same technique for the purpose of this study (see below for a detailed description of the stimulus set). Because faces and places constitute more homogeneous categories, compared with the object category, the resultant set of MIRCs and sub-MIRCs included an unequal number of stimuli across all categories. The face category contained four kinds of stimuli (MIRC or sub-MIRC stimuli derived from eyes, mouths, noses, or whole faces), the place category contained three kinds of stimuli (MIRC or sub-MIRC stimuli derived from houses, landscapes, or cityscapes), and the object category contained nine objects (MIRC or sub-MIRC stimuli derived from original images of an eagle, airplane, suit, glasses, bicycles, car, horse, ship, or fly). Scrambled images were generated from all categories of MIRC images by dividing each image into patches of 10 × 10 pixels that were randomly reshuffled to create unrecognizable stimuli (Lerner, Hendler, Ben-Bashat, Harel, & Malach, 2001).
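The scrambling step described above (10 × 10 pixel patches, randomly reshuffled) can be illustrated as follows. This is a minimal sketch rather than the original MATLAB code: the function name and the `seed` parameter are hypothetical, and it assumes image dimensions that are exact multiples of the patch size, which the text does not specify.

```python
import numpy as np


def scramble(image: np.ndarray, patch: int = 10, seed=None) -> np.ndarray:
    """Shuffle non-overlapping patch x patch tiles of an image into random positions."""
    h, w = image.shape[:2]
    if h % patch or w % patch:
        # Assumption of this sketch: no partial tiles at the image border.
        raise ValueError("image dimensions must be multiples of the patch size")
    rng = np.random.default_rng(seed)
    n_cols = w // patch
    # Collect the tiles in row-major order.
    tiles = [
        image[r:r + patch, c:c + patch]
        for r in range(0, h, patch)
        for c in range(0, w, patch)
    ]
    order = rng.permutation(len(tiles))
    out = np.empty_like(image)
    # Write tile number `src` into grid position number `dst`.
    for dst, src in enumerate(order):
        r, c = divmod(dst, n_cols)
        out[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch] = tiles[src]
    return out
```

Because the tiles are only moved, never altered, the scrambled image keeps the pixel-value histogram of the source image while its spatial configuration is destroyed.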

The MIRC and sub-MIRC stimuli included in the experiment did not differ in parameters comparing their physical image-level attributes as assessed by the Graycoprops MATLAB command, which measures contrast, correlation, energy, homogeneity, and the Entropy MATLAB command, which measures entropy. Similar analyses were used in a previous study for assessing physical image-level properties of visual stimuli (Freud, Culham, Plaut, & Behrmann, 2017). These parameters were compared across all MIRCs and sub-MIRCs using two-tailed independent samples t tests, and none of the comparisons was significant (p > .013, following Bonferroni correction for multiple comparisons), thus ruling out the possibility that neural or behavioral differential responses elicited by these two sets of stimuli could obviously be accounted for by basic, image-level properties. Comparing MIRCs to scrambled images and sub-MIRCs to scrambled images using the same analysis also did not reveal any significant differences across these sets of stimuli (p > .313 and p > .066 following Bonferroni correction for multiple comparisons for MIRCs and sub-MIRCs, respectively).
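As an illustration of one of the image-level measures mentioned above, the sketch below computes the Shannon entropy of an 8-bit grayscale image from its 256-bin intensity histogram. It assumes this histogram-based, base-2 definition, which is how the MATLAB entropy function is commonly described; it is a Python stand-in and not the analysis code used in the study, and the function name is hypothetical.

```python
import numpy as np


def image_entropy(gray: np.ndarray) -> float:
    """Shannon entropy (bits) of an 8-bit grayscale image's intensity histogram."""
    counts = np.bincount(gray.ravel(), minlength=256)
    p = counts[counts > 0] / gray.size  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())
```

A uniform image has an entropy of 0 bits, and an image split evenly between two gray levels has an entropy of 1 bit; per-image values of this kind are what a MIRC versus sub-MIRC t test would compare.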

In this study, some of the stimuli in each category were presented more than once to allow an equal number of presentations of each category and of each kind of stimulus (MIRC and sub-MIRC). All stimulus manipulations were done using MATLAB R2014b (The MathWorks, Inc., RRID: nlx_153890). Stimuli were presented on an LCD screen placed at the back of the scanner bore behind the participant's head (distance ∼140 cm). Stimuli were 5 × 5 cm in size with a visual angle of ∼1.0231° × 1.0231°, and they were presented in the center of the screen. Participants viewed the stimuli through a tilted mirror that was mounted on the head coil above the participants' eyes. To avoid priming effects, each participant saw either the MIRC or a counterpart sub-MIRC of a given object, never both. This limits the ability to compare directly between MIRCs and sub-MIRCs of the same specific stimulus but still allows a comparison between MIRCs and sub-MIRCs of the same category. Note that the full, original stimuli shown in Figure 1 that were used to generate the MIRCs and the sub-MIRCs were never presented in the experiment.
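The relation between stimulus size, viewing distance, and visual angle can be checked with a short calculation. The sketch below is illustrative only (the function name and the `full` flag are hypothetical). With the stated 5 cm stimulus and ∼140 cm distance, the half-angle form, atan((size/2)/distance), gives about 1.02°, close to the reported value, whereas the full-extent form, 2·atan((size/2)/distance), gives about 2.05°. Which convention was used for the reported figure is not stated in the text.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float, full: bool = True) -> float:
    """Visual angle of a centered stimulus; full extent by default, half-angle otherwise."""
    half = math.degrees(math.atan((size_cm / 2) / distance_cm))
    return 2 * half if full else half


half_angle = visual_angle_deg(5, 140, full=False)  # about 1.02 degrees
full_angle = visual_angle_deg(5, 140)              # about 2.05 degrees
```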

Scans were conducted on a 3T Philips Ingenia scanner equipped with a 32-channel head coil, located at the Soroka University Medical Center, Beer-Sheva, Israel. We used a gradient-echo echo-planar imaging sequence with parallel acquisition (SENSE: factor 2.8) to acquire fMRI BOLD contrast. Specific scanning parameters were as follows: whole-brain coverage 35 slices, transverse orientation, voxel resolution 2.61 mm × 2.61 mm, 3 mm thickness, no gap, repetition time = 2000 msec, echo time = 35 msec, flip angle = 90°, field of view = 256 × 256, and matrix size 96 × 96. High-resolution anatomical volumes were acquired with a T1-weighted 3-D pulse sequence (1 × 1 × 1 mm³, 170 slices). Note that the main focus of the study was to investigate the responses in occipitotemporal cortex; hence, the first criterion for slice prescription was to ensure full coverage of this region. Depending on participants' brain size, in some cases, this criterion forced us to exclude the most dorsal part of the brain.

Procedure

Participants completed an fMRI scanning session, which included a 3-D anatomical scan, four experimental runs, and two localizer runs. Each experimental run contained 42 blocks (five trials in each block), with six blocks for each subcategory (i.e., MIRCs and sub-MIRCs of faces, objects, places and six blocks for scrambled images). In each block, five stimuli from the same category and of the same type (all MIRCs or all sub-MIRCs of the same category) were presented for 1800 msec each, followed by a central fixation point presented for 200 msec (total duration of the trial was 2000 msec). A red fixation point on a black background was also presented for 6000 msec between the blocks. Participants were instructed to respond during each stimulus presentation and to indicate whether they recognized the stimulus or not by pressing designated keys on a response box. We used these responses as a proxy for recognition performance because we could not obtain verbal responses during scanning. We realize that these responses may not fully account for participants' subjective recognition level, but still the recognition accuracy we measured in this study replicates the general recognition pattern obtained by Ullman et al. (2016) when participants were asked to explicitly name the observed images. Experiments were programmed and presented to participants using E-Prime 2.0 software (Psychology Software Tools, RRID: SCR_009567).

Following the experimental runs, participants completed the localizer runs in which they performed a 1-back task, pressing a designated key on a response box if the stimulus presented was identical to the immediate previous stimulus (Avidan et al., 2014). Each localizer run contained 35 blocks (10 trials in each block), with seven blocks for each of the five categories (objects, famous faces, not famous faces, scrambled, houses). Each stimulus was presented for 800 msec, followed by a 200-msec fixation point (total duration of the trial was 1000 msec). Similar to the experimental runs, a red fixation point on a black background was also presented for 6000 msec between the blocks. In all runs (experimental and localizer), a 20-sec or 10-sec block, which included a blank and fixation point, was presented at the beginning and at the end of each run, respectively. Performance on the 1-back task revealed that all participants attended to the task (accuracy was in the range of 80–100%, M = 93.95%, SD = 5.039%).
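The block timings given above imply an approximate run length, which can be worked out as follows. This is a rough sketch under an assumption the text does not confirm: the 6000-msec fixation period is taken to occur only between consecutive blocks (one fewer gap than blocks). The function name is hypothetical.

```python
def run_duration_ms(n_blocks: int, trials_per_block: int, trial_ms: int,
                    gap_ms: int = 6000, lead_in_ms: int = 20000,
                    lead_out_ms: int = 10000) -> int:
    """Total run length in msec: lead-in + stimulus blocks + inter-block fixation + lead-out."""
    block_ms = trials_per_block * trial_ms
    gaps = n_blocks - 1  # assumption: fixation only between consecutive blocks
    return lead_in_ms + n_blocks * block_ms + gaps * gap_ms + lead_out_ms


experimental_ms = run_duration_ms(n_blocks=42, trials_per_block=5, trial_ms=2000)
localizer_ms = run_duration_ms(n_blocks=35, trials_per_block=10, trial_ms=1000)
```

Under that assumption, an experimental run lasts 696 sec (348 volumes at the 2000-msec repetition time) and a localizer run lasts 584 sec (292 volumes). If a fixation period also followed the final block, each total would be 6 sec longer.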

The analysis of the imaging data was conducted using BrainVoyager (Brain Innovation; RRID: nif-0000-00274, RRID: SCR_006660), and all statistical analyses were conducted using RStudio 0.99.903, JASP Team (2017, RRID: SCR_000432) and Statistica 12 (Statsoft.com, 2016, RRID: SCR_014213). We note that, because the stimuli used for the experiment are minimal and impoverished, we expected that the BOLD responses they would elicit in high-level visual areas would be weaker compared with typical BOLD responses. Consequently, the magnitude of the experimental effects was expected to be relatively small. Given these unique circumstances, our approach for choosing ROIs was to maximize the sensitivity and signal-to-noise ratio by optimizing the ROI selection for each participant individually. Specifically, we defined ROIs for each participant (individual-level ROIs, with the exception of the individual who did not complete the localizer task) using the ROI definition procedure successfully employed by Rosenthal, Sporns, and Avidan (2017). ROIs were defined using a cluster size threshold of ≥4 and using the following contrasts, for faces, places, and objects ROIs, respectively: face > house & object conditions, house > faces & object conditions, object > scrambled condition. Similar to previous studies (Weiner et al., 2017; Avidan et al., 2014; Baldassano, Beck, & Fei-Fei, 2013), given that the place-related activation around the PPA is widespread and extensive, the PPA was defined using the most anterior voxels, with a maximal cluster size of 1000 anatomical voxels. An ROI for early visual cortex was defined using the contrast of all conditions (faces, objects, houses and scrambled images > fixation) and was restricted anatomically within the vicinity of the calcarine sulcus. See Table 1 for ROI details.

Table 1. 
Mean Talairach Coordinates of the ROIs ± SD of the Mean across Participants, the Mean Cluster Size ± SD of Each ROI in mm³ and the Number of Participants Exhibiting Each ROI (N)
ROI's Talairach's Coordinates and Cluster Size

| ROI | Hemisphere | x | y | z | Cluster size | n | n′ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FFA | Right | 38.62 ± 3.36 | −48.79 ± 4.62 | −15.58 ± 3.45 | 86.16 ± 50.82 | 17 | |
| | Left | −38.96 ± 2.75 | −51.07 ± 5.27 | −14.82 ± 3.77 | 68.15 ± 32.26 | 17 | 17 |
| OFA | Right | 36.15 ± 2.47 | −75.45 ± 3.89 | −9.6 ± 3.46 | 30.9 ± 22.92 | 12 | |
| | Left | −38.62 ± 2.88 | −77.82 ± 4.77 | −11.64 ± 5.17 | 29.37 ± 21.11 | 10 | 14 |
| PPA | Right | 22.62 ± 3.68 | −35.23 ± 2.67 | −10.03 ± 2.64 | 22.18 ± 6.78 | 17 | |
| | Left | −21.66 ± 3.8 | −37.29 ± 3.07 | −8.75 ± 3.61 | 17.61 ± 8.27 | 17 | 17 |
| TOS | Right | 30.77 ± 3.06 | −79.77 ± 2.68 | 8.35 ± 3.05 | 43.33 ± 25.79 | 12 | |
| | Left | −33.49 ± 1.71 | −81.83 ± 2.39 | 9.49 ± 2.83 | 46.02 ± 22.26 | 11 | 15 |
| LOC | Right | 41.73 ± 3.35 | −71.38 ± 5.55 | −4.28 ± 4.18 | 58.81 ± 52.86 | 13 | |
| | Left | −43.65 ± 4.11 | −70.82 ± 3.08 | −2.84 ± 5.14 | 60.23 ± 48.39 | 14 | 14 |
| Early visual cortex | Right | 9.76 ± 3.26 | −92.97 ± 2.53 | −2.77 ± 5.88 | 31.29 ± 6.94 | 17 | |
| | Left | −7.12 ± 4.45 | −93.9 ± 3.03 | −4.52 ± 6.36 | 37.52 ± 9.46 | 16 | 17 |

Because not all participants had bilateral activation in all ROIs, the rightmost column (n′) indicates the total number of ROIs calculated across hemispheres. This is the value that was used for the statistical analyses for each ROI. Specifically, for a given ROI, we calculated the mean activation across the two hemispheres based on the signal from both left and right ROIs in participants who exhibited bilateral activation or based on the signal from only one hemisphere in participants who only exhibited unilateral activation in the ROI.
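The hemisphere-averaging rule in this note can be written as a small helper. This is a hypothetical sketch, not the authors' code: `None` stands for a hemisphere in which the ROI could not be defined for a given participant.

```python
from statistics import mean
from typing import Optional


def roi_signal(left: Optional[float], right: Optional[float]) -> Optional[float]:
    """Mean of both hemispheres when bilateral, the single value when unilateral."""
    values = [v for v in (left, right) if v is not None]
    # A participant with no ROI in either hemisphere contributes nothing (not counted in n′).
    return mean(values) if values else None
```

For example, a participant with beta weights of 0.4 (left) and 0.8 (right) contributes 0.6, whereas a participant with only a right-hemisphere ROI contributes that single value.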

RESULTS

Analyses of Recognition Performance

To confirm that we obtain the advantage for recognition of MIRCs over sub-MIRCs in this experiment and replicate the existing behavioral finding (Ullman et al., 2016), we examined the percent recognition performance for these two stimulus types for each of the three object classes during the fMRI scan. The 2 × 3 repeated-measures ANOVA with MIRC/sub-MIRC and Category (face, object, place) as within-subject factors, and Percent recognition as the dependent variable (Figure 2) revealed a significant main effect for MIRC/sub-MIRC, F(1, 17) = 165.22, p < .001, ηp2 = .907, and a significant MIRC/sub-MIRC × Category interaction, F(2, 34) = 35.57, p < .001, ηp2 = .677. There was not a significant main effect for Category, F(2, 34) = 0.67, p = .519, ηp2 = .038. To break down the interaction, we conducted three (one for each category) one-tailed paired-sample t tests, which revealed significantly higher recognition rate for MIRCs than for sub-MIRCs in the face category, t(17) = 11.71, p < .001, d = 2.76; object category, t(17) = 10.96, p < .001, d = 2.58; and place category, t(17) = 3.4, p = .002, d = 0.8. Furthermore, although the difference in recognition performance for MIRCs compared with sub-MIRCs is not significantly different for the face compared with the object category, F(1, 17) = 0.10, p = .751, the MIRC–sub-MIRC difference is significantly smaller for the place category compared with both the face, F(1, 17) = 58.99, p < .001, and the object, F(1, 17) = 45.48, p < .001, categories. Although it is not obvious why these class differences exist, for the purpose of the current study, these findings clearly support the hypothesized difference in performance for MIRCs versus sub-MIRCs and thereby permit an exploration of the neural correlates of this distinction. The recognition performance for scrambled images was 3.8% (M = 3.799%, SD = 7, SE = 1.65) and is indicated in Figure 2. 
This low recognition level indicates that, indeed, the scrambled stimuli were unrecognizable and hence could serve as a good control condition (Lerner et al., 2001).
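The effect sizes reported above can be recovered from the test statistics with standard formulas. The sketch below assumes, without confirmation from the text, that partial eta squared was obtained as F·df_effect / (F·df_effect + df_error) and that Cohen's d for the paired tests is t / √n; both conventions reproduce the reported values. The function names are hypothetical.

```python
import math


def partial_eta_squared(f: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return f * df_effect / (f * df_effect + df_error)


def cohens_d_paired(t: float, n: int) -> float:
    """Cohen's d for a paired-samples t test with n participants (d_z convention)."""
    return t / math.sqrt(n)


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean from the sample SD."""
    return sd / math.sqrt(n)


eta_main = partial_eta_squared(165.22, 1, 17)  # main effect of MIRC/sub-MIRC
d_faces = cohens_d_paired(11.71, 18)           # face category
se_scrambled = standard_error(7, 18)           # scrambled-image recognition
```

Rounded to the precision used in the text, these give ηp2 = .907, d = 2.76, and SE = 1.65, matching the values reported for the behavioral analysis.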

Figure 2. 

Recognition performance for MIRCs and sub-MIRCs of each category with mean percent recognition obtained during the fMRI scan. Error bars indicate SE across participants. Recognition performance of the scrambled images condition is also shown for comparison, n = 18.

Analyses of Neural Activations in High-order Visual Areas

Before examining whether MIRCs elicited higher activation compared with sub-MIRCs in high-level visual areas involved in recognition, we conducted the following initial analyses in each ROI (anatomical details of all ROIs are provided in Table 1). To examine hemispheric differences, we first conducted a 2 × 3 × 2 repeated-measures ANOVA with MIRC/sub-MIRC, Category, and Hemisphere as within-subject factors, and Mean beta weight as the dependent variable. This analysis yielded no meaningful differences between the two hemispheres: no significant hemisphere by MIRC/sub-MIRC interaction was found in any of the ROIs (p ≥ .074); hence, we averaged the data obtained from both hemispheres for all subsequent analyses.

Then, to examine the activation profile in the higher order visual areas for recognizable MIRCs versus poorly recognizable sub-MIRCs, we compared the activation for these stimuli to the activation of scrambled images, which are largely unrecognizable. To do so, in each ROI, we conducted two 1-tailed paired-samples t tests to examine whether the activation for MIRCs of the preferred category is stronger compared with scrambled images and whether the activation for sub-MIRCs of the preferred category is stronger compared with scrambled images. In each ROI, the general pattern was that the activation for both the MIRCs and the sub-MIRCs of the preferred category was significantly larger than the activation obtained for scrambled images (p ≤ .009). Note that, in OFA, the difference between sub-MIRCs faces and scrambled images showed a nonsignificant trend. These findings indicate that both MIRCs and sub-MIRCs of the preferred category in each ROI elicited a differential response compared with the unrecognizable scrambled images.

Following these initial analyses, we conducted a 2 × 3 repeated-measures ANOVA with MIRC/sub-MIRC and Category as within-subject factors in each ROI. Scrambled images were not included in these analyses, as this stimulus category only has a single level. Beta weights of this condition, however, are presented in all graphs in Figure 3 to enable their inspection.

Figure 3. 

Activation profile averaged across right and left (A) FFA (n = 17), (B) OFA (n = 14), (C) LOC (n = 14), (D) PPA (n = 17), and (E) TOS (n = 15) for MIRCs and sub-MIRCs of the three categories; error bars indicate SE across participants. Beta weights of the scrambled images condition are also shown for comparison.

Fusiform Face Area

A 2 × 3 repeated-measures ANOVA revealed a significant main effect for both MIRC/sub-MIRC, F(1, 16) = 40.43, p < .001, ηp2 = .716, and Category, F(2, 32) = 26.15, p < .001, ηp2 = .620, but a nonsignificant interaction effect, F(2, 32) = 2.25, p = .122, ηp2 = .123 (Figure 3A). Despite the nonsignificant two-way interaction, because we had an a priori hypothesis regarding MIRC–sub-MIRC differences in all categories, we continued to examine the simple effects and the contrasts. Contrast analyses revealed that the difference in activation for MIRCs compared with sub-MIRCs was not significantly different for the faces compared with objects and places, F(1, 16) = 0.42, p = .527. Three 1-tailed paired-samples t tests used to examine simple effects for MIRC/sub-MIRC revealed significantly higher activation for MIRCs than for sub-MIRCs in the face category, t(16) = 3.61, p = .001, d = 0.88; object category, t(16) = 3.63, p = .001, d = 0.88; and place category, t(16) = 6.33, p < .001, d = 1.54. Thus, across all three categories, MIRCs elicited higher FFA activation compared with sub-MIRCs. Analysis of the main effect for Category revealed significantly greater activation for faces than for objects and places, F(1, 16) = 15.14, p = .001.

Occipital Face Area

A 2 × 3 repeated-measures ANOVA revealed a significant main effect for both MIRC/sub-MIRC, F(1, 13) = 34.31, p < .001, ηp2 = .725, and Category, F(2, 26) = 15.98, p < .001, ηp2 = .551, but a nonsignificant interaction, F(2, 26) = 2.5, p = .102, ηp2 = .161 (Figure 3B). Despite the nonsignificant two-way interaction effect, as above, because we had a specific hypothesis regarding MIRC–sub-MIRC differences in all categories, we continued to examine the simple effects and the contrasts. Contrasts analyses showed that the difference in activation for MIRCs compared with sub-MIRCs was not significantly different for the faces compared with objects and places, F(1, 13) = 0.8, p = .386. Three 1-tailed paired-samples t tests examining simple effects for MIRC/sub-MIRC revealed significantly higher activation for MIRCs than for sub-MIRCs in the face category, t(13) = 3.47, p = .002, d = 0.93; object category, t(13) = 3.65, p = .001, d = 0.98; and place category, t(13) = 5.23, p < .001, d = 1.4. Thus, similar to FFA, across all three categories, MIRCs elicited higher OFA activation compared with sub-MIRCs. Analysis of the main effect for Category revealed higher activation for faces than for objects and places, F(1, 13) = 5.38, p = .037.

Lateral Occipital Cortex

A 2 × 3 repeated-measures ANOVA revealed a significant main effect for both MIRC/sub-MIRC, F(1, 13) = 59.82, p < .001, ηp2 = .821, and Category, F(2, 26) = 30.12, p < .001, ηp2 = .699, and a significant interaction effect, F(2, 26) = 6.28, p = .006, ηp2 = .326 (Figure 3C). Contrast analyses revealed that the difference in activation for MIRCs compared with sub-MIRCs was not significantly different for the objects compared with faces and places, F(1, 13) = 0.45, p = .515, but was significant for the faces compared with places and objects, F(1, 13) = 8.46, p = .012, and for places compared with objects and faces, F(1, 13) = 10.56, p = .006. These significant differences are due to the large MIRC–sub-MIRC difference in the place category. Three 1-tailed paired-samples t tests examining simple effects for MIRC/sub-MIRC revealed significantly higher activation for MIRCs than for sub-MIRCs in the face category, t(13) = 2.56, p = .012, d = 0.68; object category, t(13) = 3.76, p = .001, d = 1.01; and place category, t(13) = 7.97, p < .001, d = 2.13. Thus, across all three categories, MIRCs elicited higher LOC activation compared with sub-MIRCs. Analysis of the main effect for Category revealed higher activation for objects than for faces and places, F(1, 13) = 57.06, p < .001.

Parahippocampal Place Area

A 2 × 3 repeated-measures ANOVA revealed a significant main effect for Category, F(2, 32) = 82.92, p < .001, ηp2 = .838, and a significant interaction effect, F(2, 32) = 15.95, p < .001, ηp2 = .499, but a nonsignificant main effect for MIRC/sub-MIRC, F(1, 16) = 1.62, p = .221, ηp2 = .092 (Figure 3D). Contrast analyses revealed that the difference in activation for MIRCs compared with sub-MIRCs was significantly different (higher) for the places compared with faces and objects, F(1, 16) = 30.26, p < .001, and also differed significantly for the objects compared with faces and places, F(1, 16) = 5.75, p = .029, as well as for the faces compared with objects and places, F(1, 16) = 14.01, p = .002. These contrasts are significant because MIRCs and sub-MIRCs differed significantly for places, whereas no such difference was found for the object and face categories. Three 1-tailed paired-samples t tests used to examine simple effects for MIRC/sub-MIRC revealed significantly higher activation for MIRCs than for sub-MIRCs in the place category, t(16) = 5.28, p < .001, d = 1.28, but not in the face category, t(16) = −2.16, p = .977, d = −0.52, or in the object category, t(16) = −1.19, p = .874, d = −0.29. Thus, MIRCs elicited higher PPA activation compared with sub-MIRCs only for the place category, but not for the face and object categories. Analysis of the main effect for Category revealed higher activation for places than for faces and objects, F(1, 16) = 110.31, p < .001.

Transverse Occipital Sulcus

A 2 × 3 repeated-measures ANOVA revealed a significant main effect for both MIRC/sub-MIRC, F(1, 14) = 5.07, p = .041, ηp2 = .266, and Category, F(2, 28) = 26.65, p < .001, ηp2 = .656, and a significant interaction effect, F(2, 28) = 7.52, p = .002, ηp2 = .349 (Figure 3E). Contrast analyses revealed that the difference in activation for MIRCs compared with sub-MIRCs was significantly different (higher) for the places compared with faces and objects, F(1, 14) = 22.59, p < .001, but not for the objects compared with faces and places, F(1, 14) = 0.93, p = .351. The difference in activation for MIRCs compared with sub-MIRCs was again significantly different for the faces compared with objects and places, F(1, 14) = 8.3, p = .012, mirroring the results of the contrasts in the PPA. Again, this difference was due to the large difference between MIRC and sub-MIRC activation in the place category. Three 1-tailed paired-samples t tests used to examine simple effects for MIRC/sub-MIRC revealed significantly higher activation for MIRCs than for sub-MIRCs in the place category, t(14) = 4.6, p < .001, d = 1.19, but not in the face category, t(14) = −0.18, p = .572, d = −0.05, or in the object category, t(14) = 0.64, p = .267, d = 0.17. Thus, similarly to the PPA, MIRCs elicited higher TOS activation compared with sub-MIRCs only for the place category, but not for the face and object categories. Analysis of the main effect for Category revealed higher activation for places than for faces and objects, F(1, 14) = 5.55, p = .034.

Neural Activation in Early Visual Cortex

To examine whether MIRCs and sub-MIRCs (of each of the three categories) elicited similar activation in early visual cortex, we conducted a 2 × 3 repeated-measures ANOVA with MIRC/sub-MIRC and Category as within-subject factors and with Mean beta weight in early visual cortex (in the vicinity of the calcarine sulcus) as the dependent variable. This analysis revealed a significant main effect for MIRC/sub-MIRC, F(1, 16) = 5.25, p = .036, ηp2 = .247, but not for Category, F(2, 32) = 0.34, p = .716, ηp2 = .021, and a nonsignificant interaction effect, F(2, 32) = 0.87, p = .430, ηp2 = .051 (Figure 4). We conducted three (one for each category) two-tailed paired-samples t tests, which revealed that the activation in early visual cortex for MIRCs was significantly different (higher) than for sub-MIRCs for the object category, t(16) = 2.6, p = .019, d = 0.63, but not for the face, t(16) = 0.81, p = .431, d = 0.2, nor for the place, t(16) = 1.52, p = .147, d = 0.37, categories (Figure 4).

Figure 4. 

Activation profile averaged across right and left early visual cortex (in the vicinity of the calcarine sulcus) for MIRCs and sub-MIRCs of the three categories and for the scrambled images condition; error bars indicate SE across participants. n = 17.

If the activation in early visual cortex for MIRCs or sub-MIRCs is related to recognition, one might expect that the activation level for these stimuli (or at least for MIRCs) would be different from the activation level obtained for scrambled images, which are unrecognizable. To examine this or, more specifically, to rule out this possibility and thus provide a better understanding of the activation for MIRCs and sub-MIRCs in early visual cortex, we also conducted six two-tailed paired-samples t tests, examining differences in activation between MIRCs of each category versus scrambled images and between sub-MIRCs of each category and scrambled images. None of the tests yielded significant effects (p ≥ .16), demonstrating that MIRC and sub-MIRC activation does not differ from scrambled images. These findings further imply that the MIRC–sub-MIRC difference found in this region for objects does not indicate representation of high-level visual information and may instead reflect a response to low-level features such as edges. Finally, we note that the overall activity in early visual cortex was greater for scrambled images compared with sub-MIRCs. Given that sub-MIRCs were better recognized compared with scrambled images (Figure 2), this finding is compatible with findings of previous studies (Murray, Kersten, Olshausen, Schrater, & Woods, 2002), suggesting that higher order visual areas may exert top–down modulation on primary visual cortex that is a function of the extent of shape recognition associated with the stimuli. Specifically, because some shape information is still retained in the sub-MIRCs images, but not in the scrambled images, the activity for the former stimuli is modulated and hence reduced compared with the activity for the latter stimuli (and see also Lerner et al., 2001, for relevant findings).

Whole-brain Analysis

In addition to our hypothesis-driven ROI analyses, we also conducted a whole-brain analysis to examine whether there may be additional regions outside the visual cortex, which exhibit sensitivity to MIRCs compared with sub-MIRCs.

Whole-brain, random-effects general linear model analysis confirmed our ROI analysis and revealed that regions within the ventral occipito-temporal cortex showed higher activation for MIRCs compared with sub-MIRCs (Figure 5). In addition, higher activation for MIRCs compared with sub-MIRCs was found in the parietal cortex, specifically in the vicinity of the angular gyrus, and in the premotor cortex (Figure 5). It is possible that this higher activation in the parietal cortex is related to greater attentional effects elicited by the MIRC stimuli. It could also be attributed to object selectivity elicited by these stimuli, as this region has been shown to exhibit object-related activation (Freud et al., 2017).

Figure 5. 

Visual, parietal, and premotor activations, shown in yellow to orange colors, obtained by the contrast: MIRCs > sub-MIRCs, overlap with visual areas and also with an area within the parietal lobe (in the vicinity of angular gyrus) and in the premotor cortex. N = 18 (whole-brain, random effects analysis). Top row: sagittal (left hemisphere) and coronal slices. Bottom row: inferior (right) and superior (left) transverse slices. VOTC = ventral occipito temporal cortex.

Examining the MIRC/Sub-MIRC × Category interaction in the whole-brain analysis, we found a significant interaction effect, F(2, 34) = 22.37, p < .001, in the right PPA, similar to the region defined in the ROI analysis (coordinates for the region exhibiting the interaction effect are x = 25.5, y = −41.5, z = −8, cluster size = 5.28 mm3; see Table 1 for comparison; Rosenthal et al., 2017).

Contrast analysis conducted in this region revealed that the difference in activation for MIRCs compared with sub-MIRCs was not significantly different for the face compared with the object category, F(1, 17) = 0.01, p = .918. In contrast, the difference in MIRCs versus sub-MIRCs activation was significantly higher for the place category compared with both the face, F(1, 17) = 44.78, p < .001, and the object, F(1, 17) = 22.51, p < .001, categories. These findings provide independent confirmation of the ROI analysis, which revealed similar effects (see Figure 3D).

Correlations between Recognition Level and Neuronal Activation

Beyond showing selectivity for MIRCs versus sub-MIRCs in higher order visual regions, we also wanted to examine directly whether the signal in these regions is correlated with behavioral performance recorded during the experiment. Such a correlation would provide supportive evidence for the functional relevance of the MIRC stimuli. As in previous studies, for the neural activation difference we used a d′ index (Avidan et al., 2014; Grill-Spector, Sayres, & Ress, 2006) calculated as $d'(\text{selectivity}) = \dfrac{\mu_{\text{MIRC}} - \mu_{\text{subMIRC}}}{\sqrt{(\sigma^2_{\text{MIRC}} + \sigma^2_{\text{subMIRC}})/2}}$, where $\mu$ refers to the average beta weights per stimulus (MIRC or sub-MIRC) and $\sigma^2$ refers to SE across trials. We used the above formula for each of the three categories separately, such that a larger index reflects greater selectivity in a specific ROI for MIRCs compared with sub-MIRCs of a specific category. For each category, we examined the correlation (using Pearson's r, two-tailed tests) between the d′ associated with recognition performance and the d′ describing the neural activation of each ROI. For indexing the recognition difference between MIRCs and sub-MIRCs, we calculated d′ using MIRCs as the recognizable stimuli and sub-MIRCs as the unrecognizable stimuli. The exact definitions for hits, misses, false alarms, and correct rejections are shown in Figure 6. In the classic usage of signal detection theory (SDT), it is often the case that concrete, binary measures of recognition are used. The major advantage of using SDT for the correlation analysis and not standard recognition (as reported in Figure 2) is that SDT permits the separation of sensitivity (d′) and criterion. Disentangling these two measures is particularly critical in this study because the stimuli employed are impoverished. Note also that the behavioral d′ is more akin to the neural d′ measure we employ, compared with other behavioral measures of recognition.
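The two indices described above can be sketched as follows. This is an illustrative reading, not the authors' code. The neural index is taken as the MIRC minus sub-MIRC mean difference divided by the square root of the averaged variance terms. The behavioral index is taken as z(hit rate) minus z(false-alarm rate), where a "yes" response to a MIRC is assumed to count as a hit and a "yes" response to a sub-MIRC as a false alarm. The function names, the log-linear correction for extreme rates, and all numeric inputs are assumptions introduced for this example.

```python
import math
from statistics import NormalDist


def neural_dprime(mean_mirc: float, mean_sub: float,
                  var_mirc: float, var_sub: float) -> float:
    """Selectivity index: mean difference scaled by the root of the averaged variance terms."""
    pooled = (var_mirc + var_sub) / 2.0
    if pooled <= 0:
        raise ValueError("averaged variance must be positive")
    return (mean_mirc - mean_sub) / math.sqrt(pooled)


def behavioral_dprime(hits: int, misses: int,
                      false_alarms: int, correct_rejections: int) -> float:
    """SDT sensitivity: z(hit rate) - z(false-alarm rate)."""
    # Log-linear correction (an assumption, not stated in the text) keeps the
    # rates away from 0 and 1, where the z-transform is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical values for one participant, one category, and one ROI.
print(neural_dprime(mean_mirc=1.2, mean_sub=0.7, var_mirc=0.25, var_sub=0.25))
print(behavioral_dprime(hits=18, misses=2, false_alarms=5, correct_rejections=15))
```

Computing both indices per participant and per category yields the paired values that enter the Pearson correlations reported for each ROI.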

Figure 6. 

Definitions of hits, misses, FA, and CR. FA = false alarms; CR = correct rejections.

Recognition d′ for faces was significantly positively correlated with the fMRI activation d′ in the FFA (Figure 7A; r(15) = .7, p = .002), OFA (Figure 7B; r(12) = .54, p = .046), and LOC (Figure 7C; r(12) = .57, p = .033). That is, better recognition of MIRC faces was associated with better neural discrimination of this stimulus category. Examining the correlation in other ROIs revealed that the recognition d′ for faces was not significantly correlated with the neural activation d′ of PPA, r(15) = −.08, p = .764, or TOS, r(13) = .46, p = .087.

Figure 7. 

Pearson's correlation between recognition d′ for faces and neural activation d′ for faces in (A) FFA, (B) OFA, and (C) LOC.

There were no significant correlations between the recognition d′ for objects and the fMRI BOLD d′ in any region of cortex: FFA, r(15) = −.06, p = .810; OFA, r(12) = .34, p = .238; LOC, r(12) = −.45, p = .108; PPA, r(15) = −.26, p = .321; TOS, r(13) = −.33, p = .234.

There were also no significant correlations between the recognition d′ for places and the fMRI d′ in any region of cortex: FFA, r(15) = −.06, p = .825; OFA, r(12) = −.12, p = .694; LOC, r(12) = .4, p = .153; PPA, r(15) = −.004, p = .989; TOS, r(13) = .19, p = .488.
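For readers who wish to relate the reported correlation coefficients to their significance tests, a Pearson r with df = n − 2 degrees of freedom maps onto a t statistic through t = r√df / √(1 − r²). The sketch below is illustrative only; the helper name `t_from_pearson_r` is hypothetical, and the two-tailed p value would still need to be obtained from a t distribution with the same df.

```python
import math


def t_from_pearson_r(r: float, df: int) -> float:
    """Convert a Pearson correlation to its t statistic (df = n - 2)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if df < 1:
        raise ValueError("df must be at least 1")
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)


# Face-category correlations as quoted in the text: ROI -> (r, df).
face_correlations = {
    "FFA": (0.70, 15),
    "OFA": (0.54, 12),
    "LOC": (0.57, 12),
}

for roi, (r, df) in face_correlations.items():
    print(f"{roi}: t({df}) = {t_from_pearson_r(r, df):.2f}")
```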

DISCUSSION

The goal of this study was to examine the claim that visual object recognition is supported by a class of fragments, MIRCs, which serve as the minimal units for recognition. We obtained behavioral recognition data as well as fMRI BOLD activation profiles throughout the visual system elicited by MIRCs and sub-MIRCs from different categories including faces, places, objects, and scrambled objects.

MIRCs Support Object Recognition

Consistent with the findings from Ullman et al. (2016), recognition performance revealed that MIRCs were significantly better recognized compared with sub-MIRCs in each of the three categories (Figure 2). Having established this key result during an fMRI scan, we then explored the neural correlates of the behavioral advantage for MIRCs over sub-MIRCs.

High-level Visual Areas Exhibit Greater Activation for MIRCs Compared with Sub-MIRCs

All high-level visual ROIs exhibited higher activation for MIRCs compared with sub-MIRCs of their preferred category (Figure 3). This implies that, beyond the behavioral effect, this difference between MIRCs and sub-MIRCs is also reflected at the level of the neural response in high-order visual areas. These results are compatible with findings presented by Lerner et al. (2008), in which informative image fragments elicited greater activation in high-order visual areas as well as better behavioral classification performance compared with the “random” fragments.

Future studies could employ multivoxel pattern analysis, which is sensitive even to minimal changes across stimuli in visual cortex (Nestor, Plaut, & Behrmann, 2011), to reveal more information about the nature of the MIRCs representation and to examine which regions show sensitivity to the MIRCs' diagnostic value. In addition, in this study, we used only specific ROIs (as well as whole-brain analysis), but future studies that use retinotopic mapping will be able to reveal the nature of MIRC representation along the full hierarchy of visual cortex (Lerner, Hendler, & Malach, 2002; Lerner et al., 2001) and perhaps the functional connectivity between regions associated with the MIRC advantage.

Selectivity for MIRC versus Sub-MIRC Is Exhibited in High-order Visual Cortex for the Preferred and Nonpreferred Categories

Face (FFA and OFA; Figure 3A, B) and object (LOC; Figure 3C) ROIs showed higher activation for MIRCs compared with sub-MIRCs for their preferred categories, but also for their nonpreferred categories. These findings are in line with Lerner et al. (2008), who showed similar results for informative compared with random image fragments. Also of note, in face- and object-selective ROIs, the difference between MIRCs and sub-MIRCs of the preferred category was not significantly larger than the difference between MIRCs and sub-MIRCs of the nonpreferred categories.

A possible account for the differences between MIRCs and sub-MIRCs in nonpreferred categories may be related to the finding that areas within the ventral stream contain a heterogeneous neuronal population with a majority of neurons, which are selective for the preferred category of that ROI, but these are intermixed with neuronal populations that respond selectively to the nonpreferred categories. Hence, such ROIs evince responses not only to their preferred category but also to nonpreferred categories (Grill-Spector et al., 2006; Avidan, Hasson, Hendler, Zohary, & Malach, 2002; Haxby et al., 2001; Ishai, Ungerleider, Martin, Schouten, & Haxby, 1999). As noted above, further investigations with more fine-grained tools (e.g., adaptation) might shed further light on the response profile of different regions.

In the place-selective areas (PPA and TOS), MIRCs elicited significantly greater activation compared with sub-MIRCs only for the place category (Figure 3D, E). Importantly, this difference was significantly different from the (nonsignificant) difference between MIRCs and sub-MIRCs of the nonpreferred categories (faces and objects). These findings are in line with studies showing that these place-selective areas are selective mostly for places or scenes (Baldassano et al., 2013; Dilks, Julian, Paunov, & Kanwisher, 2013; Grill-Spector, Knouf, & Kanwisher, 2004; Ishai et al., 1999; Epstein & Kanwisher, 1998). Interestingly, the MIRC–sub-MIRC distinction was evident in PPA even though the overall level of activation (beta weights) for all categories was lower than that in other regions. This indicates that the failure to find MIRC–sub-MIRC differences in the other ROIs or for the other categories in place-selective regions is not because of insufficient sensitivity. Note also that the pattern of activation in TOS was similar to that of PPA, whereas the overall signal amplitude in the TOS was higher and comparable to the amplitude obtained in other regions, thus endorsing the point that the pattern obtained in TOS and PPA is not related to a lack of power. That the signals obtained for the place category are lower than those for faces and objects might be related to the foveal presentation of MIRCs/sub-MIRCs and the use of small stimuli. Previous studies have revealed that place-selective regions respond better to larger images that encompass the periphery of the visual field (Larson & Loschky, 2009; Levy, Hasson, Harel, & Malach, 2004). Hence, the presentation used in the current study might not be optimal for these place-selective regions, although we still observe category selectivity even under these presentation conditions.

Importantly, the main effects for category that we have observed also reveal the well-documented category selectivity in higher order visual regions (Grill-Spector & Weiner, 2014; Dilks et al., 2013; Grill-Spector et al., 2004, 2006; Avidan et al., 2002; Ishai et al., 1999; Epstein & Kanwisher, 1998). These consistent fMRI category effects were obtained with MIRC images, which are a tiny fraction of the original images from which they were generated, thus supporting their role in classification. These findings are compatible with the notion that, although high-level visual regions may include heterogeneous neuronal populations, they still contain a majority of voxels with category-selective neuronal populations (Grill-Spector et al., 2006; Avidan et al., 2002).

Activations in High-level Visual Regions Compared with Scrambled, Unrecognizable Images

Although MIRCs are minimal units and not complete images, in all ROIs, MIRCs of the preferred category elicited higher activation compared with meaningless scrambled images. Another intriguing finding is that in four of five ROIs used in the current study (all ROIs except OFA), sub-MIRCs of the preferred category also elicited higher activation compared with scrambled images. Note that this is compatible with the results of the computational model from Ullman et al. (2016), showing that at least some of the sub-MIRCs elicit above threshold recognition rate. This finding is interesting as it reveals that, although sub-MIRCs are poorly recognized, they still contain some information compared with unrecognizable scrambled images, and these results are consistent with previous findings reported by Lerner et al. (2008).

Activation in Early Visual Cortex

Given that the main difference between MIRCs and sub-MIRCs is related to their recognizability and that differences in low-level visual parameters could not account for the class distinction, we expected to find no difference in activation between these stimuli in early visual cortex. Indeed, for the face and place categories, there were no significant differences between the activation level elicited by MIRCs versus sub-MIRCs in early visual cortex (Figure 4). However, surprisingly, in the object category, the activation for MIRCs was significantly greater than the activation for sub-MIRCs in early visual cortex (Figure 4).

If the activation in early visual cortex for MIRCs or sub-MIRCs is indeed related to recognition, we would have expected that the activation level for these stimuli (or at least for MIRCs) would be different from the activation level obtained for scrambled images. To evaluate this, we compared the activation elicited by MIRCs and the activation elicited by sub-MIRCs of all categories to the activation elicited by scrambled images in these early visual areas. Importantly, the activation in early visual areas for both MIRCs and sub-MIRCs of all categories did not differ significantly from the activation for scrambled images. Thus, we provide supporting evidence showing that the activity in this region was overall similar to the activity elicited by unrecognizable, scrambled objects and hence not related to the representation of high-level visual information.

Whole-brain Analysis

The whole-brain analyses have corroborated our ROI hypothesis-driven findings as evident in the stronger activation for MIRCs compared with sub-MIRCs in higher order visual areas of the visual ventral stream (Figure 5).

In addition, the whole-brain analyses also revealed that MIRCs elicited higher activation than sub-MIRCs in additional areas within the parietal lobe (specifically, in the vicinity of the angular gyrus) and in the vicinity of the premotor cortex (Figure 5). Similar findings were previously obtained in a study about informative and random image fragments (Lerner et al., 2008). It is possible that the higher activation for MIRCs compared with sub-MIRCs in the vicinity of the angular gyrus is due to enhanced attention exerted toward the MIRCs compared with the sub-MIRC images (Singh-Curry & Husain, 2009; Chambers, Payne, Stokes, & Mattingley, 2004) and perhaps also due to semantic processing occurring mainly when observing the highly recognizable MIRC stimuli (Seghier, 2013; Binder, Desai, Graves, & Conant, 2009).

Interestingly, the whole-brain analysis also revealed a MIRC/Sub-MIRC × Category interaction in the vicinity of the PPA, and probing of this interaction revealed that it stemmed from the greater MIRC–sub-MIRC activation difference for places compared with objects and faces. These findings also support our ROI analysis, which revealed that the PPA region is selective mostly for places or scenes (see Figure 3D).

Correlations between Recognition Level and Neuronal Activation

We found that the activation differences in FFA, OFA, and LOC (Figure 7A, B, and C, respectively) were positively correlated with the behavioral recognition differences between MIRCs and sub-MIRCs of faces. These findings are in accordance with those of Grill-Spector et al. (2004), showing that neural activation in the FFA, as well as in other face-selective regions and in regions that respond to both objects and faces, such as LOC, is positively correlated with behavioral performance for face stimuli, but not for other stimuli such as objects and houses. In addition, the correlation we found matches our results showing that FFA, OFA, and LOC exhibited higher activation for MIRCs than for sub-MIRCs of faces. Although these findings reveal a correlation and not causation, they may still attest to the biological relevance of the MIRC stimuli.

A possible account for the significant BOLD–behavior correlation, which was evident only for the face category, may be related to the homogeneity and uniqueness of these stimuli compared with the object and place stimuli. In contrast to these latter stimuli, in a natural environment, faces and face parts usually appear together; moreover, the face parts may facilitate and prime face recognition and face-related neuronal activation (Bentin & Golland, 2002). Therefore, it is possible that viewing a MIRC of a full face or a MIRC of face parts both elicit strong activation in the neuronal populations associated with face recognition (in FFA, OFA, and to some extent in LOC), and this activity further facilitates the recognition of the following face MIRC stimuli. These results are also consistent with the notion of a possible topological organization (faciotopy) of face representation in these regions as suggested, for example, by Henriksson, Mur, and Kriegeskorte (2015).

Conclusions and Significance of This Study

This study is the first to use fMRI to investigate the processing of MIRC images from different visual categories in human observers. The recognition performance supports the notion that MIRCs are recognizable whereas sub-MIRCs are hardly recognized (Ullman et al., 2016). The behavioral results received additional support from the neural findings, and together, these findings imply that the MIRCs, but not the sub-MIRCs, contain some local informative features, to which the human visual system is sensitive and which may be necessary for recognition (Ullman et al., 2016). It could be that this local information elicits top–down processing, which may guide and assist the recognition of MIRCs as described by Ullman et al. (2016). Moreover, it is possible that MIRCs, but not sub-MIRCs, allow the initiation of a top–down process and hence the creation of such relevant predictions. Indeed, previous studies imply that the activation in high-order visual regions may be modulated by stimulus-specific predictions or expectations (Freud, Ganel, & Avidan, 2015). Moreover, some studies using magnetoencephalography suggest that such predictions are generated in regions such as orbitofrontal cortex and are based on a coarse, low spatial resolution representation of the visual stimuli (Bar et al., 2006). We note, however, that the coarse temporal resolution of fMRI does not permit us to assess these specific effects in this study. Together, these findings raise the possibility that MIRCs serve as minimal units (Ullman et al., 2016) that activate higher-order visual regions and enable recognition. Better understanding of the human visual system would allow us, in turn, to make further progress on deriving a more complete theory of biological object recognition and to improve computational models (Ullman et al., 2016) and perhaps will have implications for neuropsychological patients suffering from specific object recognition deficits (Farah, 2004).

Acknowledgments

This work was supported by Israel Science Foundation (ISF) grant 296/15 to G. A., NIH grant EY026701 to M. B. (PI: J. Snow) and a BSF-NSF grant 2016731 to S. U.

Reprint requests should be sent to Galia Avidan, Ben-Gurion, Psychology, P.O. Box 653, Beer-Sheva 8410501, Israel, or via e-mail: galiaa@post.bgu.ac.il.

REFERENCES

Avidan
,
G.
,
Hasson
,
U.
,
Hendler
,
T.
,
Zohary
,
E.
, &
Malach
,
R.
(
2002
).
Analysis of the neuronal selectivity underlying low fMRI signals
.
Current Biology
,
12
,
964
972
.
Avidan
,
G.
,
Tanzer
,
M.
,
Hadj-Bouziane
,
F.
,
Liu
,
N.
,
Ungerleider
,
L. G.
, &
Behrmann
,
M.
(
2014
).
Selective dissociation between core and extended regions of the face processing network in congenital prosopagnosia
.
Cerebral Cortex
,
24
,
1565
1578
.
Baldassano
,
C.
,
Beck
,
D. M.
, &
Fei-Fei
,
L.
(
2013
).
Differential connectivity within the parahippocampal place area
.
Neuroimage
,
75
,
228
237
.
Bar
,
M.
,
Kassam
,
K. S.
,
Ghuman
,
A. S.
,
Boshyan
,
J.
,
Schmid
,
A. M.
,
Dale
,
A. M.
, et al
(
2006
).
Top–down facilitation of visual recognition
.
Proceedings of the National Academy of Sciences, U.S.A.
,
103
,
449
454
.
Bentin
,
S.
, &
Golland
,
Y.
(
2002
).
Meaningful processing of meaningless stimuli: The influence of perceptual experience on early visual processing of faces
.
Cognition
,
86
,
B1
B14
.
Binder
,
J. R.
,
Desai
,
R. H.
,
Graves
,
W. W.
, &
Conant
,
L. L.
(
2009
).
Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies
.
Cerebral Cortex
,
19
,
2767
2796
.
Chambers
,
C. D.
,
Payne
,
J. M.
,
Stokes
,
M. G.
, &
Mattingley
,
J. B.
(
2004
).
Fast and slow parietal pathways mediate spatial attention
.
Nature Neuroscience
,
7
,
217
218
.
Dilks
,
D. D.
,
Julian
,
J. B.
,
Paunov
,
A. M.
, &
Kanwisher
,
N.
(
2013
).
The occipital place area is causally and selectively involved in scene perception
.
Journal of Neuroscience
,
33
,
1331
1336
.
Epstein
,
R.
, &
Kanwisher
,
N.
(
1998
).
A cortical representation of the local visual environment
.
Nature
,
392
,
598
601
.
Farah
,
M. J.
(
2004
).
Visual agnosia
.
Cambridge, MA
:
MIT Press
.
Freud
,
E.
,
Culham
,
J. C.
,
Plaut
,
D. C.
, &
Behrmann
,
M.
(
2017
).
The large-scale organization of shape processing in the ventral and dorsal pathways
.
Elife
,
6
,
e27576
.
Freud
,
E.
,
Ganel
,
T.
, &
Avidan
,
G.
(
2015
).
Impossible expectations: fMRI adaptation in the lateral occipital complex (LOC) is modulated by the statistical regularities of 3D structural information
.
Neuroimage
,
122
,
188
194
.
Gauthier
,
I.
, &
Tarr
,
M. J.
(
2016
).
Visual object recognition: Do we (finally) know more now than we did?
Annual Review of Vision Science
,
2
,
377
396
.
Grill-Spector
,
K.
,
Knouf
,
N.
, &
Kanwisher
,
N.
(
2004
).
The fusiform face area subserves face perception, not generic within-category identification
.
Nature Neuroscience
,
7
,
555
562
.
Grill-Spector
,
K.
,
Kushnir
,
T.
,
Edelman
,
S.
,
Avidan
,
G.
,
Itzchak
,
Y.
, &
Malach
,
R.
(
1999
).
Differential processing of objects under various viewing conditions in the human lateral occipital complex
.
Neuron
,
24
,
187
203
.
Grill-Spector
,
K.
,
Kushnir
,
T.
,
Edelman
,
S.
,
Itzchak
,
Y.
, &
Malach
,
R.
(
1998
).
Cue-invariant activation in object-related areas of the human occipital lobe
.
Neuron
,
21
,
191
202
.
Grill-Spector
,
K.
,
Kushnir
,
T.
,
Hendler
,
T.
, &
Malach
,
R.
(
2000
).
The dynamics of object-selective activation correlate with recognition performance in humans
.
Nature Neuroscience
,
3
,
837
843
.
Grill-Spector, K., Sayres, R., & Ress, D. (2006). High-resolution imaging reveals highly selective nonface clusters in the fusiform face area. Nature Neuroscience, 9, 1177–1185.
Grill-Spector, K., & Weiner, K. S. (2014). The functional architecture of the ventral temporal cortex and its role in categorization. Nature Reviews Neuroscience, 15, 536–548.
Gross, C. G. (2002). Genealogy of the “grandmother cell”. Neuroscientist, 8, 512–518.
Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293, 2425–2430.
Henriksson, L., Mur, M., & Kriegeskorte, N. (2015). Faciotopy—A face-feature map with face-like topology in the human occipital face area. Cortex, 72, 156–167.
Ishai, A., Ungerleider, L. G., Martin, A., Schouten, J. L., & Haxby, J. V. (1999). Distributed representation of objects in the human ventral visual pathway. Proceedings of the National Academy of Sciences, U.S.A., 96, 9379–9384.
Ito, M., Tamura, H., Fujita, I., & Tanaka, K. (1995). Size and position invariance of neuronal responses in monkey inferotemporal cortex. Journal of Neurophysiology, 73, 218–226.
James, T. W., Culham, J., Humphrey, G. K., Milner, A. D., & Goodale, M. A. (2003). Ventral occipital lesions impair object recognition but not object directed grasping: An fMRI study. Brain, 126, 2463–2475.
Kanwisher, N., Chun, M. M., McDermott, J., & Ledden, P. (1996). Functional imaging of human visual recognition. Cognitive Brain Research, 5, 55–67.
Kourtzi, Z., & Kanwisher, N. (2000). Cortical regions involved in perceiving object shape. Journal of Neuroscience, 20, 3310–3318.
Kravitz, D. J., Saleem, K. S., Baker, C. I., Ungerleider, L. G., & Mishkin, M. (2013). The ventral visual pathway: An expanded neural framework for the processing of object quality. Trends in Cognitive Sciences, 17, 26–49.
Larson, A. M., & Loschky, L. C. (2009). The contributions of central versus peripheral vision to scene gist recognition. Journal of Vision, 9, 1–16.
Lerner, Y., Epshtein, B., Ullman, S., & Malach, R. (2008). Class information predicts activation by object fragments in human object areas. Journal of Cognitive Neuroscience, 20, 1189–1206.
Lerner, Y., Hendler, T., Ben-Bashat, D., Harel, M., & Malach, R. (2001). A hierarchical axis of object processing stages in the human visual cortex. Cerebral Cortex, 11, 287–297.
Lerner, Y., Hendler, T., & Malach, R. (2002). Object-completion effects in the human lateral occipital complex. Cerebral Cortex, 12, 163–177.
Levy, I., Hasson, U., Harel, M., & Malach, R. (2004). Functional analysis of the periphery effect in human building related areas. Human Brain Mapping, 22, 15–26.
Malach, R., Reppas, J. B., Benson, R. R., Kwong, K. K., Jiang, H., Kennedy, W. A., et al. (1995). Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. Proceedings of the National Academy of Sciences, U.S.A., 92, 8135–8139.
Murray, S. O., Kersten, D., Olshausen, B. A., Schrater, P., & Woods, D. L. (2002). Shape perception reduces activity in human primary visual cortex. Proceedings of the National Academy of Sciences, U.S.A., 99, 15164–15169.
Nestor, A., Plaut, D. C., & Behrmann, M. (2011). Unraveling the distributed neural code of facial identity through spatiotemporal pattern analysis. Proceedings of the National Academy of Sciences, U.S.A., 108, 9998–10003.
Peissig, J. J., & Tarr, M. J. (2007). Visual object recognition: Do we know more now than we did 20 years ago? Annual Review of Psychology, 58, 75–96.
Rajalingham, R., Schmidt, K., & DiCarlo, J. J. (2015). Comparison of object recognition behavior in human and monkey. Journal of Neuroscience, 35, 12127–12136.
Rosenthal, G., Sporns, O., & Avidan, G. (2017). Stimulus dependent dynamic reorganization of the human face processing network. Cerebral Cortex, 27, 4823–4834.
Seghier, M. L. (2013). The angular gyrus: Multiple functions and multiple subdivisions. Neuroscientist, 19, 43–61.
Singh-Curry, V., & Husain, M. (2009). The functional role of the inferior parietal lobe in the dorsal and ventral stream dichotomy. Neuropsychologia, 47, 1434–1448.
Ullman, S., Assif, L., Fetaya, E., & Harari, D. (2016). Atoms of recognition in human and computer vision. Proceedings of the National Academy of Sciences, U.S.A., 113, 2744–2749.
Ullman, S., & Sali, E. (2000). Object classification using a fragment-based representation. In S. W. Lee, H. H. Bülthoff, & T. Poggio (Eds.), Biologically motivated computer vision. BMCV 2000. Lecture notes in computer science (Vol. 1811, pp. 73–87). Berlin, Heidelberg: Springer.
Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Nature Neuroscience, 5, 682–687.
Ungerleider, L. G., & Bell, A. H. (2011). Uncovering the visual “alphabet”: Advances in our understanding of object perception. Vision Research, 51, 782–799.
Vanni, S., Revonsuo, A., Saarinen, J., & Hari, R. (1996). Visual awareness of objects correlates with activity of right occipital cortex. NeuroReport, 8, 183–186.
Weiner, K. S., Barnett, M. A., Lorenz, S., Caspers, J., Stigliani, A., Amunts, K., et al. (2017). The cytoarchitecture of domain-specific regions in human high-level visual cortex. Cerebral Cortex, 27, 146–161.