To estimate the size of an indoor space, we must analyze the visual boundaries that limit its spatial extent and the acoustic cues reflected from interior surfaces. We used fMRI to examine how the brain processes the geometric size of indoor scenes when various types of sensory cues are presented individually or together. Specifically, we asked whether the size of space is represented in a modality-specific way or in an integrative way that combines multimodal cues. In a block-design study, images or sounds that depict small- and large-sized indoor spaces were presented. Visual stimuli were real-world pictures of empty spaces that were small or large. Auditory stimuli were sounds convolved with different reverberations. Using a multivoxel pattern classifier, we asked whether the two sizes of space can be classified in visual, auditory, and visual–auditory combined conditions. We identified both sensory-specific and multimodal representations of the size of space. To further investigate the nature of the multimodal region, we examined whether it contained multimodal information in a coexistent or integrated form. We found that the angular gyrus and the right medial frontal gyrus had modality-integrated representations, displaying sensitivity to the match in the spatial size information conveyed through image and sound. Background functional connectivity analysis further demonstrated that the connection between sensory-specific regions and modality-integrated regions increases in the multimodal condition compared with single-modality conditions. Our results suggest that spatial size perception relies on both sensory-specific and multimodal representations, as well as their interplay during multimodal perception.
