Theories of visual recognition postulate that our ability to understand our visual environment at a glance is based on the extraction of the gist of the visual scene, a first global and rudimentary visual representation. Gist perception would be based on the rapid analysis of low spatial frequencies in the visual signal and would allow a coarse categorization of the scene. We aimed to study whether the low spatial resolution information available in peripheral vision could modulate the processing of visual information presented in central vision. We combined behavioral measures (Experiments 1 and 2) and fMRI measures (Experiment 2). Participants categorized a scene presented in central vision (artificial vs. natural categories) while ignoring another scene, either semantically congruent or incongruent, presented in peripheral vision. The two scenes could either share the same physical properties (similar amplitude spectrum and spatial configuration) or not. Categorization of the central scene was impaired by a semantically incongruent peripheral scene, in particular when the two scenes were physically similar. This semantic interference effect was associated with increased activation of the inferior frontal gyrus. When the two scenes were semantically congruent, the dissimilarity of their physical properties impaired the categorization of the central scene. This effect was associated with increased activation in occipito-temporal areas. In line with the hypothesis of predictive mechanisms involved in visual recognition, results suggest that semantic and physical properties of the information coming from peripheral vision would be automatically used to generate predictions that guide the processing of signal in central vision.