In the absence of sensory information, we can generate meaningful images and sounds from representations in memory. However, it remains unclear which neural systems underpin this process and whether tasks requiring the top–down generation of different kinds of features recruit similar or different neural networks. We asked people to internally generate the visual and auditory features of objects, either in isolation (car, dog) or in specific and complex meaning-based contexts (car/dog race). Using an fMRI decoding approach, in conjunction with functional connectivity analysis, we examined the role of auditory/visual cortex and transmodal brain regions. Conceptual retrieval in the absence of external input recruited sensory and transmodal cortex. The response in transmodal regions, including anterior middle temporal gyrus, was of equal magnitude for visual and auditory features, yet captured modality information in the pattern of response across voxels. In contrast, sensory regions showed greater activation for modality-relevant features in imagination, even when external inputs did not differ. These data are consistent with the view that transmodal regions support internally generated experiences and play a role in integrating perceptual features encoded in memory.