Although many neuroimaging studies have considered verbal and visual short-term memory (STM) as relying on neurally segregated short-term buffer systems, the present study explored the existence of shared neural correlates supporting verbal and visual STM. We hypothesized that networks involved in attentional and executive processes, as well as networks involved in serial order processing, underlie STM for both verbal and visual list information, with neural specificity restricted to sensory areas involved in processing the specific items to be retained. Participants were presented with sequences of nonwords or unfamiliar faces and were instructed to maintain and recognize either order or item information. For both encoding and retrieval phases, null conjunction analysis revealed an identical fronto-parieto-cerebellar network comprising the left intraparietal sulcus, bilateral dorsolateral prefrontal cortex, and bilateral cerebellum, irrespective of information type and modality. A network centered on the right intraparietal sulcus supported STM for order information in both verbal and visual modalities. Modality-specific effects were observed in left superior temporal and mid-fusiform areas associated with phonological and orthographic processing during the verbal STM tasks, and in right hippocampal and fusiform face processing areas during the visual STM tasks; these modality effects were most pronounced when item information was being stored. The present results suggest that STM emerges from the deployment of modality-independent attentional and serial ordering processes toward sensory networks underlying the processing and storage of modality-specific item information.