Speakers plan the phonological content of their utterances before releasing them as speech motor acts. Using a finite alphabet of learned phonemes and a relatively small number of syllable structures, speakers can rapidly plan and produce arbitrary syllable sequences that fall within the rules of their language. Competitive queuing models, a class of computational models of sequence planning and performance, have followed K. S. Lashley [The problem of serial order in behavior. In L. A. Jeffress (Ed.), Cerebral mechanisms in behavior (pp. 112–136). New York: Wiley, 1951] in assuming that inherently parallel neural representations underlie serial action, and this idea is increasingly supported by experimental evidence. In this article, we develop a neural model that extends the existing DIVA model of speech production in two complementary ways. The new model includes paired structure and content subsystems [cf. MacNeilage, P. F. The frame/content theory of evolution of speech production. Behavioral and Brain Sciences, 21, 499–511, 1998] that provide parallel representations of a forthcoming speech plan, as well as mechanisms for interfacing these phonological planning representations with learned sensorimotor programs to enable stepping through multisyllabic speech plans. On the basis of previous reports, the model's components are hypothesized to be localized to specific cortical and subcortical structures, including the left inferior frontal sulcus, the medial premotor cortex, the basal ganglia, and the thalamus. The new model, called gradient order DIVA, thus fills a void in current speech research by providing formal mechanistic hypotheses about both phonological and phonetic processes that are grounded in neuroanatomy and physiology. This framework also generates predictions that can be tested in future neuroimaging and clinical case studies.
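
The competitive queuing idea referred to above, in which a parallel activation gradient over planned items is read out serially by repeatedly choosing and then suppressing the most active item, can be illustrated with a minimal sketch. This is a generic toy illustration and not the model developed in this article: the function name `competitive_queue`, the `threshold` parameter, and the example syllables and activation values are all hypothetical.

```python
def competitive_queue(gradient, threshold=0.0):
    """Read out a serial order from a parallel activation gradient.

    gradient: dict mapping each planned item to its activation level;
              items intended earlier carry higher activation.
    threshold: activation at or below which an item no longer competes
               (a hypothetical parameter for this sketch).
    """
    activations = dict(gradient)  # copy so the caller's plan is untouched
    produced = []
    # Keep choosing while at least one item is still active.
    while any(a > threshold for a in activations.values()):
        # Competitive choice: the most active item wins.
        winner = max(activations, key=activations.get)
        produced.append(winner)
        # Suppress the winner so the next most active item can win.
        activations[winner] = threshold
    return produced


# Hypothetical three-syllable plan, all items active at the same time.
plan = {"go": 0.9, "di": 0.6, "va": 0.3}
print(competitive_queue(plan))
```

In this sketch, order is carried only by the relative activation levels of items that are all represented simultaneously, which is the sense in which a parallel representation can underlie serial action.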