In primates, different vocalizations are produced, at least in part, by making different facial expressions. Not surprisingly, humans, apes, and monkeys all recognize the correspondence between vocalizations and the facial postures associated with them. However, one major dissimilarity between monkey vocalizations and human speech is that, in the latter, the acoustic output and associated movements of the mouth are both rhythmic (in the 3- to 8-Hz range) and tightly correlated, whereas monkey vocalizations have a similar acoustic rhythmicity but lack the concommitant rhythmic facial motion. This raises the question of how we evolved from a presumptive ancestral acoustic-only vocal rhythm to the one that is audiovisual with improved perceptual sensitivity. According to one hypothesis, this bisensory speech rhythm evolved through the rhythmic facial expressions of ancestral primates. If this hypothesis has any validity, we expect that the extant nonhuman primates produce at least some facial expressions with a speech-like rhythm in the 3- to 8-Hz frequency range. Lip smacking, an affiliative signal observed in many genera of primates, satisfies this criterion. We review a series of studies using developmental, x-ray cineradiographic, EMG, and perceptual approaches with macaque monkeys producing lip smacks to further investigate this hypothesis. We then explore its putative neural basis and remark on important differences between lip smacking and speech production. Overall, the data support the hypothesis that lip smacking may have been an ancestral expression that was linked to vocal output to produce the original rhythmic audiovisual speech-like utterances in the human lineage.