Whether nonhuman primates can decouple their innate vocalizations from the accompanying levels of arousal or from specific events in the environment, and thereby achieve cognitive control over their vocal utterances, has been a matter of debate for decades. We show that rhesus monkeys can be trained to produce different call types on command in response to arbitrary visual cues. Furthermore, we report that a monkey learned to switch between two distinct call types from trial to trial in response to different visual cues. A controlled behavioral protocol and data analysis based on signal detection theory allowed us to exclude noncognitive factors as a cause of the monkeys' vocalizations. Our findings also suggest that monkeys have rudimentary control over acoustic call parameters. These findings indicate that monkeys are able to volitionally initiate their vocal production and, therefore, are able to instrumentalize their vocal behavior to perform a behavioral task successfully.