Several previous functional imaging experiments have demonstrated that auditory presentation of speech, relative to tones or scrambled speech, activates the superior temporal sulci (STS) bilaterally. In this study, we attempted to segregate the neural responses to phonological, lexical, and semantic input by contrasting activation elicited by heard words, meaningless syllables, and environmental sounds. Inevitable differences in the duration and amplitude of each stimulus type were controlled with auditory noise bursts matched to each activation stimulus. Half the subjects were instructed to say “okay” in response to presentation of all stimuli. The other half repeated back the words and syllables, named the source of the sounds, and said “okay” to the control stimuli (noise bursts). We looked for stimulus effects that were consistent across tasks. The results revealed that central regions in the STS were equally responsive to speech (words and syllables) and familiar sounds, whereas the posterior and anterior regions of the left superior temporal gyrus were more active for speech. The effect of semantic input was small but revealed more activation in the inferior temporal cortex for words and familiar sounds than for syllables and noise. In addition, words (relative to syllables, sounds, and noise) enhanced activation in temporo-parietal areas that have previously been linked to modality-independent semantic processing. Thus, in cognitive terms, we dissociate phonological (speech) and semantic responses and propose that word specificity arises from functional integration among shared phonological and semantic areas.