A variety of data indicate that the cerebellum participates in perceptual tasks requiring the precise representation of temporal information. Access to the word form of a lexical item requires, among other functions, the processing of durational parameters of verbal utterances. Therefore, cerebellar dysfunctions must be expected to impair word recognition. In order to specify the topography of the assumed cerebellar speech perception mechanism, a functional magnetic resonance imaging study was performed using the German lexical items “Boden” ([bodn], Engl. “floor”) and “Boten” ([botn], “messengers”) as test materials. The contrast in sound structure of these two lexical items can be signaled either by the length of the word-medial pause (closure time, CLT; an exclusively temporal measure) or by the aspiration noise of word-medial “d” or “t” (voice onset time, VOT; an intrasegmental cue). A previous study found that bilateral cerebellar disorders compromise word recognition based on CLT, whereas the encoding of VOT remained unimpaired. In the present study, two series of “Boden/Boten” utterances were resynthesized, systematically varying in either CLT or VOT. Subjects had to identify the words “Boden” and “Boten” by analysis of either the durational parameter CLT or the VOT aspiration segment. In a subtraction design, CLT categorization as compared to VOT identification (CLT − VOT) yielded a significant hemodynamic response of the right cerebellar hemisphere (neocerebellum, Crus I) and the frontal lobe (anterior to Broca's area). The reversed contrast (VOT − CLT) resulted in a single activation cluster located at the level of the supratemporal plane of the dominant hemisphere. These findings provide the first evidence for a distinct contribution of the right cerebellar hemisphere to speech perception in terms of encoding of durational parameters of verbal utterances.
Verbal working memory tasks, lexical response selection, and auditory imagery of word strings have been reported to elicit activation clusters at a similar location. Conceivably, representation of the temporal structure of speech sound sequences constitutes the common denominator of cerebellar participation in cognitive tasks acting on a phonetic code.