Auditory communication signals such as monkey calls are complex frequency-modulated (FM) vocal sounds that, in general, evoke action potentials with different timing in the primary auditory cortex. A delay-line scheme is one effective way to detect such neuronal timing. However, the scheme is not straightforwardly applicable when the time intervals of the signals exceed the latency of the delay lines. In fact, monkey calls often span longer time intervals (hundreds of milliseconds to seconds), exceeding the latency times observed in the brain (less than several hundred milliseconds). Here, we propose a cochleotopic map, analogous to the retinotopic map in vision. We show that information about monkey calls could be mapped onto a cochleotopic cortical network as spatiotemporal firing patterns of neurons, which can then be decomposed into simple (linearly sweeping) FM components and integrated into unified percepts by higher cortical networks. We suggest that the spatiotemporal conversion of auditory information may be essential for developing the cochleotopic map, which could serve as the foundation for later processing, such as monkey call identification, by higher cortical areas.