Spoken language is central to human communication, influencing cognition, learning, and social interaction. Although spontaneous speech is characterized by disfluencies, fillers, self-corrections, and irregular syntax, it effectively serves its communicative purpose. Understanding how the brain processes natural language offers valuable insights into the neurobiology of language. Recent advances in neuroscience allow us to study neural responses to ongoing speech, which requires detailed, time-locked descriptions of the speech material that capture the nuances of spoken language. While many speech-to-text tools are available, obtaining a time-locked, true verbatim transcript that reflects everything that was uttered requires additional effort. Our work outlines a semi-automatic pipeline for annotating natural speech, developed for German and Hebrew but adaptable to other languages. The pipeline creates temporally precise time courses describing key linguistic features of continuous speech, which can be used to analyze their neural representation and level of processing. We discuss the methodological challenges and opportunities this presents for improving our understanding of how the brain processes everyday language.

Author notes

* Shared first author

** Shared senior author

Handling Editor: Kate Watkins

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.
