Our brains “time-stamp” sounds to process the words we hear
Summary: The brain processes speech by using a buffer, keeping a “time stamp” of the past three speech sounds. The findings also reveal that the brain processes multiple sounds simultaneously without confusing each sound’s identity by passing information between neurons in the auditory cortex.
Source: NYU
Our brains “time-stamp” the sequence of incoming sounds, helping us correctly process the words we hear, according to a new study by a team of researchers in the fields of psychology and linguistics.
The findings, which appear in the journal Nature Communications, provide new insights into the intricacies of neurological function.
“To understand speech, your brain must accurately interpret both the identity of the speech sounds and the order in which they were spoken in order to correctly recognize the words being said,” explains Laura Gwilliams, the paper’s lead author, an NYU doctoral candidate at the time of the research and now a postdoctoral fellow at the University of California, San Francisco.
“We’re showing how the brain achieves this feat: different sounds elicit responses in different neural populations, and each sound is time-stamped with how much time has passed since it reached the ear. This allows the listener to know both the order and the identity of the sounds someone is saying and to correctly figure out what words the person is saying.”
While the brain’s role in processing individual sounds has been well studied, much less is known about how the brain handles the rapid auditory sequences that make up speech. A better understanding of these brain dynamics could eventually help address neurological conditions that impair our ability to understand the spoken word.
In the Nature Communications study, the scientists sought to understand how the brain processes the identity and sequence of speech sounds, given how quickly they unfold. This matters because your brain must accurately interpret both the identity of the speech sounds (e.g., l-e-m-o-n) and the order in which they are spoken (e.g., 1-2-3-4-5) in order to understand the words being said (e.g., “lemon” and not “melon”).
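As a toy illustration of why order matters (this is not part of the study, and the phoneme labels below are rough assumptions), the two words are built from the same sounds and differ only in their sequence:

```python
# Hypothetical, simplified phoneme sequences for the two words.
lemon = ["l", "eh", "m", "ah", "n"]
melon = ["m", "eh", "l", "ah", "n"]

print(sorted(lemon) == sorted(melon))  # True: identical sound inventory
print(lemon == melon)                  # False: different order, different word
```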

To do this, they recorded the brain activity of more than 20 human subjects – all native English speakers – as they listened to an audiobook for two hours. Specifically, the researchers correlated the subjects’ brain activity with the properties of the speech sounds that distinguish one sound from another (e.g., “m” versus “n”).
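For readers curious how such a correlation can be set up in practice, here is a minimal, hypothetical sketch of a time-lagged linear encoding model fit to synthetic data. It is not the authors’ analysis pipeline; the data, feature labels, and parameters are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

n_times = 5000     # time samples of (synthetic) brain activity
n_features = 4     # binary phonetic features, e.g. nasal, voiced, ...
n_lags = 30        # delays after each sound onset to model

# Feature onsets over time and one channel of simulated brain activity.
X = rng.integers(0, 2, size=(n_times, n_features)).astype(float)
brain = rng.standard_normal(n_times)

# Lagged design matrix: each block is the feature matrix shifted by one delay,
# so the fitted weights trace how the response to a feature unfolds in time.
# (np.roll wraps around at the edges; ignored here for simplicity.)
lagged = np.hstack([np.roll(X, lag, axis=0) for lag in range(n_lags)])

# Ridge regression (closed form) to estimate the temporal response weights.
alpha = 1.0
XtX = lagged.T @ lagged + alpha * np.eye(lagged.shape[1])
weights = np.linalg.solve(XtX, lagged.T @ brain)

# Each entry approximates how strongly a phonetic feature is reflected in the
# brain signal at a given delay after the sound's onset.
trf = weights.reshape(n_lags, n_features)
print(trf.shape)  # (30, 4)
```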
The researchers found that the brain processes speech using a buffer, which maintains a running representation – i.e., a time stamp – of the past three speech sounds.
The results also showed that the brain processes multiple sounds simultaneously without confusing the identity of each sound, by passing information between neurons in the auditory cortex.
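One way to picture this buffer-plus-time-stamp idea is the conceptual sketch below. It is an analogy in code, not a model of the neurons themselves, and the phoneme labels and timings are assumptions:

```python
from collections import deque

buffer = deque(maxlen=3)  # holds at most the three most recent sounds

def hear(phoneme, now, buf=buffer):
    """Add a new sound, then report each buffered sound with its elapsed time."""
    buf.append((phoneme, now))
    return [(p, now - t_onset) for p, t_onset in buf]

# Feeding in the sounds of "cat" (rough phoneme labels), one every 100 ms:
# identity and relative order are both recoverable, because older sounds
# carry larger elapsed-time "stamps".
for step, phoneme in enumerate(["k", "ae", "t"]):
    print(hear(phoneme, now=step * 0.1))
# [('k', 0.0)]
# [('k', 0.1), ('ae', 0.0)]
# [('k', 0.2), ('ae', 0.1), ('t', 0.0)]
```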
“We found that each speech sound triggers a cascade of neurons that fire in different places in the auditory cortex,” explains Gwilliams, who will return to NYU’s Department of Psychology in 2023 as an assistant professor.
“This means that the information about each individual sound in the phonetic word ‘ka-t’ is passed between different neural populations in a predictable way, which serves to timestamp each sound with its relative order.”
The other authors of the study were Jean-Remi King of the École Normale Supérieure in Paris; Alec Marantz, a professor in NYU’s Department of Linguistics and at the NYU Abu Dhabi Institute; and David Poeppel, a professor in NYU’s Department of Psychology and managing director of the Ernst Struengmann Institute for Neuroscience in Frankfurt, Germany.
About this auditory neuroscience research news
Author: Press Office
Source: NYU
Contact: Press Office – NYU
Image: The image is in the public domain
Original research: Open access.
“Neural dynamics of phoneme sequences reveal position-invariant code for content and order” by Laura Gwilliams et al. Nature Communications
Abstract
Neural dynamics of phoneme sequences reveal position-invariant code for content and order
Speech consists of a continuously varying acoustic signal. Yet human listeners perceive it as sequences of discrete speech sounds, used to recognize discrete words.
To investigate how the human brain properly sequences the speech signal, we recorded two-hour magnetoencephalograms from 21 participants who listened to short stories.
Our analyses show that the brain continuously encodes the three most recently heard speech sounds in parallel, retaining this information long after the dissipation of the sensory input. Each speech sound representation evolves over time, collectively encoding both its phonetic features and the amount of time that has elapsed since its onset.
As a result, this dynamic neural pattern encodes both the relative order and the phonetic content of the speech sequence. These representations are more likely to be active when phonemes are more predictable, and persist longer when lexical identity is uncertain.
Our results show how phonetic sequences are represented in natural speech at the level of populations of neurons, providing insight into what intermediate representations exist between the sensory input and sublexical units.
The flexibility in the dynamics of these representations paves the way for a better understanding of how such sequences can be used to interact with higher-order structures such as lexical identity.