How the human brain recognizes speech in the context of changing speakers

Von Kriegstein, Katharina; Smith, David R.R.; Patterson, Roy D.; Kiebel, Stefan J.; Griffiths, Timothy D.

doi:10.1523/JNEUROSCI.2742-09.2010

Authors

Katharina Von Kriegstein

Dr David Smith D.R.Smith@hull.ac.uk
Senior Lecturer, Director of Studies for Psychology

Roy D. Patterson

Stefan J. Kiebel

Timothy D. Griffiths

Abstract

We understand speech from different speakers with ease, whereas artificial speech recognition systems struggle with this task. It is unclear how the human brain solves this problem. The conventional view is that speech message recognition and speaker identification are two separate functions and that message processing takes place predominantly in the left hemisphere, whereas processing of speaker-specific information is located in the right hemisphere. Here, we distinguish the contribution of specific cortical regions to speech recognition and speaker information processing by controlled manipulation of task and resynthesized speaker parameters. Two functional magnetic resonance imaging studies provide evidence for a dynamic speech-processing network that questions the conventional view. We found that speech recognition regions in left posterior superior temporal gyrus/superior temporal sulcus (STG/STS) also encode speaker-related vocal tract parameters, which are reflected in the amplitude peaks of the speech spectrum, along with the speech message. Right posterior STG/STS responded more strongly to a speaker-related vocal tract parameter change during a speech recognition task than during a voice recognition task. Left and right posterior STG/STS were functionally connected. Additionally, we found that speaker-related glottal fold parameters (e.g., pitch), which are not reflected in the amplitude peaks of the speech spectrum, are processed in areas immediately adjacent to primary auditory cortex, i.e., in areas earlier in the auditory hierarchy than STG/STS. Our results point to a network account of speech recognition, in which information about the speech message and the speaker's vocal tract are combined to solve the difficult task of understanding speech from different speakers.

Citation

Von Kriegstein, K., Smith, D. R., Patterson, R. D., Kiebel, S. J., & Griffiths, T. D. (2010). How the human brain recognizes speech in the context of changing speakers. Journal of Neuroscience, 30(2), 629-638. https://doi.org/10.1523/JNEUROSCI.2742-09.2010

Journal Article Type	Article
Acceptance Date	Nov 5, 2009
Online Publication Date	Jan 13, 2010
Publication Date	Jan 13, 2010
Journal	Journal of Neuroscience
Print ISSN	0270-6474
Publisher	Society for Neuroscience
Peer Reviewed	Peer Reviewed
Volume	30
Issue	2
Pages	629-638
DOI	https://doi.org/10.1523/JNEUROSCI.2742-09.2010
Public URL	https://hull-repository.worktribe.com/output/396209