Characterizing the generative mechanisms underlying the cortical tracking of natural speech

NIH RePORTER · NIH · R01 · $385,000

Abstract

PROJECT SUMMARY

Speech is central to human life. Yet how the human brain converts patterns of acoustic speech energy into meaning remains unclear. This is particularly true for natural, continuous speech, which requires us to efficiently parse and process speech at multiple timescales in the context of our ongoing conversation and situational knowledge. Much progress has been made on this problem in recent years, following the realization that the dynamics of cortical activity track those of natural speech. This has led to the development of new methods to study the neurophysiology of speech processing in more naturalistic paradigms. However, the field still lacks consensus regarding the precise physiological mechanisms and neurostructural origins of this tracking. In particular, two contrasting theories have been advanced to explain the genesis of this phenomenon. The first proposes that the quasi-rhythmic nature of continuous speech “entrains” intrinsic, endogenous oscillations in the brain as a way to parse that continuous speech into smaller units for further (linguistic) processing. The second proposes that the cortical tracking of speech reflects the summation of a series of transient evoked responses from hierarchically organized neural networks that are tuned to the different acoustic and linguistic features of speech. The contrast between these two ideas is reflected in the emergence of two almost completely non-overlapping literatures in the field of speech electrophysiology. This is highly problematic, as the design and interpretation of most studies on this important topic are now filtered through one or the other of these theoretical lenses. Without a clear understanding of the true mechanisms involved, our collective work on this topic runs the risk of being distorted through misconception. This project aims to address this urgent need by critically examining these two frameworks side by side.
We aim to do so by collecting scalp EEG from human adults as they listen to natural and manipulated speech. These manipulations will involve varying the speech along several dimensions that should maximize the differences between the predictions made by each theory. We specifically aim to test the hypothesis that both evoked responses and entrained oscillations contribute to the cortical tracking of speech, with their relative contributions varying with the statistics of the speech and with attention. We will test this hypothesis by analyzing the EEG data with reference to computational models of both evoked and oscillatory activity. Furthermore, we will use the same analytical framework to model signals from different regions of the speech/language processing hierarchy acquired using intracranial recordings in neurosurgical patients. This will allow us to test the deeper hypothesis that evoked and oscillatory mechanisms operate differently in different cortical areas. We will also leverage these intracranial findings...

Key facts

NIH application ID: 10884434
Project number: 5R01DC021140-02
Recipient: UNIVERSITY OF ROCHESTER
Principal Investigator: Edmund Lalor
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $385,000
Award type: 5
Project period: 2023-07-07 → 2028-06-30