# Characterizing the generative mechanisms underlying the cortical tracking of natural speech

> **NIH R01** · UNIVERSITY OF ROCHESTER · 2024 · $385,000

## Abstract

Speech is central to human life. Yet how the human brain converts patterns of acoustic speech energy into
meaning remains unclear. This is particularly true for natural, continuous speech, which requires us to efficiently
parse and process speech at multiple timescales in the context of our ongoing conversation and situational
knowledge. Much progress has been made on this problem in recent years by the realization that the dynamics
of cortical activity track those of natural speech. This has led to the development of new methods to study the
neurophysiology of speech processing in more naturalistic paradigms. However, the field still lacks consensus
regarding the precise physiological mechanisms and neurostructural origins of this tracking. In particular, two
contrasting theories have been advanced that attempt to explain the genesis of this phenomenon. The first
proposes that the quasi-rhythmic nature of continuous speech “entrains” intrinsic, endogenous oscillations in the
brain as a way to parse that continuous speech into smaller units for further (linguistic) processing. Meanwhile,
the second proposes that the cortical tracking of speech reflects the summation of a series of transient evoked
responses from hierarchically organized neural networks that are tuned to the different acoustic and linguistic
features of speech. The contrast between these two ideas is reflected in the emergence of two almost completely
non-overlapping literatures in the field of speech electrophysiology. This is highly problematic as the design and
interpretation of most studies on this important topic are now filtered through either one or the other of these
theoretical lenses. Without a clear understanding of the true mechanisms involved, our collective work on this
topic thus runs the risk of being distorted through misconception. This project aims to address this urgent need
by critically examining these two frameworks side by side. We aim to do so by collecting scalp EEG from human
adults as they listen to natural and manipulated speech. These manipulations will involve varying speech across
several dimensions that should maximize the differences in the predictions made by each theory. We specifically
aim to test the hypothesis that both evoked responses and entrained oscillations contribute to the cortical tracking
of speech, with their relative contributions varying as a function of the statistics of the speech and attention. We
will test this hypothesis by analyzing the EEG data with reference to computational models of both evoked and
oscillatory activity. Furthermore, we will use the same analytical framework to model signals from different
regions of the speech/language processing hierarchy acquired using intracranial recordings in neurosurgical
patients. This will allow us to test the deeper hypothesis that evoked and oscillatory mechanisms operate
differently in different cortical areas. We will also leverage these intracranial findings...

## Key facts

- **NIH application ID:** 10884434
- **Project number:** 5R01DC021140-02
- **Recipient organization:** UNIVERSITY OF ROCHESTER
- **Principal Investigator:** Edmund Lalor
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $385,000
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2023-07-07 → 2028-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10884434

## Citation

> US National Institutes of Health, RePORTER application 10884434, Characterizing the generative mechanisms underlying the cortical tracking of natural speech (5R01DC021140-02). Retrieved via AI Analytics 2026-05-27 from https://api.ai-analytics.org/grant/nih/10884434. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*