Dynamic temporal integration of speech structure in the human brain

NIH RePORTER · NIH · R01 · $579,622

Abstract

Meaning in speech is conveyed by time-varying structures, such as phonemes and words, that have highly variable durations. As a consequence, there is a fundamental difference between integrating across physical time (e.g., 100 ms) and speech structure (e.g., a phoneme). Auditory neurophysiology models typically assume that neural integration is yoked to physical time, while many psycholinguistic theories posit that integration in speech is yoked to abstract structures such as phonemes. At present, very little is known about whether neural computations in the cortex are yoked to time or structure. As a consequence, it is unclear if there is a change from time- to structure-yoked integration across the cortex, and if so, where this transition occurs and what types of structures and computations might explain it. Filling this knowledge gap is essential to linking auditory models and cognitive theories, constructing integrated neurocomputational models of auditory-speech processing, and understanding how auditory deficits and neurological disorders impact the neural computations that underlie speech perception. Here, we fill this knowledge gap by systematically testing whether neural integration windows throughout the human cortex are yoked to time or structure and developing unified computational models that can account for both time- and structure-yoked computation in the brain. Our experimental approach is to rescale the duration of all speech structures (e.g., using stretching/compression) and measure the extent to which the neural integration window rescales with structure duration. We measure integration windows using temporally precise intracranial recordings from human neurosurgical patients, combined with a novel experimental method that makes it possible to estimate integration windows from highly nonlinear systems like the brain (Aim I).
We also use the dense, whole-brain coverage of functional MRI to spatially map time- and structure-yoked integration (Aim II), and we leverage statistical decomposition techniques developed by the PI to integrate our intracranial and fMRI data. Finally, we use encoding models to directly examine the neural integration of specific, theoretically important acoustic features and speech structures, as well as develop new computational models that can explain time- and structure-yoked integration in a common framework (Aim III). Preliminary data suggest there is a transition from time- to structure-yoked integration across the putative cortical hierarchy with weak structure yoking in the superior temporal gyrus, where selectivity for speech structure first emerges, and strong structure yoking in higher-order regions of the superior temporal sulcus that integrate over longer multi-second timescales. We also show that deep neural networks trained to recognize speech structure directly from sound learn to integrate across speech using short time-yoked windows at early layers and long structure-yoked windo...
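The rescaling logic described in the abstract (stretch or compress speech, then ask how much the integration window changes) can be illustrated with a small sketch. The abstract does not state how the degree of rescaling is quantified, so the index below is purely an assumption for illustration: the function name `structure_yoking_index`, its parameters, and the log-ratio formula are hypothetical and are not taken from the project. Under this assumed definition, a value near 0 would correspond to a time-yoked window (unchanged by rescaling) and a value near 1 to a structure-yoked window (scaling in proportion to structure duration).

```python
import math


def structure_yoking_index(window_compressed_ms, window_stretched_ms, rescale_factor):
    """Hypothetical index of how strongly an integration window follows speech rate.

    window_compressed_ms: integration window estimated from compressed speech (ms).
    window_stretched_ms: integration window estimated from stretched speech (ms).
    rescale_factor: ratio of structure durations, stretched / compressed (e.g. 2.0).

    Returns log(window ratio) / log(duration ratio). This definition is an
    illustrative assumption, not a metric stated in the abstract.
    """
    if window_compressed_ms <= 0 or window_stretched_ms <= 0:
        raise ValueError("integration windows must be positive")
    if rescale_factor <= 0 or rescale_factor == 1:
        raise ValueError("rescale_factor must be positive and different from 1")
    window_ratio = window_stretched_ms / window_compressed_ms
    return math.log(window_ratio) / math.log(rescale_factor)


# Illustrative, made-up numbers (not data from the project):
time_yoked = structure_yoking_index(100.0, 100.0, 2.0)       # window does not change
structure_yoked = structure_yoking_index(100.0, 200.0, 2.0)  # window doubles with duration
partial = structure_yoking_index(100.0, 100.0 * math.sqrt(2.0), 2.0)  # in between
```

Working on a log scale is one reasonable design choice here because it treats stretching and compression symmetrically: swapping which condition is treated as the baseline and inverting the factor leaves the index unchanged.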

Key facts

NIH application ID: 10797605
Project number: 1R01DC020960-01A1
Recipient: UNIVERSITY OF ROCHESTER
Principal Investigator: Samuel V Norman-Haignere
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $579,622
Award type: 1
Project period: 2024-05-01 → 2029-04-30