Detecting and classifying non-fluent speech in aphasia using machine learning

NIH RePORTER · NIH · F32 · $74,302

Abstract

PROJECT SUMMARY

Among the approximately 2 million Americans living with post-stroke aphasia, many experience difficulties with verbal expression that render everyday communication effortful, inefficient, and stressful [1,32]. For persons with aphasia (PWA), speech non-fluency is often experienced as a visible disability with significant social consequences [36,37]. Given this functional salience, speech fluency is an important construct to assess, monitor, and treat. It is, however, a longstanding clinical challenge to index fluency in a way that is comprehensive, interpretable, and efficient [7], and current approaches rely on either expert clinician ratings or time-intensive linguistic analyses using detailed coding. Temporal acoustic measures, by contrast, are objective measures that can be automatically or semi-automatically derived from connected speech. Prior research has demonstrated that the rate and rhythm of speech output reflect underlying impairments in both speech and language (e.g., motor speech, lexical retrieval), suggesting the utility of temporal acoustic measures to index non-fluency in PWA. The goal of the current study is to investigate the feasibility of using automated temporal acoustic features to identify non-fluent aphasia and to better understand the latent speech, language, and cognitive constructs underlying these surface speech features. To achieve this goal, we leverage machine learning techniques as part of a predictive modeling approach to identify speech features whose clinical utility can be generalized to inform future assessment of fluency in aphasia. In Aim 1, we will investigate whether temporal acoustic features accurately predict fluency status using a supervised machine learning approach (Aim 1a), and which features are most important to clinical distinctions of interest (fluent vs. non-fluent; present vs. absent motor speech impairment; Aim 1b).
In Aim 2, we will determine the underlying speech, language, and cognitive contributors to inter-individual variability in temporal acoustic measures, thereby augmenting the explanatory power of study results. These aims are a first step toward an interpretable and automatable predictive model of fluency in PWA that can be generalized to novel diagnostic situations. Results of this research will help clinicians identify important features for efficient assessment of, and treatment planning for, patients, and will provide a mechanistic understanding of surface-level features by mapping those features to explanatory clinical sub-constructs.
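
As an illustration of what "temporal acoustic measures" of rate and rhythm might look like in practice, the sketch below derives a few such measures from word-level timestamps. The abstract does not specify the study's feature set; the function name `temporal_features`, the feature names, and the 0.25 s pause threshold are all hypothetical choices made for this example, and the input is assumed to be a time-ordered list of (start, end) word intervals in seconds.

```python
from typing import Dict, List, Tuple


def temporal_features(
    words: List[Tuple[float, float]],
    min_pause: float = 0.25,  # assumed threshold (seconds), not from the abstract
) -> Dict[str, float]:
    """Compute illustrative rate and pause measures from word intervals.

    `words` is a time-ordered list of (start, end) times in seconds,
    one tuple per spoken word.
    """
    if not words:
        raise ValueError("at least one word interval is required")

    total_time = words[-1][1] - words[0][0]
    if total_time <= 0:
        raise ValueError("word intervals must span a positive duration")

    # A pause is any gap between consecutive words that is at least min_pause long.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= min_pause]
    pause_time = sum(pauses)

    n_words = len(words)
    return {
        # Words per minute over the whole sample, pauses included.
        "speaking_rate_wpm": 60.0 * n_words / total_time,
        # Words per minute with counted pauses excluded.
        "articulation_rate_wpm": 60.0 * n_words / (total_time - pause_time),
        # Share of the sample spent in counted pauses.
        "pause_proportion": pause_time / total_time,
        "pause_count": float(len(pauses)),
        "mean_pause_s": pause_time / len(pauses) if pauses else 0.0,
    }


if __name__ == "__main__":
    # Four words with one long pause between the second and third.
    sample = [(0.0, 0.5), (0.6, 1.0), (2.0, 2.5), (2.5, 3.0)]
    for name, value in temporal_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Measures of this kind could serve as inputs to a supervised classifier of fluency status, as Aim 1 describes, though which features and which model the study actually uses is not stated here.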

Key facts

NIH application ID: 10459913
Project number: 1F32DC020342-01
Recipient: BOSTON UNIVERSITY (CHARLES RIVER CAMPUS)
Principal Investigator: Claire Elizabeth Cordella
Activity code: F32
Funding institute: NIH
Fiscal year: 2022
Award amount: $74,302
Award type: 1
Project period: 2022-04-01 → 2024-03-31