Expanding articulatory information from ultrasound imaging of speech using MRI-based image simulations and audio measurements

NIH RePORTER · NIH · F31 · $39,694

Abstract

Ultrasound imaging provides articulatory feedback useful for remediating speech sound disorders, which affect 5% of children and cause long-term deficits in social health and employment in adulthood. However, ultrasound images can be difficult for clinicians and individuals to interpret, which limits understanding of articulatory data and of the speech outcomes of ultrasound biofeedback therapy. A likely source of difficulty is the articulatory information missing from ultrasound images, such as the tongue tip and reference vocal tract structures (e.g., the palate) that cannot be consistently imaged with ultrasound because of air. Much of this missing information can be recovered from magnetic resonance imaging (MRI), which images the entire vocal tract. Comparing ultrasound images with MRI will improve interpretation of ultrasound images by confirming that certain characteristics of ultrasound images (e.g., obscured tongue tip, double edge artifacts) arise from characteristics of tongue shapes; in addition, models can be trained to predict, from ultrasound images, the articulatory information shown in MRI. However, articulatory variability prevents direct comparison between these images. A novel approach to avoid this variability is to simulate ultrasound wave propagation in tissue segmented from MRI. Recent advancements in deep learning have also demonstrated the ability to address the inverse problem of predicting articulation from acoustic data. Thus, to improve ultrasound image interpretation, the goal of this proposal is to use simulated ultrasound images and neural network models to characterize and predict articulatory information missing from 2D midsagittal ultrasound images. These models will be trained on MRI and audio data. We will characterize missing articulatory information by developing efficient simulation of ultrasound images from MRI tissue segmentation.
One hypothesis to be tested is the guideline that the lower edge of double edge artifacts in ultrasound images should be taken as the tongue surface. To test this guideline across a wider range of data (including disordered child speakers and different simulated probe rotations), double edge artifacts will be compared with the tissue maps used to generate the simulated images. Another comparison will estimate the amount of tongue tip typically missing in /r/ tongue shapes. We will then develop a deep learning model, trained on information from MRI, that predicts midsagittal vocal tract shapes (including the tongue tip and palate) from two inputs: tongue contours from ultrasound, and audio. With these aims, we will add insight into ultrasound imaging for speech and provide a tool with future applications in expanding articulatory information, e.g., testing outcomes of using more complete vocal tract information in ultrasound biofeedback therapy. Training for this fellowship will occur at the University of Cincinnati, with opportunities to visit labs at two addi...

Key facts

NIH application ID: 10537976
Project number: 1F31DC020672-01
Recipient: UNIVERSITY OF CINCINNATI
Principal Investigator: Sarah Rotong Li
Activity code: F31
Funding institute: NIH
Fiscal year: 2022
Award amount: $39,694
Award type: 1
Project period: 2022-08-01 → 2024-07-31