# Next-Generation Expressive Personalized Voices for Speech-Generating Devices

> **NIH R41** · SYNFONICA, LLC · 2022 · $275,752

## Abstract

The creation of personalized synthetic voices has wide application in medical/rehabilitation settings for patients who rely on a speech-generating device (SGD) for communication. One common application is voice banking, wherein a person who risks losing their voice, such as somebody with a neurodegenerative disease like Amyotrophic Lateral Sclerosis (ALS), records their own speech before the onset of disease-related dysarthria for later use in an SGD that mimics their natural speech characteristics. While the technology underlying the creation of such personalized synthetic voices is growing in maturity and adoption by SGD users, it still suffers from two primary limitations: a lack of expressiveness and a burdensome amount of recording needed to create highly natural-sounding voices. The proposed project aims to remedy this situation by marrying the machine-learning technology behind ModelTalker, a pioneering voice-banking text-to-speech service developed at Nemours Children’s Health, with the knowledge-based technology underlying Synfony, a rule-based text-to-speech system developed by Synfonica LLC, which is capable of generating a variety of speech styles and expressive modes. The expert knowledge built into Synfonica will be used to design an optimal set of sentences for voice bankers to record, and its algorithms for the generation of natural-sounding prosody in different modes and styles will be integrated into ModelTalker’s machine-learning algorithms, creating a hybrid system that embraces the best qualities of both approaches. The new text-to-speech (TTS) system resulting from this project will (a) require a minimal amount of recorded speech from the voice banker, (b) accurately capture their vocal identity, and (c) be structured such that new expressive modes and speech styles can be added easily without additional recording. The feasibility of the project will be demonstrated by recording the voices of an adult male, an adult female, and a child, and generating TTS voices that can speak in three expressive modes (neutral, happy, and sad). Perceptual experiments will be run to evaluate their intelligibility, naturalness, success in capturing the vocal identity of the speaker, and the appropriateness of their expressive modes. In general, the project will be a major step forward in enabling the users of personalized synthetic voices to express their emotions and intentions.

## Key facts

- **NIH application ID:** 10547241
- **Project number:** 1R41DC020693-01
- **Recipient organization:** SYNFONICA, LLC
- **Principal Investigator:** H. Timothy Bunnell
- **Activity code:** R41
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $275,752
- **Award type:** 1
- **Project period:** 2022-08-15 → 2024-08-14

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10547241

## Citation

> US National Institutes of Health, RePORTER application 10547241, Next-Generation Expressive Personalized Voices for Speech-Generating Devices (1R41DC020693-01). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10547241. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
