# An automated machine learning approach to language changes in Alzheimer’s disease and frontotemporal dementia across Latino and English-speaking populations

> **NIH R01** · UNIVERSITY OF CALIFORNIA, SAN FRANCISCO · 2023 · $1,773,132

## Abstract

PROJECT SUMMARY

Alzheimer's disease (AD) and frontotemporal dementia (FTD) are highly prevalent in Latinos, the largest and
fastest-growing minority in the United States (US). Yet, due to financial and cultural inequities, this group is
challenged to afford standard diagnostic and monitoring procedures. Also, research on Latinos lacks scalable,
culturally valid tests and it rarely examines whether potential markers are robust across socio-biological profiles.
Such issues can be tackled with low-cost automated speech and language analyses (ASLA). Participants are
asked to produce natural speech, generating multiple acoustic (sound wave) and linguistic (e.g., semantic) data
that can be digitally extracted and analyzed to identify diseases or predict neurocognitive disruptions. Yet, ASLA
findings are minimal in Latinos. Also, most ASLA studies are small and very few have differentiated between AD
and FTD variants, compared ASLA with standard measures, accounted for socio-biological factors (e.g., sex,
race, brain profile, bilingualism) or tested for validity across languages and dialects.

This project will develop a novel ASLA framework to jointly address such challenges. To capture socio-biological
diversity and meet requisites for robust machine and deep learning analyses, we will leverage 2740 participants.
These encompass Spanish speakers from five Latin American countries (700 AD, 700 FTD, 800 controls),
English speakers from the US (140 AD, 140 FTD, 160 controls), and US-based Latinos (30 AD, 30 FTD, 40
controls), including the main variants of each disease. This is possible due to a strategic partnership between
UCSF and the Consortium to Expand Dementia Research in Latin America, a multi-funded network bringing a
fully harmonized environment and a large, growing cohort. The Global Brain Health Institute, a dementia training
hub at UCSF, hosts expert clinicians in all sites. Speech and language data will be gleaned through our new
Toolkit to Examine Lifelike Language, a HIPAA-compliant app for speech collection, storage, and visualization,
supported by a language battery and survey. Enrollees are characterized with demographic, clinical, cognitive,
and social determinants of health measures, alongside MRI and fMRI. Our ASLA approach comprises top
predicted markers for each syndrome, added fine-grained features, and embedding features. Novel machine
and deep learning algorithms for high-dimensional settings will be used to pursue three aims.

In Aim 1, we will employ machine and deep learning to reveal the ASLA markers that best identify AD and FTD
syndromes; compare them with cognitive and imaging measures; and test them for generalizability from Spanish
onto English (a typologically different language). In Aim 2, via linear regressions, we will use optimal ASLA
markers to capture syndrome-specific patterns of cognitive dysfunction, brain atrophy, and connectivity. In Aim
3, using high-dimensional machine learning, we will test such markers for ...

## Key facts

- **NIH application ID:** 10662053
- **Project number:** 1R01AG075775-01A1
- **Recipient organization:** UNIVERSITY OF CALIFORNIA, SAN FRANCISCO
- **Principal Investigator:** MARIA LUISA GORNO TEMPINI
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $1,773,132
- **Award type:** 1
- **Project period:** 2023-08-15 → 2028-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10662053

## Citation

> US National Institutes of Health, RePORTER application 10662053, An automated machine learning approach to language changes in Alzheimer’s disease and frontotemporal dementia across Latino and English-speaking populations (1R01AG075775-01A1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10662053. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
