# Leveraging a Unique Dataset to Identify Outcome Predictors in Late Talkers

> **NIH R21** · UNIVERSITY OF CALIFORNIA, SAN DIEGO · 2024 · $437,479

## Abstract

Late talking in toddlers is not only a common reason for pediatrician concern and referral for developmental
evaluation; it can also be a precursor of persistent language disorder; of worsening functional ability such as
autism or global delay; or, oddly enough, of the opposite: a neurotypical outcome. Yet predictors of such
dramatically divergent, heterogeneous outcomes remain elusive. Literature reviews reveal a substantial lack of
replication in the field, even among the strongest findings, perhaps because many studies have samples
too small to address heterogeneous trajectories; use non-comparable ascertainment and recruitment; and employ
limited language, clinical, social, and neurobehavioral assessments. In contrast, our existing sample: (a) is very
large, comprising N=1,667 toddlers (552 late talkers, 702 ASD, and 413 typical); (b) was ascertained,
recruited, and clinically characterized in a uniform procedure by licensed clinical psychologists; (c) is
representative of the spectrum of late talkers and typical toddlers; and (d) was longitudinally phenotyped at
toddler (mean age 20 months) and preschool (mean age 36 months) ages using the same language and clinical
tests. In our sample of N=552 late-talking toddlers, defined using a cut-off of expressive language (EL) < -1 SD,
51% had persistent expressive language delays or worsening language outcomes such as ASD or global delay
by preschool. Yet, 49% of our late talkers made rapid and substantial expressive language advances, achieving
neurotypical levels by preschool. AIM 1 will leverage this unique sample to identify toddler-age precursors
predictive of one of 5 divergent language & clinical outcomes at preschool ages (Transient EL Delay; Persistent
EL Delay; Conversion to LD; Conversion to GDD; Conversion to ASD). Nine commonly reported predictors of
expressive language outcomes will be analyzed using linear regression, specifically: receptive language ability
at intake; expressive vocabulary size at intake; % nouns and shape nouns in vocabulary composition; SES; sex;
socialization; mean length of utterance (MLU); and % verbal initiations. Multinomial logistic regression will
determine which of the toddler-age variables are most strongly associated with clinical and language outcome
group membership. Change across time for each measure within each outcome group will also be analyzed.
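
As a rough illustration of the multinomial logistic regression step described for AIM 1, the following is a minimal sketch in plain Python. It is not the investigators' analysis code. The toy values are synthetic, only three of the five outcome groups are used for brevity, the two predictor columns (receptive-language and socialization z-scores) are illustrative choices, and the function names `fit_multinomial_logit` and `predict_proba` are hypothetical.

```python
import math

# Three of the five outcome groups named in the abstract, chosen for brevity.
OUTCOMES = ["Transient EL Delay", "Persistent EL Delay", "Conversion to ASD"]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    """Outcome-group probabilities for one child's predictor vector x."""
    row = [1.0] + list(x)  # leading 1.0 is the intercept term
    return softmax([sum(w * v for w, v in zip(wk, row)) for wk in weights])


def fit_multinomial_logit(X, y, n_classes, lr=0.5, epochs=500):
    """Fit softmax (multinomial logistic) regression by batch gradient descent."""
    n, width = len(X), len(X[0]) + 1
    weights = [[0.0] * width for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * width for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            probs = predict_proba(weights, xi)
            row = [1.0] + list(xi)
            for k in range(n_classes):
                # gradient of the cross-entropy loss: (p_k - indicator) * x
                err = probs[k] - (1.0 if k == yi else 0.0)
                for j, v in enumerate(row):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(width):
                weights[k][j] -= lr * grad[k][j] / n
    return weights


# Synthetic toy data (NOT study data): [receptive-language z, socialization z].
X = [[1.4, 1.6], [1.6, 1.3], [1.5, 1.7],
     [-1.5, 1.4], [-1.3, 1.6], [-1.7, 1.5],
     [0.1, -1.5], [-0.2, -1.4], [0.0, -1.7]]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # indices into OUTCOMES

weights = fit_multinomial_logit(X, y, n_classes=len(OUTCOMES))
probs = predict_proba(weights, [1.5, 1.5])
predicted = OUTCOMES[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

In practice an analysis of this kind would more likely rely on an established statistics package, which also reports coefficient standard errors and significance tests; the hand-rolled version above is only meant to show what "associated with outcome group membership" means computationally.
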
AIM 2: Social and language development are inextricably linked, and measures of attention to social speech
(such as motherese) and to social images have been shown to be associated with language ability. To go beyond
commonly examined predictors, AIM 2 will leverage previously collected eye-tracking (ET) data on auditory and
visual social attention in late-talking, ASD, and typically developing (TD) toddlers. Using our large TD sample,
reference standards for levels of social auditory and social visual attention, based on 7 key metrics (e.g., level
of attention to motherese speech) across 2-month age bands, will be created. ...
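
One way to picture the age-banded reference standards mentioned for AIM 2 is the sketch below, which assumes that a reference standard is simply the mean and standard deviation of a metric within each 2-month band of the TD sample. That definition, the helper names (`age_band`, `build_reference`, `z_score`), and the toy values are all assumptions made for illustration; the abstract does not specify how the standards will be computed.

```python
from collections import defaultdict
from statistics import mean, stdev


def age_band(age_months, width=2):
    """Return the (lower, upper) bounds of the age band containing age_months."""
    lower = int(age_months // width) * width
    return (lower, lower + width)


def build_reference(td_records, width=2):
    """Map each age band to the (mean, SD) of a metric among TD toddlers."""
    by_band = defaultdict(list)
    for age_months, value in td_records:
        by_band[age_band(age_months, width)].append(value)
    # stdev needs at least two observations, so sparser bands are skipped
    return {band: (mean(vals), stdev(vals))
            for band, vals in by_band.items() if len(vals) >= 2}


def z_score(reference, age_months, value, width=2):
    """Express one toddler's metric relative to the TD reference for their band."""
    mu, sd = reference[age_band(age_months, width)]
    return (value - mu) / sd


# Synthetic toy records (NOT study data): (age in months, % attention to motherese).
td_records = [(12.5, 60.0), (13.0, 70.0), (13.9, 80.0), (14.2, 65.0), (15.5, 75.0)]
reference = build_reference(td_records)

# A hypothetical 13.2-month-old who attends to motherese 50% of the time.
z = z_score(reference, 13.2, 50.0)
print(age_band(13.2), round(z, 2))
```

Banding by age keeps each comparison within a narrow developmental window, at the cost of needing enough TD toddlers in every band for the mean and SD to be stable.
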

## Key facts

- **NIH application ID:** 11031009
- **Project number:** 1R21DC022449-01
- **Recipient organization:** UNIVERSITY OF CALIFORNIA, SAN DIEGO
- **Principal Investigator:** ERIC COURCHESNE
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $437,479
- **Award type:** 1
- **Project period:** 2024-09-01 → 2026-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/11031009

## Citation

> US National Institutes of Health, RePORTER application 11031009, Leveraging a Unique Dataset to Identify Outcome Predictors in Late Talkers (1R21DC022449-01). Retrieved via AI Analytics 2026-05-27 from https://api.ai-analytics.org/grant/nih/11031009. Licensed CC0.

