Advancing Identification of Late-Talking Children and Mapping their Developmental Trajectories Using Real World Data from Electronic Health Records

NIH RePORTER · NIH · R21 · $442,750

Abstract

Late language emergence (late talking) affects one in five children in the United States today. Therefore, a question pediatricians often face at the 24-month well-child visit is which late-talking children will require specific assessment and early intervention services to improve their language abilities and long-term outcomes. To connect the right child with the right early intervention, there is an urgent need to identify specific developmental trajectories associated with late talking. To date, late talking research has mostly relied on data from relatively small cohort and intervention studies and has included participants who may not be representative of broader pediatric populations growing up in America today. The use of real-world data from electronic health records (EHR) offers a unique opportunity to address these gaps by studying large, representative cohorts of children over longer periods of time to better understand distinct late-talking developmental trajectories. While EHR data have been used to study neurodevelopmental and neurological conditions associated with late language emergence, only two EHR studies have specifically focused on late talking. These studies have relied on ICD diagnostic codes to identify the late-talking phenotype in the EHR, an approach with inherent weaknesses. Disparities related to child sex, race, ethnicity, primary home language, and insurance status may result in delayed capture of ICD diagnostic codes in the medical record. Furthermore, delays may occur between when parents first share concerns with a professional about their child’s development and when a developmental condition, including late talking, is documented via an ICD diagnosis.
This study aims to improve how EHR data are used to study late talking by employing novel machine learning approaches (natural language processing) to identify late talkers within EHR databases, creating open and shared data resources to identify late-talking children within the EHR, and leveraging inherent advantages of EHR data to delineate late-talking developmental trajectories. Our experienced research team of psychiatrists, language experts, pediatricians, informaticists, and data scientists is well positioned to achieve our study aims. Our long-term goal is to delineate distinct trajectories associated with late talking, thus enabling a personalized intervention approach to improve child outcomes. In follow-up work, we will seek to develop a multi-site consortium to study and intervene with late talkers in real-world environments. This will ultimately improve clinical decision-making and referral practices for late-talking children.

Key facts

NIH application ID
11031139
Project number
1R21DC022440-01
Recipient
DUKE UNIVERSITY
Principal Investigator
Lauren Franz
Activity code
R21
Funding institute
NIH
Fiscal year
2024
Award amount
$442,750
Award type
1
Project period
2024-09-01 → 2026-08-31