# Multisite Electronic Health Record-Based Surveillance of the Burden of Diabetes by Type in Young Adults

> **NIH ALLCDC U18** · CORNELL UNIVERSITY · 2020 · $250,000

## Abstract

Approximately 3 million young adults aged 18-44 years currently have diabetes in the United States. This
number is projected to increase to ~5.8 million by 2060. Differentiating diabetes types is crucial, because the
etiology, treatments, and outcomes of diabetes differ substantially by type. Type 1 diabetes (T1D) accounts for
~17% and type 2 diabetes (T2D) ~75% of total diabetes in US young adults. This distribution of diabetes types
continuously evolves. We do not have a large-scale surveillance system to monitor the prevalence and
incidence of T1D and T2D in US young adults. The widespread use and increasing functionality of electronic
health record (EHR) systems substantially increase the quantity, breadth, and timeliness of data available for
surveillance and reduce costs compared with population-based registries and surveys. EHR algorithms have
shown great potential in identifying diabetes cases. This study will analyze both structured EHR data (e.g.,
diagnosis codes, medications, and laboratory results) and unstructured clinical notes. We will apply expert
knowledge, machine learning, and natural language processing to develop the best algorithms for identifying
prevalent and incident T1D and T2D cases. The primary objective of this study is to establish an EHR-based
surveillance system for monitoring the burden of T1D and T2D in US young adults. We will collaborate with 3
EHR research networks from the National Patient-Centered Clinical Research Network (PCORnet), covering
~6 million racially, ethnically, and socioeconomically diverse young adults from 4 states (IL, LA, NY, and TX) in
3 Census regions. The patient populations in this study are roughly representative of the source populations in
the catchment areas. The specific aims of this study are 1) to estimate the prevalence of T1D and T2D in US
young adults by age, sex, race/ethnicity, and geographic region in 2019; 2) to estimate the incidence of T1D
and T2D in US young adults by age, sex, race/ethnicity, and geographic region in 2019; 3) to estimate 10-year
trends in the prevalence and incidence of T1D and T2D in US young adults by age, sex, race/ethnicity, and
geographic region, 2014-2023; and 4) to compare the prevalence and incidence of diabetes by type, as well as
temporal trends, in US young adults with those in young adults from other countries and regions. This study is
innovative, because it will detect a false negative rate as low as 0.2%, leverage EHRs for surveillance (more
efficient and cost-effective than registries and surveys), use advanced statistical approaches (e.g., machine
learning and natural language processing), estimate a denominator using patient zip codes, build flexibility into
the surveillance methods according to local availability of clinical notes, and use a 2-stage sampling approach
to improve chart review efficiency. This study will advance our understanding of the age, sex, racial/ethnic, and
geographic differences in the burden of T1D and T2D in...
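The prevalence and incidence aims above come down to dividing case counts by a population denominator within each stratum (age, sex, race/ethnicity, region). The following is a minimal sketch of that arithmetic only, not the study's method: the function name, the record layout, and the example numbers are all hypothetical, and it assumes cases have already been classified by type and that a denominator per stratum is available (for example, derived from patient zip codes). Incidence here is a simple one-year cumulative incidence among people without pre-existing diabetes of that type.

```python
from collections import defaultdict


def rates_per_1000(case_records, denominators):
    """Crude prevalence and incidence per 1,000 people, by stratum.

    case_records: iterable of (stratum, is_incident) pairs, one per
        identified case in the surveillance year (hypothetical layout).
    denominators: dict mapping stratum -> population count.
    """
    prevalent = defaultdict(int)
    incident = defaultdict(int)
    for stratum, is_incident in case_records:
        prevalent[stratum] += 1          # every case counts toward prevalence
        if is_incident:
            incident[stratum] += 1       # newly diagnosed this year

    results = {}
    for stratum, population in denominators.items():
        # People who already had the condition are not at risk of a new diagnosis.
        preexisting = prevalent[stratum] - incident[stratum]
        at_risk = population - preexisting
        if population <= 0 or at_risk <= 0:
            continue  # skip strata with no usable denominator
        results[stratum] = {
            "prevalence_per_1000": 1000 * prevalent[stratum] / population,
            "incidence_per_1000": 1000 * incident[stratum] / at_risk,
        }
    return results


if __name__ == "__main__":
    # Made-up illustrative numbers, not study data.
    cases = [
        (("18-29", "F"), False),
        (("18-29", "F"), True),
        (("30-44", "M"), False),
    ]
    population = {("18-29", "F"): 1001, ("30-44", "M"): 500}
    for stratum, rates in rates_per_1000(cases, population).items():
        print(stratum, rates)
```

A real surveillance estimate would also need confidence intervals, standardization, and adjustment for misclassification found in chart review, none of which this sketch attempts.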

## Key facts

- **NIH application ID:** 10085448
- **Project number:** 1U18DP006502-01
- **Recipient organization:** CORNELL UNIVERSITY
- **Principal Investigator:** Wenze Zhong
- **Activity code:** U18
- **Funding institute:** ALLCDC
- **Fiscal year:** 2020
- **Award amount:** $250,000
- **Award type:** 1
- **Project period:** 2020-09-30 → 2025-09-29

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10085448

## Citation

> US National Institutes of Health, RePORTER application 10085448, Multisite Electronic Health Record-Based Surveillance of the Burden of Diabetes by Type in Young Adults (1U18DP006502-01). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10085448. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
