# EHR-based vs population-based CVD risk predictions for older patients with diabetes

> **NIH R01** · NEW YORK UNIVERSITY SCHOOL OF MEDICINE · 2021 · $320,767

## Abstract

Our application entitled “Fair Risk Predictions for Underrepresented Populations using Electronic Health
Records” responds to NOT-OD-21-094: “Administrative Supplements to Support Collaborations to Improve the
Artificial Intelligence/Machine Learning [AI/ML]-Readiness of NIH-Supported Data”, and supplements our
parent NIA R01 grant (R01 AG065330) entitled “EHR-based vs population-based CVD risk predictions for older
patients with diabetes”. The overarching goal of the parent grant is to develop individualized absolute risk
predictions of cardiovascular disease (CVD) for older patients with diabetes using EHRs. In parallel, we propose the
supplement study to 1) investigate under-representation (bias) of racial/ethnic minorities and
patients with disadvantaged Social Determinants of Health (SDOHs) in EHRs; 2) develop fairness-aware EHR
prediction methods; and 3) share the simulated EHR datasets and linked SDOH datasets, along with the developed
fair risk prediction tools, to enable the AI/ML research community to investigate fair
EHR algorithms further. During the implementation of our parent R01 project, we found emerging evidence of
under-sampling bias in EHRs for racial/ethnic minorities and patients with disadvantaged SDOHs. Such patients are
more likely to visit multiple institutions to receive care, and the EHR data of a single institution often record
fewer diagnostic tests and medications for them. We hypothesize that racial/ethnic minorities and patients
with disadvantaged SDOHs are under-represented in EHRs through smaller sample sizes, insufficient diagnostic and
laboratory information, and less frequent encounters. Consequently, we hypothesize that EHR-based
risk prediction models (including conventional linear models and modern AI/ML methods) that ignore this
unbalanced sampling will produce less accurate predictions for these under-represented patient populations.
Little or no work has been done to systematically investigate the impact of these biases. We therefore propose to
develop fairness-improving prediction approaches for EHRs. Upon completion of the supplement project, the
developed fair prediction methods will lay the groundwork and provide resources for the broader AI/ML research
community to develop fair predictions that advance disease prediction and detection for racial/ethnic
minorities and patients with disadvantaged SDOHs.

## Key facts

- **NIH application ID:** 10412553
- **Project number:** 3R01AG065330-02S1
- **Recipient organization:** NEW YORK UNIVERSITY SCHOOL OF MEDICINE
- **Principal Investigator:** Hua Judy Zhong
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $320,767
- **Award type:** 3
- **Project period:** 2020-08-15 → 2025-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10412553

## Citation

> US National Institutes of Health, RePORTER application 10412553, EHR-based vs population-based CVD risk predictions for older patients with diabetes (3R01AG065330-02S1). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10412553. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
