# Identifying and addressing missingness and bias to enhance discovery from multimodal health data

> **NIH R01** · BRIGHAM AND WOMEN'S HOSPITAL · 2024 · $387,919

## Abstract

PROJECT SUMMARY
Recent successes of machine learning (especially deep learning) in analyzing electronic health record (EHR) data have
not only stimulated excitement among stakeholders but have also raised concerns about potentially unfair or biased clinical
decision making facilitated by machine learning. A number of fairness measurements have been proposed. However, they
underappreciate the chronic, systematic differences between the distributions of protected and unprotected groups. Hence,
when used to develop machine learning methods, they may worsen within-group issues and dampen the performance of the
trained machine learning models. The situation can be further complicated by missing values that are common in EHR data,
which will exacerbate unfairness if not handled properly. In this project, we aim to develop a novel fairness evaluation
methodology (Aim 1) and incorporate it into the development of innovative machine learning models and techniques to
reduce biases and increase interpretability (Aim 2). To better and more fairly handle missing values, we will develop new
machine learning models that contain trainable in-process missing value imputation components and new algorithms to train
them with constraints defined by our new fairness evaluation method (Aim 3). In addition, we will develop proactive
machine learning techniques to advance health equity (Aim 4). We will evaluate and improve our new fairness measurements
and machine learning techniques in the context of facilitating clinical decision making (Aim 5). Large datasets from two of
the largest US healthcare systems will be used in carrying out the proposed research.

## Key facts

- **NIH application ID:** 10839345
- **Project number:** 5R01LM014239-02
- **Recipient organization:** BRIGHAM AND WOMEN'S HOSPITAL
- **Principal Investigator:** Pengyu Hong
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $387,919
- **Award type:** 5
- **Project period:** 2023-05-10 → 2027-02-28

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10839345

## Citation

> US National Institutes of Health, RePORTER application 10839345, Identifying and addressing missingness and bias to enhance discovery from multimodal health data (5R01LM014239-02). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10839345. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
