Identifying and addressing missingness and bias to enhance discovery from multimodal health data

NIH RePORTER · NIH · R01 · $387,919

Abstract

PROJECT SUMMARY Recent successes of machine learning (especially deep learning) in analyzing electronic health record (EHR) data have not only stimulated excitement among stakeholders but have also raised concerns about potentially unfair or biased clinical decision making facilitated by machine learning. A number of fairness measurements have been proposed. However, they underappreciate the chronic, systematic differences between the distributions of protected and unprotected groups. Hence, when used to develop machine learning methods, they may worsen within-group issues and dampen the performance of the trained machine learning models. The situation can be further complicated by missing values, which are common in EHR data and will exacerbate unfairness if not handled properly. In this project, we aim to develop a novel fairness evaluation methodology (Aim 1) and incorporate it into the development of innovative machine learning models and techniques to reduce biases and increase interpretability (Aim 2). To handle missing values better and more fairly, we will develop new machine learning models that contain trainable in-process missing value imputation components, along with new algorithms to train them under constraints defined by our new fairness evaluation method (Aim 3). In addition, we will develop proactive machine learning techniques to advance health equity (Aim 4). We will evaluate and improve our new fairness measurements and machine learning techniques in the context of facilitating clinical decision making (Aim 5). Large datasets from two of the largest US healthcare systems will be used in carrying out the proposed research.

Key facts

NIH application ID: 10839345
Project number: 5R01LM014239-02
Recipient: BRIGHAM AND WOMEN'S HOSPITAL
Principal Investigator: Pengyu Hong
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $387,919
Award type: 5
Project period: 2023-05-10 → 2027-02-28