# Mining minority enriched AllofUs data for innovative ethnic specific risk prediction modeling

> **NIH R21** · UNIVERSITY OF MINNESOTA · 2024 · $196,273

## Abstract

Advancement of health equity requires evidence and tools tailored for minority groups. The shift towards
individualized precision medicine requires risk prediction tools to guide prevention and intervention. Due to the
genetic heterogeneity and socioeconomic disparity, risk factors may disproportionately impact race/ethnicity
(R/E) groups. Overall risk prediction models constructed from predominantly white populations can perform poorly on
other ethnic groups, leading to misdiagnosis, over-treatment and other adverse health consequences. Efforts
to develop R/E-specific risk prediction at local healthcare systems are limited by the small sample sizes
caused by inadequate representation of minority populations. To address the gap and to advance precision
medicine for non-white patients, it is crucial to harness minority-enriched clinical data and develop risk models
transferable to point of care. The All of Us (AoU) program offers a wealth of comprehensive multi-modal data
on whole genome sequencing (WGS), real-world electronic health records (EHR) and patient reported
outcomes (PRO) with enhanced minority participation, providing the common evidence base for learning
general R/E-specific risk patterns and training risk models for minority populations at local healthcare systems.
In this proposal, we develop innovative methods for risk modeling in AoU data tailored for minority populations
and for validating the resulting models on external healthcare data. We will showcase the proposed methods in two use cases: 1)
a rheumatoid arthritis (RA) genome-wide association study (GWAS) at Mass General Brigham (MGB) focusing
on genetic risk factors; 2) a cancer cardiotoxicity prediction study at M Health Fairview (MHF) focusing on
clinical and social determinants of health (SDoH) risk factors. In Aim 1, we integrate risk factor and disease
onset outcome data across WGS, EHR and PRO in AoU data to construct a risk prediction model that yields
better risk prediction accuracy, risk factor identification and fairness across R/E groups. In Aim 2, we design
privacy-preserving algorithms to validate the generalizability of risk models from AoU data on external
healthcare data and establish a transfer learning strategy to adapt AoU risk models for local healthcare
systems. We intend for the methods to facilitate development of risk modeling using AoU data with a focus on
minority populations, as well as to demonstrate the potential impact of the AoU program on improving care at
local healthcare systems.

## Key facts

- **NIH application ID:** 10935987
- **Project number:** 5R21MD019134-02
- **Recipient organization:** UNIVERSITY OF MINNESOTA
- **Principal Investigator:** Jue Hou
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $196,273
- **Award type:** 5
- **Project period:** 2023-09-25 → 2025-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10935987

## Citation

> US National Institutes of Health, RePORTER application 10935987, Mining minority enriched AllofUs data for innovative ethnic specific risk prediction modeling (5R21MD019134-02). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10935987. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
