# SCH: A New Computational Framework for Learning from Imbalanced Biomedical Data

> **NIH R01** · UNIVERSITY OF MINNESOTA · 2024 · $300,000

## Abstract

Advances in cancer prevention, diagnosis, and treatment have dramatically improved long-term survival of
those diagnosed with breast cancer (BC). However, this success has been tempered by a parallel increase in the
incidence of chronic conditions in breast cancer survivors, in particular cardiovascular disease (CVD), due
at least in part to cardiotoxic treatment regimens. Current evidence-based guidelines for preventing and
controlling CVD in breast cancer survivors are broad, and we lack clear guidance for assessing
individualized risks of cardiovascular events. Existing CVD risk prediction models focus on the general
population and rely only on a limited number of variables. The adoption and integration of electronic
health record (EHR) systems has provided a wealth of information about individual characteristics at the
point of care, including unstructured clinical narratives, imaging data, and structured clinical variables.
However, real-world EHR data are highly imbalanced, both in the fraction of patients with CVD
outcomes and in the non-uniform distribution of time to CVD development since BC diagnosis. Our
overarching goal is to develop solid computational and theoretical foundations for learning from
imbalanced real-world data, with an emphasis on BC-CVD outcome risk prediction. Specifically, we will
develop a computational framework for imbalanced classification and imbalanced regression tasks in
CVD risk prediction among BC survivors using multimodal EHR data. The successful implementation of
this project would lay a computational foundation for imbalanced learning and could provide more accurate
tools for predicting BC-CVD outcomes.
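
To make the idea of imbalanced classification concrete, here is a minimal sketch of one common baseline: inverse-frequency class weighting combined with a weighted log loss. It is not the framework proposed in this project, which the abstract does not describe in detail. The function names (`balanced_class_weights`, `weighted_log_loss`) and the toy cohort with 5% positive labels are assumptions made purely for illustration.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def weighted_log_loss(labels, probs, weights, eps=1e-12):
    """Weighted mean binary cross-entropy; probs[i] is P(labels[i] == 1)."""
    total, norm = 0.0, 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        w = weights[y]
        total += -w * (math.log(p) if y == 1 else math.log(1 - p))
        norm += w
    return total / norm


# Hypothetical toy cohort (made-up labels): 1 = CVD event, 0 = no event.
labels = [1] * 5 + [0] * 95
weights = balanced_class_weights(labels)

# A model that always predicts the base rate of 5% for every patient.
constant_probs = [0.05] * len(labels)
loss = weighted_log_loss(labels, constant_probs, weights)
```

With these weights, the 5 positive cases carry the same total weight as the 95 negative cases, so a model that simply predicts the base rate for everyone is penalized much more heavily than it would be under an unweighted loss.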

## Key facts

- **NIH application ID:** 10895490
- **Project number:** 5R01CA287413-02
- **Recipient organization:** UNIVERSITY OF MINNESOTA
- **Principal Investigator:** Ying Cui
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $300,000
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2023-08-01 → 2027-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10895490

## Citation

> US National Institutes of Health, RePORTER application 10895490, SCH: A New Computational Framework for Learning from Imbalanced Biomedical Data (5R01CA287413-02). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10895490. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
