# Methods for Enhancing Polygenic Risk Prediction Models for Complex Disease

> **NIH R01** · UNIVERSITY OF PENNSYLVANIA · 2024 · $771,338

## Abstract

Early screening for, and prevention of, complex diseases in at-risk individuals are important strategies for reducing
morbidity and mortality. A polygenic risk score (PRS) is a cumulative mathematical aggregation of risk derived
from the contributions of many DNA variants across the genome. PRS are an emerging technology in the field
of disease risk prediction and have been shown to be correlated with disease incidence. While PRS have shown
great promise for complex diseases, current PRS models are overly simplistic and have limited predictive power
and clinical utility. PRS do not account for the effects of rare genetic variants or other risk factors (clinical,
environmental, social determinants of health) on disease risk. Rare variants generally have greater effects on
disease risk due to selective pressure, but only a small number of individuals carry any single rare variant. The
sparsity of rare variants makes it difficult to directly incorporate them into PRS. Additionally, while it is known that
clinical, environmental, and social risk factors also influence risk, few analyses have successfully integrated PRS
with these important non-genetic factors.
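
As a minimal sketch of the "cumulative, mathematical aggregation" described above, a PRS is commonly computed as a weighted sum of an individual's risk-allele counts, with one weight per variant. The function name, the effect sizes, and the genotypes below are hypothetical illustrations and do not come from the project; the sketch also covers only the standard common-variant score, not the rare-variant or non-genetic extensions the project proposes.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of effect-allele dosages for one individual.

    dosages: copies of the effect allele at each variant (0, 1, or 2).
    weights: per-variant effect sizes, e.g. log odds ratios from an
             association study (hypothetical values in this example).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical effect sizes for three variants (assumed, not from the source).
weights = [0.10, 0.25, -0.05]
# Hypothetical genotypes for one individual at the same three variants.
dosages = [2, 0, 1]

score = polygenic_risk_score(dosages, weights)
print(f"PRS = {score:.3f}")
```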

To address this issue, we will develop novel translational informatics methods that integrate clinical,
environmental, and genetic data to improve disease risk prediction. We will assess the clinical utility of these
integrated risk prediction models using cardiovascular disease (CVD) to evaluate the potential for translation to
clinical use. Because CVD is a complex disease, we hypothesize that a comprehensive range of risk factors, along
with rare variants, must be incorporated into PRS to improve risk prediction and maximize the clinical utility
of PRS for CVD.

To achieve our goal, our specific aims are: 1) To develop novel methods that incorporate rare genetic variants
into Polygenic Risk Scores (PRS); 2) To evaluate Integrated Risk Models that combine clinical, environmental,
and social risk factors with PRS; 3) To develop and evaluate deep learning models integrating genetic, clinical,
environmental, and social risk factors; 4) To translate our integrated models into the electronic health record
(EHR). If these specific aims are achieved, we will have a set of integrated models that can be used in
downstream clinical implementation programs to ultimately have a translational impact on disease treatment and
prevention. Using these novel computational risk prediction models for precision health, along with our EHR
integration approaches, will allow for the translation of integrated risk prediction into routine clinical care.
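
To illustrate what combining a PRS with non-genetic risk factors can look like, the sketch below uses a plain logistic model over a standardized PRS and a few covariates. This is an assumption made for illustration only: the abstract does not state the form of the integrated models (aim 3 proposes deep learning), and every name and coefficient here is hypothetical rather than an estimate from the project.

```python
import math
from typing import Mapping


def integrated_risk(
    prs_z: float,
    covariates: Mapping[str, float],
    coefficients: Mapping[str, float],
    prs_coef: float,
    intercept: float,
) -> float:
    """Return a predicted probability from a logistic model.

    prs_z:        PRS standardized to mean 0, SD 1 in a reference population.
    covariates:   non-genetic risk factors, keyed by name.
    coefficients: one coefficient per covariate name (hypothetical values).
    """
    linear = intercept + prs_coef * prs_z
    for name, value in covariates.items():
        linear += coefficients[name] * value
    # The logistic function maps the linear predictor to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients; a real model would estimate these from data.
coefficients = {"age_decades": 0.40, "smoker": 0.70, "deprivation_index": 0.15}
person = {"age_decades": 5.5, "smoker": 1.0, "deprivation_index": 0.8}

risk_low_prs = integrated_risk(-1.0, person, coefficients, prs_coef=0.5, intercept=-5.0)
risk_high_prs = integrated_risk(2.0, person, coefficients, prs_coef=0.5, intercept=-5.0)
print(f"low PRS: {risk_low_prs:.3f}, high PRS: {risk_high_prs:.3f}")
```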

## Key facts

- **NIH application ID:** 10851827
- **Project number:** 5R01HL169458-02
- **Recipient organization:** UNIVERSITY OF PENNSYLVANIA
- **Principal Investigator:** Dokyoon Kim
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $771,338
- **Award type:** 5
- **Project period:** 2023-07-01 → 2027-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10851827

## Citation

> US National Institutes of Health, RePORTER application 10851827, Methods for Enhancing Polygenic Risk Prediction Models for Complex Disease (5R01HL169458-02). Retrieved via AI Analytics 2026-05-26 from https://api.ai-analytics.org/grant/nih/10851827. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
