# Statistical Methods for Genetic Risk Predictions across Diverse Populations

> **NIH R01** · YALE UNIVERSITY · 2023 · $568,665

## Abstract

Although genome-wide association studies (GWAS) have been very successful in identifying genetic variants
associated with complex diseases and traits, it is still challenging to translate GWAS results into clinically
useful disease risk models for improved disease prediction, prevention, diagnosis, prognosis, monitoring, and
treatment. Furthermore, most GWAS conducted to date have focused on individuals of European ancestry,
making it difficult to derive risk models in other populations. Recent research has suggested shared genetic
contributions to complex diseases across populations and the potential benefit of considering functional
annotations in cross-population analysis. The ultimate objective of this project is to develop rigorous, efficient,
and robust integrative modeling approaches for risk prediction across populations by capitalizing on the vast
amount of publicly available GWAS summary data, abundant functional annotations, and a growing number of
studies with participants from underrepresented populations. This will be accomplished through five specific
aims. The first three aims will develop three complementary approaches for cross-population risk prediction:
(Aim 1) a Bayesian approach (ME-Pred), along the lines of our published work incorporating either
functional annotation information or multiple trait information, that explicitly models joint effect sizes from
multiple populations and functional annotations; (Aim 2) an empirical Bayes approach (GWEB) that considers a
more general and flexible effect size distribution and statistical inference that does not need a validation cohort
for tuning some model parameters; and (Aim 3) a fast and robust Bayesian nonparametric method (SDPR) that
is highly adaptive to different genetic architectures and is computationally efficient. Extensive simulations will
be performed to compare the performance of these methods and other existing methods. In Aim 4, we will
apply these methods to evaluate the potential clinical utility for various diseases and traits, with a focus on
underrepresented populations. We will also work closely with investigators from the Yale Generations Project
to study the potential benefit of these tools for its study participants, including many from underrepresented
populations. We will then refine the implementations of some methods to reduce computational time and
improve the user interface and analysis pipeline in Aim 5. We have assembled a team of investigators with
expertise in statistical genetics, medical genetics, and high-performance computing to develop, implement,
evaluate, and disseminate the proposed methods. If successful, these methods and tools will lead to more
accurate genetic risk predictions in underrepresented populations, addressing a critical need in reducing health
disparities.

## Key facts

- **NIH application ID:** 10662188
- **Project number:** 5R01HG012735-02
- **Recipient organization:** YALE UNIVERSITY
- **Principal Investigator:** HONGYU ZHAO
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $568,665
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2022-07-08 → 2026-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10662188

## Citation

> US National Institutes of Health, RePORTER application 10662188, Statistical Methods for Genetic Risk Predictions across Diverse Populations (5R01HG012735-02). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10662188. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
