Statistical Methods for Genetic Risk Predictions across Diverse Populations

NIH RePORTER · NIH · R01 · $566,768

Abstract

Although genome-wide association studies (GWAS) have been very successful in identifying genetic variants associated with complex diseases and traits, it remains challenging to translate GWAS results into clinically useful disease risk models for improved disease prediction, prevention, diagnosis, prognosis, monitoring, and treatment. Furthermore, most GWAS conducted to date have focused on individuals of European ancestry, making it difficult to derive risk models in other populations. Recent research has suggested shared genetic contributions to complex diseases across populations and the potential benefit of considering functional annotations in cross-population analysis. The ultimate objective of this project is to develop rigorous, efficient, and robust integrative modeling approaches for risk prediction across populations by capitalizing on the vast amount of publicly available GWAS summary data, abundant functional annotations, and a growing number of studies with participants from underrepresented populations. This will be accomplished through five specific aims. The first three aims will develop three complementary approaches for cross-population risk prediction: (Aim 1) a Bayesian approach (ME-Pred) that explicitly models joint effect sizes from multiple populations and functional annotations, along the lines of our published work incorporating either functional annotation information or multiple-trait information; (Aim 2) an empirical Bayes approach (GWEB) that considers a more general and flexible effect size distribution and whose statistical inference does not need a validation cohort for tuning some model parameters; and (Aim 3) a fast and robust Bayesian nonparametric method (SDPR) that is highly adaptive to different genetic architectures and is computationally efficient. Extensive simulations will be performed to compare the performance of these methods with that of other existing methods.
In Aim 4, we will apply these methods to evaluate their potential clinical utility for various diseases and traits, with a focus on underrepresented populations. We will also work closely with investigators from the Yale Generations Project to study the potential benefit of these tools for its study participants, including many from underrepresented populations. In Aim 5, we will refine the implementations of some methods to reduce computational time and improve the user interface and analysis pipeline. We have assembled a team of investigators with expertise in statistical genetics, medical genetics, and high-performance computing to develop, implement, evaluate, and disseminate the proposed methods. If successful, these methods and tools will lead to more accurate genetic risk predictions in underrepresented populations, addressing a critical need in reducing health disparities.

Key facts

NIH application ID: 10834227
Project number: 5R01HG012735-03
Recipient: YALE UNIVERSITY
Principal Investigator: HONGYU ZHAO
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $566,768
Award type: 5
Project period: 2022-07-08 → 2026-04-30