Methods for multi-ancestry and multi-trait fine-mapping and genetic risk prediction

NIH RePORTER · NIH · F31 · $34,571

Abstract

Project Summary: Two fundamental goals in genetic epidemiology are the identification of genetic variants that cause disease (fine-mapping) and the development of polygenic risk scores (PRS) that predict individual-level disease risk using genetic information. As genetic datasets expand, these goals become increasingly realistic. However, most genetic datasets overrepresent European populations, limiting the generalizability of scientific findings, the discovery of causal variants, and the accuracy of PRS in non-European populations. If unaddressed, differences in PRS accuracy will widen ancestry-based health disparities. Most methods in genetic epidemiology consider one ancestry and one disease at a time. This research proposes methods for causal variant identification and genetic risk prediction that share information across ancestries and diseases.

The first aim is to develop a method for fine-mapping using data from multiple ancestry groups. Causal variant identification provides insight into disease etiology and helps researchers identify drug targets. The sum of single effects (SuSiE) model is a powerful approach for fine-mapping in a single population. Incorporating data from multiple populations can greatly improve fine-mapping because patterns of correlation between variants differ by ancestry and because some variants have causal effects in some, but not all, ancestries. In this aim, MultiSuSiE, a multi-population fine-mapping method motivated by SuSiE, will be developed and applied. SuSiE provides substantial benefits in speed, power, and interpretability compared to other fine-mapping methods. MultiSuSiE will bring the state of the art in fine-mapping to the multi-ancestry context.

The second aim is to develop and apply ssCTPR, a summary-statistic-based PRS method that leverages shared information across diseases. PRS show great promise for informing medical treatment decisions and disease screening interventions. A recent method, cross-trait penalized regression (CTPR), boosts prediction accuracy by leveraging shared genetic bases across diseases but requires difficult-to-obtain individual-level data. In this aim, ssCTPR, a multi-trait summary-statistic-based method motivated by CTPR, will be developed and applied. ssCTPR is innovative in its statistical approach: it will jointly model variants and diseases, use penalized regression, and share information across traits using a Laplacian quadratic penalty that is effective in the multi-disease setting but has not been investigated using summary statistics.

The third aim is to develop a method that uses the methodological advances of aims 1 and 2 to improve PRS prediction in non-European populations. PRS prediction accuracy in non-European populations is much lower than in European populations. As PRS enter the clinic, populations with inequitable health outcomes will fail to benefit from the latest in precision medicine innovation. In this aim, MultiPolyPred, a...

Key facts

NIH application ID: 10937068
Project number: 5F31HG013040-02
Recipient: HARVARD UNIVERSITY D/B/A HARVARD SCHOOL OF PUBLIC HEALTH
Principal Investigator: Jordan Rossen
Activity code: F31
Funding institute: NIH
Fiscal year: 2024
Award amount: $34,571
Award type: 5
Project period: 2023-09-25 → 2025-06-30