Mendelian imputation for family-based GWAS and association-by-proxy in diverse ancestries

NIH RePORTER · NIH · R01 · $800,648

Abstract

For this application, “Mendelian imputation for family-based GWAS and association-by-proxy in diverse ancestries,” we propose to develop methods that increase the power of family-based genome-wide association studies (GWASs) and to apply these methods to a wide range of health, disease, and aging phenotypes in diverse populations. In brief, we propose to:

· Meta-analyze family-based GWAS summary statistics on 30 phenotypes from 14 cohorts of predominantly European ancestry. In addition, through collaboration with the China Kadoorie Biobank and 23andMe, we will perform family-based GWAS in a set of diverse ancestries. Using the summary statistics, we will test within- and cross-ancestry prediction using polygenic indexes (PGIs, also called polygenic scores) derived from family-based and standard GWAS summary statistics, enabling us to determine the role of confounding in the drop in predictive accuracy of PGIs across ancestries. We will investigate methods that combine standard and family-based GWAS summary statistics to improve polygenic prediction across ancestries.

· Boost the power of family-based GWAS by adding genotyped individuals without any close relatives to the estimation sample. We will derive analytical formulas that can be used to quantify the efficiency gains in specific settings. We will develop an efficient linear mixed model algorithm that simultaneously performs standard and family-based GWAS, maximizing power for both. Preliminary results from UK Biobank indicate that this method increases the effective sample size for estimation of direct genetic effects by between 30 and 40%.

· Increase power for association-by-proxy methods by imputing relatives’ genotypes. Theory shows that power to discover associations can be increased when the genotype of the un-genotyped relative is imputed, giving a more accurate estimate of that relative’s genotype. We will apply the methods to phenotypes available for UK Biobank participants’ parents, including Alzheimer’s disease and longevity.

· Extend the algorithm for imputing parental genotypes to diverse populations and additional relatives. We will develop an algorithm that uses a diverse haplotype reference panel as the basis of pedigree-based imputation. In addition to removing bias from imputation in diverse samples, our approach will generalize the imputation algorithm to include relatives other than full siblings and parents, thereby increasing imputation accuracy and the power of downstream family-based genetic association analyses.

The software for implementing the methods will be made publicly available in a GitHub repository. The summary statistics will be made publicly available to the maximum extent consistent with data use agreements.
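The abstract does not spell out the imputation algorithm, so the following is only a minimal sketch of the simplest case of Mendelian imputation, not the project's method. It assumes a single biallelic variant, one genotyped child, no genotyped parents or siblings, random mating, Hardy-Weinberg equilibrium, and a known allele frequency. Under those assumptions the expected genotype of one parent given the child's genotype g and allele frequency f is g/2 + f, and the expected sum of both parents' genotypes is g + 2f. All function names are hypothetical and chosen for illustration.

```python
from itertools import product


def impute_parent_from_child(child_genotype: int, allele_freq: float) -> float:
    """Expected genotype (0 to 2) of one un-genotyped parent given one child.

    Assumes random mating and Hardy-Weinberg equilibrium (illustrative only).
    The child carries one allele from this parent: by symmetry its expected
    value is child_genotype / 2. The parent's untransmitted allele is
    unobserved, so its expected value is the population frequency.
    """
    if child_genotype not in (0, 1, 2):
        raise ValueError("child_genotype must be 0, 1, or 2")
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must lie in [0, 1]")
    return child_genotype / 2 + allele_freq


def impute_parental_sum(child_genotype: int, allele_freq: float) -> float:
    """Expected sum of both parents' genotypes given one child's genotype."""
    return 2 * impute_parent_from_child(child_genotype, allele_freq)


def expected_parent_by_enumeration(child_genotype: int, allele_freq: float) -> float:
    """Brute-force check of the closed form by enumerating all configurations.

    a, b: the focal parent's two alleles; c: the allele transmitted by the
    other parent; t: which of the focal parent's alleles is transmitted.
    Requires the child genotype to have nonzero probability at this frequency.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, t in product((0, 1), repeat=4):
        prob = 0.5  # each of the parent's alleles is transmitted with prob 1/2
        for allele in (a, b, c):
            prob *= allele_freq if allele == 1 else 1.0 - allele_freq
        transmitted = a if t == 0 else b
        if transmitted + c == child_genotype:
            numerator += prob * (a + b)
            denominator += prob
    return numerator / denominator


if __name__ == "__main__":
    for g in (0, 1, 2):
        closed_form = impute_parent_from_child(g, 0.25)
        enumerated = expected_parent_by_enumeration(g, 0.25)
        print(g, closed_form, enumerated)
```

The enumeration function is there only to confirm the closed form under the stated assumptions. Imputation that uses siblings, phased haplotypes, or a reference panel, as the abstract proposes, would condition on more information and is not captured by this sketch.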

Key facts

NIH application ID: 10717993
Project number: 1R01AG083379-01
Recipient: UNIVERSITY OF CALIFORNIA LOS ANGELES
Principal Investigator: Alexander Thomas Ian Strudwick Young
Activity code: R01
Funding institute: NIH
Fiscal year: 2023
Award amount: $800,648
Award type: 1
Project period: 2023-09-15 → 2028-05-31