Statistical analysis of large genomic data sets

NIH RePORTER · NIH · R01 · $404,435

Abstract

Heritability analysis in the largest whole-genome sequence (WGS) dataset, the NHLBI Trans-omics for Precision Medicine Whole Genome Sequencing Program (TOPMed), strongly suggested that “missing heritability” can be attributed to rare variants that are not well tagged by array-based genotyped variants. Large genome-wide association studies (GWAS), complemented by whole-genome sequencing (WGS) studies, will be a cost-efficient strategy to identify genetic variants and understand the genetic architecture of complex traits. Multiple large biobanks with SNP-array data and whole-genome sequencing data, such as TOPMed, provide an unprecedented but challenging opportunity to understand the genetic mechanisms underlying complex diseases. We have identified three pressing challenges in utilizing large GWAS and WGS datasets and propose the following four specific aims to meet them: 1) Differentiate horizontal pleiotropy from mediation using GWAS summary statistics and apply the methods to publicly available data. 2) Prioritize genetic variants sensitive to interactions, and estimate the overall contribution of interactions to a phenotype. 3) Incorporate family linkage and local ancestry to identify genetic variants in the TOPMed whole-genome sequencing data. 4) Develop corresponding software that will be made publicly available. We will apply our new analytic methods to TOPMed WGS data, UK Biobank data, and many existing GWAS summary statistics. Our data analysis will focus on blood pressure, obesity, and sleep disorders, and their effects on disease outcomes such as cardiovascular disease, diabetes, heart failure, and dementia.

Key facts

NIH application ID
9943545
Project number
1R01HG011052-01
Recipient
CASE WESTERN RESERVE UNIVERSITY
Principal Investigator
XIAOFENG ZHU
Activity code
R01
Funding institute
NIH
Fiscal year
2020
Award amount
$404,435
Award type
1
Project period
2020-05-08 → 2024-02-29