Rarely Common: Uncovering the dominant role of rare variants in the genetic architecture of complex human traits.

NIH RePORTER · NIH · R01 · $545,748

Abstract

The vast majority of human mutations have minor allele frequencies (MAF) under 1%, with the plurality observed only once (i.e., "singletons"). While Mendelian diseases are predominantly caused by rare alleles, the cumulative contribution of rare variants to complex phenotypes remains hotly debated. In our recent work, we demonstrated that ultrarare variants (MAF < 0.01%) make a substantial contribution to the genetic architecture of human transcriptional regulation (an intermediate between genetic variation and complex disease) [1], and that low-frequency variants account for nearly half the heritability of several complex traits, on average [2]. In this study, we will use massively parallel reporter assays (MPRAs), which have revolutionized the way enhancers can be assayed for activity, to functionally validate our finding that ultrarare variants dominate the genetic architecture of human gene expression. We will use insights from this technology to drive statistical and bioinformatic improvements in the way genetic variation data are analyzed. We will then expand our analysis to quantify the genetic architecture of gene expression across tissues. All tissues in the human body derive from essentially the same DNA but exhibit remarkably different patterns of gene expression. We will extend our Haseman-Elston (HE) regression approach for modeling the genetic architecture of gene expression to multiple traits, uncovering cross-tissue and tissue-specific genetic effects using whole-genome sequencing (WGS) and multi-tissue RNA-sequencing data from the GTEx project [5]. Finally, we will improve genomic-based precision medicine efforts for all by characterizing the population-specific genetic architecture of complex traits. Every human population has experienced a different evolutionary history in the recent past (different pathogens, different limits on reproductive growth, etc.). Each population therefore has a different distribution of genetic variation. As a consequence, different populations likely have different genetic architectures for complex traits. Further, many understudied populations are admixed (with ancestry deriving from multiple populations). We will extend our HE regression approach to model shared and population-specific genetic effects using more than 140,000 samples from multiple populations with WGS data and complex trait data from the TOPMed Project [6].
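For readers unfamiliar with Haseman-Elston regression, the following is a minimal sketch of the textbook single-trait form, not the investigators' multi-trait or multi-population extension. It regresses products of standardized phenotypes for pairs of individuals on their genetic relatedness, and the slope is taken as the heritability estimate. The function name `he_regression_h2`, the simulated genotypes, and all parameter values are illustrative assumptions and do not come from the project.

```python
import numpy as np


def he_regression_h2(genotypes, phenotype):
    """Textbook single-trait HE regression estimate of heritability (illustrative)."""
    g = np.asarray(genotypes, dtype=float)
    # Drop monomorphic variants, then standardize each variant column.
    g = g[:, g.std(axis=0) > 0]
    z = (g - g.mean(axis=0)) / g.std(axis=0)
    # Genetic relatedness matrix, averaged over variants.
    grm = z @ z.T / z.shape[1]
    # Standardize the phenotype to mean 0, variance 1.
    y = np.asarray(phenotype, dtype=float)
    y = (y - y.mean()) / y.std()
    # Regress phenotype cross-products on relatedness over all pairs i < j
    # (regression through the origin, so the slope is sum(k*p) / sum(k*k)).
    i, j = np.triu_indices(len(y), k=1)
    k = grm[i, j]
    prod = y[i] * y[j]
    return float(k @ prod / (k @ k))


# Hypothetical simulation: 600 individuals, 300 common variants, true h2 = 0.5.
rng = np.random.default_rng(0)
n, m, h2_true = 600, 300, 0.5
freqs = rng.uniform(0.05, 0.5, size=m)
geno = rng.binomial(2, freqs, size=(n, m))
z_sim = (geno - geno.mean(axis=0)) / geno.std(axis=0)
beta = rng.normal(0.0, np.sqrt(h2_true / m), size=m)
pheno = z_sim @ beta + rng.normal(0.0, np.sqrt(1.0 - h2_true), size=n)

estimate = he_regression_h2(geno, pheno)
print(f"simulated h2 = {h2_true}, HE estimate = {estimate:.2f}")
```

The simulation uses only common variants for simplicity. Partitioning heritability by allele frequency class, as the abstract describes, would require one relatedness matrix per frequency bin and a multiple regression, which this sketch does not attempt.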

Key facts

NIH application ID: 10745276
Project number: 5R01GM142112-04
Recipient: UNIVERSITY OF CALIFORNIA, SAN FRANCISCO
Principal Investigator: Ryan D. Hernandez
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $545,748
Award type: 5
Project period: 2021-04-01 → 2025-12-31