Identifying ancestry-specific and distal components of disease-associated gene regulation and cellular function

NIH RePORTER · NIH · R01 · $491,366

Abstract

Genome-wide association studies (GWAS) have associated hundreds of thousands of genetic variants with human disease and complex traits. Of these associated variants, 90% reside in noncoding sequences that can enhance or suppress gene expression. Although GWAS does not reveal the target genes of associated variants, extraordinary effort has been dedicated to mapping the target genes that carry out the functional effects of noncoding genetic variation. Although knowing which genetic variants cause disease is often not sufficient for clinical intervention, identifying disease genes can accelerate the development of therapeutics. Correlations between genotype and gene expression, identified in expression quantitative trait loci (eQTL) studies, can provide valuable insight into the mechanism of disease-associated variants. For example, a previous study found MAPK3 to be associated with schizophrenia and neurodevelopmental phenotypes via a key role in neuronal proliferation. Thousands of genetic associations still lack characterized target genes and cell types of action.
This proposal will develop new algorithms to robustly map disease-associated variants to disease-critical genes and infer their cell-type-specific regulatory behavior across three aims. First, we hypothesize that new disease-critical genes will be discovered if variants are accurately mapped to target genes in non-European populations, where cohorts are small and variant-to-gene mapping is imprecise. To this end, we will develop a novel gene-disease mapping technique for understudied populations. Second, we hypothesize that linking distal regulatory variants to target genes should provide mechanistic explanations for many uncharacterized GWAS variants. To this end, we will develop a high-dimensional feature selection technique to detect distal effects on gene expression. Third, we hypothesize that novel variant-to-gene links can be identified by analyzing rare cell types from single-cell RNA sequencing. To this end, we will link variants to genes in cell-type-specific contexts using mixed models for heritability estimation.
Overall, while gene expression prediction models are a powerful tool to link genes to disease, they have been applied to only limited study designs: single-ancestry gene expression cohorts (which are underpowered in non-European populations with limited sample sizes), predictor variants in the cis regulatory window, and bulk tissue or cell type gene expression data. These limitations leave many open questions that our proposal aims to address, including the degree to which genetic variation regulates gene expression in population-specific manners, via long-range mechanisms, in cell-type-specific or cell-state-specific manners, and in ways that are relevant to complex traits and diseases. Our contribution is expected to be significant because 30% of disease-associated variants still have no known target gene and because this work will...

Key facts

NIH application ID: 10943180
Project number: 1R01HG013671-01
Recipient: UNIVERSITY OF CALIFORNIA, SAN DIEGO
Principal Investigator: Tiffany Amariuta-Bartell
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $491,366
Award type: 1
Project period: 2024-09-04 → 2029-06-30